Graham Dumpleton
graham.dumpleton at gmail.com
Sat Jun 23 07:24:28 EDT 2007
On 22/06/07, Robin Haswell <me at robhaswell.co.uk> wrote:
> Hi there
>
> I'm having issues writing an output filter that can handle large files.
> My filter's primary purpose is to count the length of the output. Could
> someone assist me please?
>
> The code I have works very well except when a large file is passed
> through it, at which point it eats up memory, probably equal to the size
> of the file.
>
> Any assistance would be much appreciated.
>
> Here is my filter so far:
>
> def outputfilter(filter):
>
>     s = filter.read(4096)
>     bytes = 0
>     while s:
>         filter.write(str(s))
>         bytes += len(s)
>         s = filter.read(4096)
>
>     if s is None:
>         filter.close()

One generally would not provide a size argument to filter.read(); instead, one would let Apache provide the data in whatever natural size it was written out by a handler or read in from a file. By passing a size you force Apache to break the data up, or to accumulate it, so as to hand it to you in the size you asked for. The result is the increase in memory usage you are seeing.

Also note that your output filter can be called more than once for a single request, so you have to store your byte count in the request object to preserve it between calls. Thus:

def outputfilter1(filter):

    if not hasattr(filter.req, 'mybytecount'):
        filter.req.mybytecount = 0

    s = filter.read()
    while s:
        filter.req.mybytecount += len(s)
        filter.write(s)
        s = filter.read()

    if s is None:
        filter.close()

        # ... do something with byte count

    return apache.OK

BTW, why do you want to do this? The request object already has a bytes_sent attribute which Apache updates as data is written; when everything is done it will hold the final content length. What you want to do with the value will determine which mechanism to use to register a handler that gives you access to it after everything is done.

Graham
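[Editor's note: to illustrate why the byte count has to live on the request object rather than in a local variable, here is a minimal stand-alone simulation of the pattern above. FakeRequest and FakeFilter are invented stand-ins for mod_python's real request and filter objects, not real mod_python API; the point is only that the filter function runs twice while the count survives across invocations.]

```python
# Simulation of a mod_python output filter being invoked more than once
# for a single request. FakeRequest/FakeFilter are hypothetical stand-ins.

class FakeRequest:
    pass

class FakeFilter:
    """Hands back chunks as 'Apache' produced them.

    read() returns '' when no more data is available in this invocation,
    and None once the end of the stream has been reached (matching the
    mod_python filter.read() convention the post relies on).
    """
    def __init__(self, req, chunks, eos):
        self.req = req
        self._chunks = list(chunks)
        self._eos = eos
        self.closed = False

    def read(self):
        if self._chunks:
            return self._chunks.pop(0)
        return None if self._eos else ''

    def write(self, s):
        pass  # a real filter would pass data down the filter chain

    def close(self):
        self.closed = True

def outputfilter(filter):
    # Preserve the running count on the request object across invocations.
    if not hasattr(filter.req, 'mybytecount'):
        filter.req.mybytecount = 0

    s = filter.read()
    while s:
        filter.req.mybytecount += len(s)
        filter.write(s)
        s = filter.read()

    if s is None:        # None means end of stream: close the filter
        filter.close()

# Simulate Apache calling the filter twice for one request.
req = FakeRequest()
f1 = FakeFilter(req, ['abc', 'defg'], eos=False)  # first call: 7 bytes
outputfilter(f1)
f2 = FakeFilter(req, ['hij'], eos=True)           # second call: 3 more
outputfilter(f2)

print(req.mybytecount)  # -> 10
```

A local `bytes` variable, as in the original filter, would be reset to 0 on the second invocation; attaching the count to `filter.req` is what makes the total correct.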