[mod_python] mod_python 3.3.1 memory leak?

yubing trueice at gmail.com
Tue Jul 3 08:37:15 EDT 2007


On 7/3/07, Graham Dumpleton <graham.dumpleton at gmail.com> wrote:
>
> On 03/07/07, yubing <trueice at gmail.com> wrote:
> > That's great of you :)
> > I've tried a PHP scriptlet that pumps a file to the user; it has the same
> > memory behavior as the Python pumper.
> > The pool-allocated memory will only be released when the Apache child
> > process exits (i.e. when it reaches MaxRequestsPerChild).
>
> No. The pool here is associated with the particular request, not the
> whole process. So it would be released and available for use at the end
> of the request. Nothing to do with MaxRequestsPerChild.


When the HTTP client exits, the pool memory allocated by that process is
not reclaimed by the system; you can observe that with both the Python and
PHP streamers. :)
The only point at which that memory is returned to the system seems to be
when the Apache child process itself exits.
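
To illustrate what I mean, here is a rough, untested sketch of the APR pool
behaviour (just my own illustration, not code from mod_python or Apache):

--------------------------
#include "apr_pools.h"

/* Memory taken from a pool with apr_palloc() has no matching "free";
 * it only comes back when the pool itself is cleared or destroyed.
 * For r->pool that happens at the end of the request, and even then the
 * process does not necessarily hand the pages back to the OS. */
static void pool_demo(apr_pool_t *parent)
{
    apr_pool_t *p;
    void *block;
    int i;

    apr_pool_create(&p, parent);

    for (i = 0; i < 1000; i++) {
        /* each iteration grabs a fresh 4k block; no block can be
         * returned individually */
        block = apr_palloc(p, 4096);
        (void)block;
    }

    /* only this releases all of the blocks, and then only back to the
     * allocator's free lists, not necessarily to the operating system */
    apr_pool_destroy(p);
}
--------------------------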

> > However, our streaming code's memory usage (VmRSS) grows too fast in one
> > request (maybe 16k per second).
> > Is mod_python using such pooled allocation mechanisms during one request?
>
> Not sure what to suggest at the moment bar having it on a different
> port or host and using a custom Python web server for just that one
> task.


Anyhow, it's clear that ap_rflush is the root cause of this memory leak;
maybe we should find a new API for this (and perhaps add a new method to
the request_object).
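
Something along these lines, perhaps (a rough, untested sketch of what a
lower-level flush could look like; the function name and the idea of passing
in a per-request brigade are just my guesses, not an existing Apache or
mod_python API):

--------------------------
#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

/* Instead of letting ap_rflush() create a brand new brigade out of
 * r->pool on every call, a brigade created once per request could be
 * reused, with apr_brigade_cleanup() after each pass, so repeated
 * flushes stop allocating out of r->pool. */
static apr_status_t flush_reusing_brigade(request_rec *r,
                                          apr_bucket_brigade *bb)
{
    apr_bucket *b;
    apr_status_t rv;

    /* bb is assumed to have been created once per request with
     * apr_brigade_create(r->pool, r->connection->bucket_alloc) */
    b = apr_bucket_flush_create(r->connection->bucket_alloc);
    APR_BRIGADE_INSERT_TAIL(bb, b);

    rv = ap_pass_brigade(r->output_filters, bb);

    /* destroy any buckets left in the brigade but keep the brigade
     * itself around for the next flush */
    apr_brigade_cleanup(bb);
    return rv;
}
--------------------------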

> I can only presume that this problem exists because mod_python is
> using a high-level API for writing response data. It is quite possible
> that if one wrote a custom C handler and used the bucket API directly,
> one could control it better and ensure that the bucket structures are
> released straight away after the data is flushed.


Agreed. :)
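
Something like this, maybe (again a rough, untested sketch of the idea;
write_and_flush is a made-up name):

--------------------------
#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

/* A custom C handler could build each brigade out of a short-lived
 * sub-pool and destroy that sub-pool once the data has been passed down
 * the filter chain, so the brigade structures become reusable right away
 * instead of piling up in r->pool for the life of the request. */
static apr_status_t write_and_flush(request_rec *r,
                                    const char *buf, apr_size_t len)
{
    apr_pool_t *tmp;
    apr_bucket_brigade *bb;
    apr_bucket *b;
    apr_status_t rv;

    apr_pool_create(&tmp, r->pool);
    bb = apr_brigade_create(tmp, r->connection->bucket_alloc);

    b = apr_bucket_transient_create(buf, len, r->connection->bucket_alloc);
    APR_BRIGADE_INSERT_TAIL(bb, b);
    b = apr_bucket_flush_create(r->connection->bucket_alloc);
    APR_BRIGADE_INSERT_TAIL(bb, b);

    rv = ap_pass_brigade(r->output_filters, bb);

    /* the flush bucket forces the data out, so by the time we get here
     * the brigade can be thrown away; destroying the sub-pool makes that
     * memory reusable immediately rather than at the end of the request */
    apr_pool_destroy(tmp);
    return rv;
}
--------------------------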

> Anyway, mod_wsgi suffers the same problem, so I might have to dig into
> the lower-level bucket API for a solution so I can at least make
> mod_wsgi work properly. From that, I might understand how to fix
> mod_python.


Would lighttpd be a better choice? I'll try it later. :)

> Sleep time for me now, but maybe I'll come up with some bright idea
> during the night. :-)
>
> Graham
>
> >  On 7/3/07, Graham Dumpleton <graham.dumpleton at gmail.com> wrote:
> > > All I can say is that is simply how Apache works. It is not a problem
> > > with mod_python. To try and explain it, look at the simpler code of
> > > ap_rflush(), which is one of the two Apache functions called when you
> > > call req.write().
> > >
> > > AP_DECLARE(int) ap_rflush(request_rec *r)
> > > {
> > >     conn_rec *c = r->connection;
> > >     apr_bucket_brigade *bb;
> > >     apr_bucket *b;
> > >
> > >     bb = apr_brigade_create(r->pool, c->bucket_alloc);
> > >     b = apr_bucket_flush_create(c->bucket_alloc);
> > >     APR_BRIGADE_INSERT_TAIL(bb, b);
> > >     if (ap_pass_brigade(r->output_filters, bb) != APR_SUCCESS)
> > >         return -1;
> > >
> > >     return 0;
> > > }
> > >
> > > The important thing here is r->pool. This is a memory pool which
> > > exists for the life of a request. When you allocate memory from the
> > > memory pool, it will only be reclaimed at the end of the request. I.e.,
> > > it isn't like malloc/free where you can give back a block of memory
> > > using free and it will be reusable straight away.
> > >
> > > Now, every time a flush is occurring, it needs to create a bucket
> > > brigade which holds a special flush object. This is then passed down
> > > through the output filters. In this case it causes any pending data to
> > > be flushed out.
> > >
> > > Now because the memory for those bucket structures is only reclaimed
> > > at the end of the request, it means if you have a very long running
> > > request which outputs data in small blocks, then there will be an
> > > incremental use of memory because of the need to create the bucket
> > > structures. This memory will be unavailable for use by anything else
> > > until the request ends.
> > >
> > > FWIW, I didn't realise Apache did this either. I can see there are
> > > potentially good reasons for it being done this way, but it still was a
> > > surprise.
> > >
> > > Graham
> > >
> > > On 03/07/07, yubing <trueice at gmail.com> wrote:
> > > > oh, sorry, that's my typo in the mail, it should have been:
> > > > ------------------------------
> > > > def pump_file(req):
> > > >     fp = open("/dev/zero", "r")
> > > >     while(True):
> > > >         buf = fp.read(4096)
> > > >         try:
> > > >             req.write(buf)
> > > >             time.sleep(0.1)
> > > >         except:
> > > >             fp.close()
> > > >             break
> > > > -------------------------------
> > > > you can still observe the memory going up slowly
> > > > BTW: I'm using the prefork mpm of apache
> > > >
> > > > > On 7/3/07, Graham Dumpleton <graham.dumpleton at gmail.com> wrote:
> > > > > > On 03/07/07, yubing <trueice at gmail.com> wrote:
> > > > > > Our project has a live HTTP video streamer written in Python, which
> > > > > > keeps pumping a stream out to the client.
> > > > > > The HTTP serving module is a simple mod_python request handler running
> > > > > > on Apache 2.2.4 with mod_python 3.3.1 (Python 2.5.1).
> > > > > > The stream is read out of our streaming server via a TCP socket, and
> > > > > > the Python script just does some simple processing like header
> > > > > > building; each allocated buffer is del-ed after being used.
> > > > > >
> > > > > > The problem is:
> > > > > > We observed that after running for several hours, its memory occupation
> > > > > > grows to several hundred megabytes and keeps growing in 4k-8k
> > > > > > increments every 1-2 seconds.
> > > > > >
> > > > > > Below is a simple testing scriptlet; the memory-leaking issue is not as
> > > > > > serious as in our live serving module, but you can still observe 4k of
> > > > > > memory growth every several seconds.
> > > > > >
> > > > > > Could anyone help me to figure out the root cause of this issue?
> > > > > >
> > > > > > --------------------------
> > > > > > import time
> > > > > >
> > > > > > def pump_file(req):
> > > > > >     while(True):
> > > > > >         fp = open("/dev/zero", "r")
> > > > > >         buf = fp.read(4096)
> > > > > >         try:
> > > > > >             req.write(buf)
> > > > > >             del buf
> > > > > >             time.sleep(0.1)
> > > > > >         except:
> > > > > >             fp.close()
> > > > > >             break
> > > > >
> > > > > BTW, you do realise that you open the file in the loop when it should
> > > > > be outside.
> > > > >
> > > > > Graham
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > truly yours
> > > > ice
> > >
> >
> >
> >
> > --
> > truly yours
> > ice
>



-- 
truly yours
ice