[mod_python] RE: Bug in mod_python.c causing multiple/redundantinterpreter creation.

Nicolas Lehuen nicolas at lehuen.com
Sat Oct 16 17:46:31 EDT 2004


Graham, I recompiled mod_python with your fixes, but unfortunately I still
have the problem. Let me explain you how I get it.

I have a pretty simple debug page published by mod_python.publisher. I use
ApacheBenchmark to have 50 concurrent connections to my Apache server just
after the Apache server is started, as follows :

c:\apache\bin\ab.exe -c 50 -n 1000 http://localhost/python/index.py

What I get in the error log is joined to this message (see error.1.log).

If I don't have concurrent connections, mod_python behaves as expected (see
error.2.log) :

c:\apache\bin\ab.exe -c 1 -n 1000 http://localhost/python/index.py

Note the exception in error.1.log : mod_python.publisher is reloaded many
times. I think that your patch may solve a problem with mod_python creating
multiple interpreters with the same name, but there is stil a problem in the
mod_python.apache module, since it reloads the same publisher many times.

Indeed, having a look at apache.py, I didn't see any locking done to prevent
the same module from being reloaded multiple times concurrently. What we
need is to put a lock in the import_module method, so that two threads
cannot check the freshness of the publisher simultaneously. This can be done
in a rather brutal way by calling imp.acquire_lock() at the beginning of
import_module, put a try/finally block around the code of the function and
call imp.release_lock() in the finally block : 

def import_module(module_name, autoreload=1, log=0, path=None):
    """
    Get the module to handle the request. If
    autoreload is on, then the module will be reloaded
    if it has changed since the last import.
    """

    imp.acquire_lock()
    try:
	# previous code of apache.import_module
    finally:
        imp.release_lock()

I tested this, and it works as expected. BTW, maybe it wouldn't work without
your patch, Graham, since I could suddenly see two interpreters with their
own copy of the mod_python.apache module. But the two fixes are required for
it to work properly.

Why did I wrote "in a rather brutal way" above ? Because this locking scheme
is indeed brutal. It is a global lock, so each and every thread who want to
get its handler must acquire it for a brief amount of time. The problem is
that if a handler module is reloaded, it locks each and every other thread
during its reloading, even if they should not be handled by the same module.
A smarter way to lock would be to use a two-levels locking scheme, as I
described in my caching recipe in the Python Cookbook. I can implement it
there if it's ok for everybody.

Regards,
Nicolas

> -----Message d'origine-----
> De : mod_python-bounces at modpython.org 
> [mailto:mod_python-bounces at modpython.org] De la part de Nicolas Lehuen
> Envoyé : samedi 16 octobre 2004 15:41
> À : 'Graham Dumpleton'
> Cc : mod_python at modpython.org
> Objet : [mod_python] RE: Bug in mod_python.c causing 
> multiple/redundantinterpreter creation.
> 
> Thanks Graham, that's great news that you managed to find the 
> issue. I'll see if I can make a build for Win32 with your 
> patch and tell you if it works for me.
> 
> Regards,
> 
> Nicolas 
> 
> > -----Message d'origine-----
> > De : Graham Dumpleton [mailto:grahamd at dscpl.com.au] Envoyé 
> : samedi 16 
> > octobre 2004 15:05 À : Nicolas Lehuen Cc : 'Daniel Popowich'; 
> > mod_python at modpython.org Objet : Bug in mod_python.c causing 
> > multiple/redundant interpreter creation.
> > 
> > 
> > On 09/10/2004, at 6:10 PM, Nicolas Lehuen wrote:
> > 
> > > Hi Graham, what are the pros and cons of using the imp 
> > module versus 
> > > an execfile call or an exec call ?
> > >
> > > Using Apache 2.0.52 on win32, with a multithreaded MPM, I 
> > have noticed 
> > > a few problem with mod_python.publisher which reloads the 
> > same module 
> > > many times when many threads try to load it concurrently. 
> It seems 
> > > that either mod_python.publisher or the imp module does not 
> > properly 
> > > handle this situation. The result is that I have many 
> > instances of the 
> > > same module in memory, so I got many connection pools 
> > instead of only 
> > > one and so on.
> > > That's
> > > why I decided to fix this and other problems (the famous 
> 'only one 
> > > index.py in your application') by reimplementing my own 
> > variant of the 
> > > publisher.
> > 
> > Found it.
> > 
> >  From what I can tell, there is actually a bug in 
> > mod_python.c which is
> > causing the problem you describe as "reloads the same module 
> > many times
> > when many threads try to load it concurrently".
> > 
> > I am going to describe it quickly as it is getting late for 
> me and I  
> > need
> > some sleep. Have attached a patch for people to evaluate and 
> > try. Note  
> > that
> > the patch also includes a Mac OS X specific fix. That fix is 
> > harmless in
> > itself so just leave it in. The important changes are where 
> > APR mutex is
> > created and then locked/unlocked.
> > 
> > The basic problem revolves around the Python dictionary used 
> > to hold the
> > set of interpreters. The code in mod_python.c is trying to use the  
> > Python
> > GIL to provide exclusive access to that dictionary and any 
> subsequent
> > creation of an interpreter.
> > 
> > The only catch is that in creating a new interpreter, the 
> > Python core  
> > is,
> > in someway I don't understand, swapping thread states at 
> some point  
> > which
> > is allowing other threads to acquire the GIL. This is 
> allowing other  
> > threads
> > to check the set of interpreters for the same interpreter that is  
> > currently
> > being constructed and since it isn't there yet, then also 
> goes onto  
> > create
> > it as well.
> > 
> > The result is you get many threads in the same process 
> > creating the same
> > named interpreter and loading the top level handler defined by the  
> > directive
> > PythonHandler (or other earlier handler). Thus you see all 
> the logged
> > messages about (re)importing the handler, as each thread has 
> > managed to
> > get to the point of doing it.
> > 
> > What should have happened is that if the lock was truly 
> exclusive, the
> > subsequent threads would have waited until the first had actually  
> > finished
> > creating the interpreter in the first place.
> > 
> > The patch for the problem is below. What I do is create an 
> > APR mutex and
> > lock that around the outside of the acquisition of the GIL, 
> the GIL  
> > still
> > being required.
> > 
> > In all the debugging I have in place in my code, I still note 
> > something
> > else not quite right in respect of the mod_python.apache 
> module being
> > reloaded more than once by a thread. I'll dig further into that  
> > tomorrow.
> > 
> > 
> > 
> > *** mod_python.c.dist	Fri Sep 24 15:11:32 2004
> > --- mod_python.c	Sat Oct 16 22:36:30 2004
> > ***************
> > *** 31,36 ****
> > --- 31,38 ----
> >     * (In a Python dictionary) */
> >    static PyObject * interpreters = NULL;
> > 
> > + static apr_thread_mutex_t* interpreters_lock = 0;
> > +
> >    apr_pool_t *child_init_pool = NULL;
> > 
> >    /**
> > ***************
> > *** 124,129 ****
> > --- 126,133 ----
> >            name = MAIN_INTERPRETER;
> > 
> >    #ifdef WITH_THREAD
> > +     apr_thread_mutex_lock(interpreters_lock);
> > +
> >        PyEval_AcquireLock();
> >    #endif
> > 
> > ***************
> > *** 149,154 ****
> > --- 153,160 ----
> > 
> >    #ifdef WITH_THREAD
> >        PyEval_ReleaseLock();
> > +
> > +     apr_thread_mutex_unlock(interpreters_lock);
> >    #endif
> > 
> >        if (! idata) {
> > ***************
> > *** 469,474 ****
> > --- 475,483 ----
> >        const char *userdata_key = "python_init";
> >        apr_status_t rc;
> > 
> > +     /* fudge for Mac OS X with Apache 1.3 where 
> Py_IsInitialized()  
> > broke */
> > +     static int initialized = 0;
> > +
> >        apr_pool_userdata_get(&data, userdata_key, s->process->pool);
> >        if (!data) {
> >            apr_pool_userdata_set((const void *)1, userdata_key,
> > ***************
> > *** 490,502 ****
> >        }
> > 
> >        /* initialize global Python interpreter if necessary */
> > !     if (! Py_IsInitialized())
> >        {
> > 
> >            /* initialze the interpreter */
> >            Py_Initialize();
> > 
> >    #ifdef WITH_THREAD
> >            /* create and acquire the interpreter lock */
> >            PyEval_InitThreads();
> >    #endif
> > --- 499,514 ----
> >        }
> > 
> >        /* initialize global Python interpreter if necessary */
> > !     if (initialized == 0 || ! Py_IsInitialized())
> >        {
> > +         initialized = 1;
> > 
> >            /* initialze the interpreter */
> >            Py_Initialize();
> > 
> >    #ifdef WITH_THREAD
> > +          
> > apr_thread_mutex_create(&interpreters_lock,APR_THREAD_MUTEX_UN
> > NESTED,p);
> > +
> >            /* create and acquire the interpreter lock */
> >            PyEval_InitThreads();
> >    #endif
> > 
> > 
> > 
> > 
> > 
> > --
> > Graham Dumpleton (grahamd at dscpl.com.au)
> > 
> > 
> 
> 
> _______________________________________________
> Mod_python mailing list
> Mod_python at modpython.org
> http://mailman.modpython.org/mailman/listinfo/mod_python
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: error.2.log
Type: application/octet-stream
Size: 986 bytes
Desc: not available
Url : http://modpython.org/pipermail/mod_python/attachments/20041016/6f0a27b6/error.2-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: error.1.log
Type: application/octet-stream
Size: 13154 bytes
Desc: not available
Url : http://modpython.org/pipermail/mod_python/attachments/20041016/6f0a27b6/error.1-0001.obj


More information about the Mod_python mailing list