[mod_python] RE: Bug in mod_python.c causing multiple/redundantinterpreter creation.

Nicolas Lehuen nicolas at lehuen.com
Sat Oct 16 17:56:40 EDT 2004


Just as a note, I did try to use my fix to apache.py without your fix to
mod_python.c, and I do get multiple reloadings of the publisher handler due
to the creation of multiple Python interpreter. So the two fixes are both
required. Thanks again Graham !

BTW, Grisha, I now have two patches to apply to mod_python to make it usable
in my production environment :

1) My patch to requestobject.c to make it garbage-collection friendly
2) Graham's patch to mod_python.c and my patch to apache.py to make it
multi-thread safe.

I have built a private subversion repository to track the official
mod_python CVS on the one side, and those patches I can't work without on
the other side. Would it be possible to integrate those patches in the CVS
instead ? Please, pretty please ?

Regards,
Nicolas 

> -----Message d'origine-----
> De : Nicolas Lehuen [mailto:nicolas at lehuen.com] 
> Envoyé : samedi 16 octobre 2004 16:47
> À : 'Nicolas Lehuen'; 'Graham Dumpleton'
> Cc : 'mod_python at modpython.org'
> Objet : RE: [mod_python] RE: Bug in mod_python.c causing 
> multiple/redundantinterpreter creation.
> 
> Graham, I recompiled mod_python with your fixes, but 
> unfortunately I still have the problem. Let me explain you 
> how I get it.
> 
> I have a pretty simple debug page published by 
> mod_python.publisher. I use ApacheBenchmark to have 50 
> concurrent connections to my Apache server just after the 
> Apache server is started, as follows :
> 
> c:\apache\bin\ab.exe -c 50 -n 1000 http://localhost/python/index.py
> 
> What I get in the error log is joined to this message (see 
> error.1.log).
> 
> If I don't have concurrent connections, mod_python behaves as 
> expected (see error.2.log) :
> 
> c:\apache\bin\ab.exe -c 1 -n 1000 http://localhost/python/index.py
> 
> Note the exception in error.1.log : mod_python.publisher is 
> reloaded many times. I think that your patch may solve a 
> problem with mod_python creating multiple interpreters with 
> the same name, but there is stil a problem in the 
> mod_python.apache module, since it reloads the same publisher 
> many times.
> 
> Indeed, having a look at apache.py, I didn't see any locking 
> done to prevent the same module from being reloaded multiple 
> times concurrently. What we need is to put a lock in the 
> import_module method, so that two threads cannot check the 
> freshness of the publisher simultaneously. This can be done 
> in a rather brutal way by calling imp.acquire_lock() at the 
> beginning of import_module, put a try/finally block around 
> the code of the function and call imp.release_lock() in the 
> finally block : 
> 
> def import_module(module_name, autoreload=1, log=0, path=None):
>     """
>     Get the module to handle the request. If
>     autoreload is on, then the module will be reloaded
>     if it has changed since the last import.
>     """
> 
>     imp.acquire_lock()
>     try:
> 	# previous code of apache.import_module
>     finally:
>         imp.release_lock()
> 
> I tested this, and it works as expected. BTW, maybe it 
> wouldn't work without your patch, Graham, since I could 
> suddenly see two interpreters with their own copy of the 
> mod_python.apache module. But the two fixes are required for 
> it to work properly.
> 
> Why did I wrote "in a rather brutal way" above ? Because this 
> locking scheme is indeed brutal. It is a global lock, so each 
> and every thread who want to get its handler must acquire it 
> for a brief amount of time. The problem is that if a handler 
> module is reloaded, it locks each and every other thread 
> during its reloading, even if they should not be handled by 
> the same module. A smarter way to lock would be to use a 
> two-levels locking scheme, as I described in my caching 
> recipe in the Python Cookbook. I can implement it there if 
> it's ok for everybody.
> 
> Regards,
> Nicolas
> 
> > -----Message d'origine-----
> > De : mod_python-bounces at modpython.org 
> > [mailto:mod_python-bounces at modpython.org] De la part de 
> Nicolas Lehuen 
> > Envoyé : samedi 16 octobre 2004 15:41 À : 'Graham Dumpleton'
> > Cc : mod_python at modpython.org
> > Objet : [mod_python] RE: Bug in mod_python.c causing 
> > multiple/redundantinterpreter creation.
> > 
> > Thanks Graham, that's great news that you managed to find 
> the issue. 
> > I'll see if I can make a build for Win32 with your patch 
> and tell you 
> > if it works for me.
> > 
> > Regards,
> > 
> > Nicolas
> > 
> > > -----Message d'origine-----
> > > De : Graham Dumpleton [mailto:grahamd at dscpl.com.au] Envoyé
> > : samedi 16
> > > octobre 2004 15:05 À : Nicolas Lehuen Cc : 'Daniel Popowich'; 
> > > mod_python at modpython.org Objet : Bug in mod_python.c causing 
> > > multiple/redundant interpreter creation.
> > > 
> > > 
> > > On 09/10/2004, at 6:10 PM, Nicolas Lehuen wrote:
> > > 
> > > > Hi Graham, what are the pros and cons of using the imp
> > > module versus
> > > > an execfile call or an exec call ?
> > > >
> > > > Using Apache 2.0.52 on win32, with a multithreaded MPM, I
> > > have noticed
> > > > a few problem with mod_python.publisher which reloads the
> > > same module
> > > > many times when many threads try to load it concurrently. 
> > It seems
> > > > that either mod_python.publisher or the imp module does not
> > > properly
> > > > handle this situation. The result is that I have many
> > > instances of the
> > > > same module in memory, so I got many connection pools
> > > instead of only
> > > > one and so on.
> > > > That's
> > > > why I decided to fix this and other problems (the famous
> > 'only one
> > > > index.py in your application') by reimplementing my own
> > > variant of the
> > > > publisher.
> > > 
> > > Found it.
> > > 
> > >  From what I can tell, there is actually a bug in 
> mod_python.c which 
> > > is causing the problem you describe as "reloads the same 
> module many 
> > > times when many threads try to load it concurrently".
> > > 
> > > I am going to describe it quickly as it is getting late for
> > me and I
> > > need
> > > some sleep. Have attached a patch for people to evaluate and try. 
> > > Note that the patch also includes a Mac OS X specific 
> fix. That fix 
> > > is harmless in itself so just leave it in. The important 
> changes are 
> > > where APR mutex is created and then locked/unlocked.
> > > 
> > > The basic problem revolves around the Python dictionary 
> used to hold 
> > > the set of interpreters. The code in mod_python.c is 
> trying to use 
> > > the Python GIL to provide exclusive access to that dictionary and 
> > > any
> > subsequent
> > > creation of an interpreter.
> > > 
> > > The only catch is that in creating a new interpreter, the Python 
> > > core is, in someway I don't understand, swapping thread states at
> > some point
> > > which
> > > is allowing other threads to acquire the GIL. This is
> > allowing other
> > > threads
> > > to check the set of interpreters for the same interpreter that is 
> > > currently being constructed and since it isn't there yet, 
> then also
> > goes onto
> > > create
> > > it as well.
> > > 
> > > The result is you get many threads in the same process 
> creating the 
> > > same named interpreter and loading the top level handler 
> defined by 
> > > the directive PythonHandler (or other earlier handler). 
> Thus you see 
> > > all
> > the logged
> > > messages about (re)importing the handler, as each thread 
> has managed 
> > > to get to the point of doing it.
> > > 
> > > What should have happened is that if the lock was truly
> > exclusive, the
> > > subsequent threads would have waited until the first had actually 
> > > finished creating the interpreter in the first place.
> > > 
> > > The patch for the problem is below. What I do is create 
> an APR mutex 
> > > and lock that around the outside of the acquisition of the GIL,
> > the GIL
> > > still
> > > being required.
> > > 
> > > In all the debugging I have in place in my code, I still note 
> > > something else not quite right in respect of the mod_python.apache
> > module being
> > > reloaded more than once by a thread. I'll dig further into that 
> > > tomorrow.
> > > 
> > > 
> > > 
> > > *** mod_python.c.dist	Fri Sep 24 15:11:32 2004
> > > --- mod_python.c	Sat Oct 16 22:36:30 2004
> > > ***************
> > > *** 31,36 ****
> > > --- 31,38 ----
> > >     * (In a Python dictionary) */
> > >    static PyObject * interpreters = NULL;
> > > 
> > > + static apr_thread_mutex_t* interpreters_lock = 0;
> > > +
> > >    apr_pool_t *child_init_pool = NULL;
> > > 
> > >    /**
> > > ***************
> > > *** 124,129 ****
> > > --- 126,133 ----
> > >            name = MAIN_INTERPRETER;
> > > 
> > >    #ifdef WITH_THREAD
> > > +     apr_thread_mutex_lock(interpreters_lock);
> > > +
> > >        PyEval_AcquireLock();
> > >    #endif
> > > 
> > > ***************
> > > *** 149,154 ****
> > > --- 153,160 ----
> > > 
> > >    #ifdef WITH_THREAD
> > >        PyEval_ReleaseLock();
> > > +
> > > +     apr_thread_mutex_unlock(interpreters_lock);
> > >    #endif
> > > 
> > >        if (! idata) {
> > > ***************
> > > *** 469,474 ****
> > > --- 475,483 ----
> > >        const char *userdata_key = "python_init";
> > >        apr_status_t rc;
> > > 
> > > +     /* fudge for Mac OS X with Apache 1.3 where
> > Py_IsInitialized()
> > > broke */
> > > +     static int initialized = 0;
> > > +
> > >        apr_pool_userdata_get(&data, userdata_key, 
> s->process->pool);
> > >        if (!data) {
> > >            apr_pool_userdata_set((const void *)1, userdata_key,
> > > ***************
> > > *** 490,502 ****
> > >        }
> > > 
> > >        /* initialize global Python interpreter if necessary */
> > > !     if (! Py_IsInitialized())
> > >        {
> > > 
> > >            /* initialze the interpreter */
> > >            Py_Initialize();
> > > 
> > >    #ifdef WITH_THREAD
> > >            /* create and acquire the interpreter lock */
> > >            PyEval_InitThreads();
> > >    #endif
> > > --- 499,514 ----
> > >        }
> > > 
> > >        /* initialize global Python interpreter if necessary */
> > > !     if (initialized == 0 || ! Py_IsInitialized())
> > >        {
> > > +         initialized = 1;
> > > 
> > >            /* initialze the interpreter */
> > >            Py_Initialize();
> > > 
> > >    #ifdef WITH_THREAD
> > > +          
> > > apr_thread_mutex_create(&interpreters_lock,APR_THREAD_MUTEX_UN
> > > NESTED,p);
> > > +
> > >            /* create and acquire the interpreter lock */
> > >            PyEval_InitThreads();
> > >    #endif
> > > 
> > > 
> > > 
> > > 
> > > 
> > > --
> > > Graham Dumpleton (grahamd at dscpl.com.au)
> > > 
> > > 
> > 
> > 
> > _______________________________________________
> > Mod_python mailing list
> > Mod_python at modpython.org
> > http://mailman.modpython.org/mailman/listinfo/mod_python
> > 




More information about the Mod_python mailing list