Nicolas Lehuen
nicolas at lehuen.com
Sat Oct 16 17:56:40 EDT 2004
Just as a note, I did try to use my fix to apache.py without your fix to mod_python.c, and I do get multiple reloadings of the publisher handler due to the creation of multiple Python interpreter. So the two fixes are both required. Thanks again Graham ! BTW, Grisha, I now have two patches to apply to mod_python to make it usable in my production environment : 1) My patch to requestobject.c to make it garbage-collection friendly 2) Graham's patch to mod_python.c and my patch to apache.py to make it multi-thread safe. I have built a private subversion repository to track the official mod_python CVS on the one side, and those patches I can't work without on the other side. Would it be possible to integrate those patches in the CVS instead ? Please, pretty please ? Regards, Nicolas > -----Message d'origine----- > De : Nicolas Lehuen [mailto:nicolas at lehuen.com] > Envoyé : samedi 16 octobre 2004 16:47 > À : 'Nicolas Lehuen'; 'Graham Dumpleton' > Cc : 'mod_python at modpython.org' > Objet : RE: [mod_python] RE: Bug in mod_python.c causing > multiple/redundantinterpreter creation. > > Graham, I recompiled mod_python with your fixes, but > unfortunately I still have the problem. Let me explain you > how I get it. > > I have a pretty simple debug page published by > mod_python.publisher. I use ApacheBenchmark to have 50 > concurrent connections to my Apache server just after the > Apache server is started, as follows : > > c:\apache\bin\ab.exe -c 50 -n 1000 http://localhost/python/index.py > > What I get in the error log is joined to this message (see > error.1.log). > > If I don't have concurrent connections, mod_python behaves as > expected (see error.2.log) : > > c:\apache\bin\ab.exe -c 1 -n 1000 http://localhost/python/index.py > > Note the exception in error.1.log : mod_python.publisher is > reloaded many times. I think that your patch may solve a > problem with mod_python creating multiple interpreters with > the same name, but there is stil a problem in the > mod_python.apache module, since it reloads the same publisher > many times. > > Indeed, having a look at apache.py, I didn't see any locking > done to prevent the same module from being reloaded multiple > times concurrently. What we need is to put a lock in the > import_module method, so that two threads cannot check the > freshness of the publisher simultaneously. This can be done > in a rather brutal way by calling imp.acquire_lock() at the > beginning of import_module, put a try/finally block around > the code of the function and call imp.release_lock() in the > finally block : > > def import_module(module_name, autoreload=1, log=0, path=None): > """ > Get the module to handle the request. If > autoreload is on, then the module will be reloaded > if it has changed since the last import. > """ > > imp.acquire_lock() > try: > # previous code of apache.import_module > finally: > imp.release_lock() > > I tested this, and it works as expected. BTW, maybe it > wouldn't work without your patch, Graham, since I could > suddenly see two interpreters with their own copy of the > mod_python.apache module. But the two fixes are required for > it to work properly. > > Why did I wrote "in a rather brutal way" above ? Because this > locking scheme is indeed brutal. It is a global lock, so each > and every thread who want to get its handler must acquire it > for a brief amount of time. The problem is that if a handler > module is reloaded, it locks each and every other thread > during its reloading, even if they should not be handled by > the same module. A smarter way to lock would be to use a > two-levels locking scheme, as I described in my caching > recipe in the Python Cookbook. I can implement it there if > it's ok for everybody. > > Regards, > Nicolas > > > -----Message d'origine----- > > De : mod_python-bounces at modpython.org > > [mailto:mod_python-bounces at modpython.org] De la part de > Nicolas Lehuen > > Envoyé : samedi 16 octobre 2004 15:41 À : 'Graham Dumpleton' > > Cc : mod_python at modpython.org > > Objet : [mod_python] RE: Bug in mod_python.c causing > > multiple/redundantinterpreter creation. > > > > Thanks Graham, that's great news that you managed to find > the issue. > > I'll see if I can make a build for Win32 with your patch > and tell you > > if it works for me. > > > > Regards, > > > > Nicolas > > > > > -----Message d'origine----- > > > De : Graham Dumpleton [mailto:grahamd at dscpl.com.au] Envoyé > > : samedi 16 > > > octobre 2004 15:05 À : Nicolas Lehuen Cc : 'Daniel Popowich'; > > > mod_python at modpython.org Objet : Bug in mod_python.c causing > > > multiple/redundant interpreter creation. > > > > > > > > > On 09/10/2004, at 6:10 PM, Nicolas Lehuen wrote: > > > > > > > Hi Graham, what are the pros and cons of using the imp > > > module versus > > > > an execfile call or an exec call ? > > > > > > > > Using Apache 2.0.52 on win32, with a multithreaded MPM, I > > > have noticed > > > > a few problem with mod_python.publisher which reloads the > > > same module > > > > many times when many threads try to load it concurrently. > > It seems > > > > that either mod_python.publisher or the imp module does not > > > properly > > > > handle this situation. The result is that I have many > > > instances of the > > > > same module in memory, so I got many connection pools > > > instead of only > > > > one and so on. > > > > That's > > > > why I decided to fix this and other problems (the famous > > 'only one > > > > index.py in your application') by reimplementing my own > > > variant of the > > > > publisher. > > > > > > Found it. > > > > > > From what I can tell, there is actually a bug in > mod_python.c which > > > is causing the problem you describe as "reloads the same > module many > > > times when many threads try to load it concurrently". > > > > > > I am going to describe it quickly as it is getting late for > > me and I > > > need > > > some sleep. Have attached a patch for people to evaluate and try. > > > Note that the patch also includes a Mac OS X specific > fix. That fix > > > is harmless in itself so just leave it in. The important > changes are > > > where APR mutex is created and then locked/unlocked. > > > > > > The basic problem revolves around the Python dictionary > used to hold > > > the set of interpreters. The code in mod_python.c is > trying to use > > > the Python GIL to provide exclusive access to that dictionary and > > > any > > subsequent > > > creation of an interpreter. > > > > > > The only catch is that in creating a new interpreter, the Python > > > core is, in someway I don't understand, swapping thread states at > > some point > > > which > > > is allowing other threads to acquire the GIL. This is > > allowing other > > > threads > > > to check the set of interpreters for the same interpreter that is > > > currently being constructed and since it isn't there yet, > then also > > goes onto > > > create > > > it as well. > > > > > > The result is you get many threads in the same process > creating the > > > same named interpreter and loading the top level handler > defined by > > > the directive PythonHandler (or other earlier handler). > Thus you see > > > all > > the logged > > > messages about (re)importing the handler, as each thread > has managed > > > to get to the point of doing it. > > > > > > What should have happened is that if the lock was truly > > exclusive, the > > > subsequent threads would have waited until the first had actually > > > finished creating the interpreter in the first place. > > > > > > The patch for the problem is below. What I do is create > an APR mutex > > > and lock that around the outside of the acquisition of the GIL, > > the GIL > > > still > > > being required. > > > > > > In all the debugging I have in place in my code, I still note > > > something else not quite right in respect of the mod_python.apache > > module being > > > reloaded more than once by a thread. I'll dig further into that > > > tomorrow. > > > > > > > > > > > > *** mod_python.c.dist Fri Sep 24 15:11:32 2004 > > > --- mod_python.c Sat Oct 16 22:36:30 2004 > > > *************** > > > *** 31,36 **** > > > --- 31,38 ---- > > > * (In a Python dictionary) */ > > > static PyObject * interpreters = NULL; > > > > > > + static apr_thread_mutex_t* interpreters_lock = 0; > > > + > > > apr_pool_t *child_init_pool = NULL; > > > > > > /** > > > *************** > > > *** 124,129 **** > > > --- 126,133 ---- > > > name = MAIN_INTERPRETER; > > > > > > #ifdef WITH_THREAD > > > + apr_thread_mutex_lock(interpreters_lock); > > > + > > > PyEval_AcquireLock(); > > > #endif > > > > > > *************** > > > *** 149,154 **** > > > --- 153,160 ---- > > > > > > #ifdef WITH_THREAD > > > PyEval_ReleaseLock(); > > > + > > > + apr_thread_mutex_unlock(interpreters_lock); > > > #endif > > > > > > if (! idata) { > > > *************** > > > *** 469,474 **** > > > --- 475,483 ---- > > > const char *userdata_key = "python_init"; > > > apr_status_t rc; > > > > > > + /* fudge for Mac OS X with Apache 1.3 where > > Py_IsInitialized() > > > broke */ > > > + static int initialized = 0; > > > + > > > apr_pool_userdata_get(&data, userdata_key, > s->process->pool); > > > if (!data) { > > > apr_pool_userdata_set((const void *)1, userdata_key, > > > *************** > > > *** 490,502 **** > > > } > > > > > > /* initialize global Python interpreter if necessary */ > > > ! if (! Py_IsInitialized()) > > > { > > > > > > /* initialze the interpreter */ > > > Py_Initialize(); > > > > > > #ifdef WITH_THREAD > > > /* create and acquire the interpreter lock */ > > > PyEval_InitThreads(); > > > #endif > > > --- 499,514 ---- > > > } > > > > > > /* initialize global Python interpreter if necessary */ > > > ! if (initialized == 0 || ! Py_IsInitialized()) > > > { > > > + initialized = 1; > > > > > > /* initialze the interpreter */ > > > Py_Initialize(); > > > > > > #ifdef WITH_THREAD > > > + > > > apr_thread_mutex_create(&interpreters_lock,APR_THREAD_MUTEX_UN > > > NESTED,p); > > > + > > > /* create and acquire the interpreter lock */ > > > PyEval_InitThreads(); > > > #endif > > > > > > > > > > > > > > > > > > -- > > > Graham Dumpleton (grahamd at dscpl.com.au) > > > > > > > > > > > > _______________________________________________ > > Mod_python mailing list > > Mod_python at modpython.org > > http://mailman.modpython.org/mailman/listinfo/mod_python > >
|