Graham Dumpleton
grahamd at dscpl.com.au
Sun Oct 17 10:32:00 EDT 2004
I knew there was an issue with apache.py as well, with there being no locking, but didn't mention it in my email with the fix as I wanted to check that that was the source of the other problem I was seeing and make sure I came up with a proper fix. I was also thinking to suggest using imp.acquire_lock()/imp.release_lock(), but you need to have a fallback for when version of Python older than 2.3 as those functions were only introduced in that version. I'll come up with proper code later, but I was going to suggest something like: if not globals().has_key("_import_lock"): try: from threading import Lock except: class Lock: def acquire(self): pass def release(self): pass _import_lock = Lock() def import_module(module_name, autoreload=1, log=0, path=None): """ Get the module to handle the request. If autoreload is on, then the module will be reloaded if it has changed since the last import. """ _import_lock.acquire() try: # previous code of apache.import_module finally: _import_lock.release() This mess is needed to cater for when threading isn't compiled in to Python, use of an older version of Python that didn't have the imp lock functions and also to avoid risk that apache module might get touched and reloaded and thus lock being overwritten. With the way that the apache module caching uses sys.modules as its cache and doesn't keep a separate data structure, I am not sure that a two level locking mechanism would be a good idea. The lock for a two level scheme would at the moment need to be stored in the module itself, as is __mtime__ at present. Frankly, I reckon the whole thing should be redone using execfile() and a purpose built search routine in place of imp.find_module(). All you loose is the ability to use .pyc/.pyo files. If implemented correctly, you would also loose the restriction in mod_python.publisher about having two copies of the index.py file in different places, but that would actually be a gain. You could optionally, have the support for the __clone__ type method I was talking about before. Ie., in my module reloading mechanism, I load into a fresh module, but a rewrite of the apache scheme, would need to preserve the behaviour of reloading into the existing one otherwise existing code might break. The presence of a __clone__ method could turn that off and instead a new module used in a replacement for the apache scheme. If one is reimplementing it to this extent, then you could introduce a two level locking mechanism. On 17/10/2004, at 12:46 AM, Nicolas Lehuen wrote: > Indeed, having a look at apache.py, I didn't see any locking done to > prevent > the same module from being reloaded multiple times concurrently. What > we > need is to put a lock in the import_module method, so that two threads > cannot check the freshness of the publisher simultaneously. This can > be done > in a rather brutal way by calling imp.acquire_lock() at the beginning > of > import_module, put a try/finally block around the code of the function > and > call imp.release_lock() in the finally block : > > def import_module(module_name, autoreload=1, log=0, path=None): > """ > Get the module to handle the request. If > autoreload is on, then the module will be reloaded > if it has changed since the last import. > """ > > imp.acquire_lock() > try: > # previous code of apache.import_module > finally: > imp.release_lock() > > I tested this, and it works as expected. BTW, maybe it wouldn't work > without > your patch, Graham, since I could suddenly see two interpreters with > their > own copy of the mod_python.apache module. But the two fixes are > required for > it to work properly. > > Why did I wrote "in a rather brutal way" above ? Because this locking > scheme > is indeed brutal. It is a global lock, so each and every thread who > want to > get its handler must acquire it for a brief amount of time. The > problem is > that if a handler module is reloaded, it locks each and every other > thread > during its reloading, even if they should not be handled by the same > module. > A smarter way to lock would be to use a two-levels locking scheme, as I > described in my caching recipe in the Python Cookbook. I can implement > it > there if it's ok for everybody. -- Graham Dumpleton (grahamd at dscpl.com.au)
|