[mod_python] Threading and apache.import_module

michael bayer mike_mp at zzzcomputing.com
Sun Jun 19 11:58:34 EDT 2005


I just searched around for this, and hey, you're right, it does.
Myghty imports "in-memory" modules in a similar way (i.e. via
imp.new_module + exec)...but I hadn't thought of just bypassing
sys.modules altogether.  A "request-level" module really doesn't need
to be generally importable, so that now makes sense.
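
A minimal sketch of the idea, assuming a current Python where
types.ModuleType stands in for imp.new_module. The function name
make_request_module and the "page_index" module are made up for
illustration and are not taken from Myghty or mod_python:

```python
import sys
import types


def make_request_module(name, source, filename="<request>"):
    """Build a module from source text without touching sys.modules."""
    module = types.ModuleType(name)        # plays the role of imp.new_module
    module.__file__ = filename
    code = compile(source, filename, "exec")
    exec(code, module.__dict__)            # run the source in the module's namespace
    return module


page = make_request_module("page_index", "def handler():\n    return 'hello'\n")
print(page.handler())
print("page_index" in sys.modules)         # the module was never registered
```

Since nothing is registered, a plain "import page_index" elsewhere
will not find this module; whoever holds the reference owns its
lifetime.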

I would most like to find a way to do this while also taking
advantage of .pyc files.  It seems like it would go: create the .pyc
file if it doesn't exist; if it does exist, read four bytes (the magic
number) plus one long value (the source timestamp), then use
marshal.load() to create a code object, then exec that.
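
Roughly what I have in mind, as a sketch only.  It keeps a private
cache file with the same "magic + timestamp" eight-byte header
described above; that is an assumption made for illustration and is
not the layout newer CPython releases use for their own .pyc files
(those have a longer header), so the cache path should not be a real
__pycache__ entry.  The names load_code and load_cached_module are
made up:

```python
import importlib.util
import marshal
import os
import struct
import types

MAGIC = importlib.util.MAGIC_NUMBER   # four bytes identifying the bytecode version


def load_code(source_path, cache_path):
    """Return a code object for source_path, using cache_path when it is fresh."""
    mtime = int(os.stat(source_path).st_mtime) & 0xFFFFFFFF
    header = MAGIC + struct.pack("<I", mtime)      # four bytes + one long value
    try:
        with open(cache_path, "rb") as f:
            if f.read(8) == header:
                return marshal.load(f)             # cache hit: no recompile
    except (OSError, EOFError, ValueError, TypeError):
        pass                                       # missing or corrupt cache: rebuild
    with open(source_path, "r") as f:
        code = compile(f.read(), source_path, "exec")
    with open(cache_path, "wb") as f:
        f.write(header)
        marshal.dump(code, f)
    return code


def load_cached_module(name, source_path, cache_path):
    """Exec the (possibly cached) code into a fresh module outside sys.modules."""
    module = types.ModuleType(name)
    module.__file__ = source_path
    exec(load_code(source_path, cache_path), module.__dict__)
    return module
```

The cache is rewritten whenever the magic number or the source
timestamp stops matching, which is the same staleness test the
header exists for.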


On Jun 19, 2005, at 4:19 AM, Nicolas Lehuen wrote:

> Just as a note, those issues have been thoroughly covered by Graham
> Dumpleton in his Vampire project, and some of his ideas have been
> retrofitted into the development version of mod_python.
>
> Regards,
> Nicolas
>
> 2005/6/18, michael bayer <mike_mp at zzzcomputing.com>:
>
>> I use the "two-check" approach to this, which allows multiple threads
>> to get at an already-imported resource without blocking each other,
>> while allowing the actual import to occur in a threadsafe manner.
>> Suppose every access of a module calls "verify_module", which
>> reloads the module if needed, using the function _needs_reload to
>> perform the check and _reload_module to do the actual reload:
>>
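>>     import threading
>>
>>     # one process-wide re-entrant lock guarding all reloads
>>     mutex = threading.RLock()
>>
>>     def verify_module(module):
>>         # first check: no lock taken on the common path
>>         if _needs_reload(module):
>>             mutex.acquire()
>>             try:
>>                 # second check: another thread may have reloaded already
>>                 if _needs_reload(module):
>>                     _reload_module(module)
>>             finally:
>>                 mutex.release()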
>>
>> In the normal case, threads call _needs_reload() without
>> synchronization and return, so no bottleneck occurs.  When a reload
>> is actually required, threads line up on mutex.acquire() and
>> re-check once inside the critical section, so only the first thread
>> in actually performs the reload.
>>
>> An additional optimization is to use a unique mutex for each
>> module; that way, threads that are reloading different modules
>> don't block each other either.
>>
>> The issue of other threads executing code against the module while
>> it's being reloaded is not really addressed here.  As a normal
>> reload() or imp.load_module() doesn't remove any of the old module
>> values, it's usually (but not always) the case that nothing really
>> happens...unless there are persistent pointers to the module's
>> globals lying around elsewhere, which then become outdated.  I did
>> see a reload() implementation that loads a new module under a
>> temporary name, then moves its dictionary over, which attempts to be
>> more atomic.  I haven't tried it, but it's over at
>> http://lists.canonical.org/pipermail/kragen-hacks/2002-January/000302.html .
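
To make the "load elsewhere, then move the dictionary over" idea
concrete, here is a small sketch.  It is not the code from the linked
post; reload_in_place and its arguments are invented for
illustration, and it takes source text directly instead of going
through the import machinery:

```python
import types


def reload_in_place(module, source, filename="<reload>"):
    """Exec new source in a scratch module, then copy its names into `module`."""
    scratch = types.ModuleType(module.__name__)
    scratch.__file__ = filename
    # If this raises, `module` has not been modified at all.
    exec(compile(source, filename, "exec"), scratch.__dict__)
    # Copy everything except dunder metadata in a single update() call.
    fresh = {
        key: value
        for key, value in scratch.__dict__.items()
        if not (key.startswith("__") and key.endswith("__"))
    }
    module.__dict__.update(fresh)
    return module
```

Two caveats: names the new source no longer defines stay behind in
the old module, as with a normal reload(), and functions created by
the exec keep the scratch dictionary as their globals, so this only
narrows the window in which other threads see a half-loaded module.
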
>>
>> On Jun 17, 2005, at 3:09 PM, Dan Eloff wrote:
>>
>>
>>> Well this is the biggest problem in my otherwise threadsafe code.
>>> From what I've read, importing/reloading modules in a threaded
>>> environment is just plain dangerous. There could be any number of
>>> requests executing code from and accessing global variables in a
>>> module when it changes and gets reloaded. Then you have the
>>> two-level problem of what happens to the executing requests, as
>>> well as what happens when multiple incoming requests all spot that
>>> the module has changed, and reload it multiple times in quick
>>> succession, re-executing code in the module scope (this is pretty
>>> much the worst case).
>>>
>>> I know Graham knows a lot more about this than I do (help!).
>>>
>>> I do have a lot of flexibility here, since I'm using my own
>>> publisher I do not have to use apache.import_module. In fact I would
>>> prefer to not have the whole importing/reloading business affect my
>>> code outside of the publisher, that way I can change things
>>> internally at will. Clearly one option is to create a simple import
>>> function that imports once only in a production environment, and
>>> every time in a development env, but I'm still interested in
>>> exploring the options.
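
That simple option could look something like the sketch below.  The
function name and the development flag are hypothetical, and
importlib.reload is used as the present-day spelling of reload():

```python
import importlib
import sys


def import_for_publisher(name, development=False):
    """Import once in production; re-execute the module each call in development."""
    module = sys.modules.get(name)
    if module is None:
        return importlib.import_module(name)   # first import in either mode
    if development:
        return importlib.reload(module)        # same module object, code re-run
    return module                              # production: reuse what is loaded
```

This is not thread-safe on its own; in development mode concurrent
requests can still reload the same module at once.
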
>>>
>>> -Dan
>>>
>>> _______________________________________________
>>> Mod_python mailing list
>>> Mod_python at modpython.org
>>> http://mailman.modpython.org/mailman/listinfo/mod_python
>>>
>>>
>

