[mod_python] Threading and apache.import_module

Sat Jun 18 15:29:46 EDT 2005

I use the "two-check" approach to this (often called double-checked
locking), which lets multiple threads get at an already-imported
resource without blocking each other, while still letting the actual
import happen in a threadsafe manner. Suppose every access of a module
calls "verify_module", which reloads the module if needed, using the
function _needs_reload to perform the check and _reload_module to do
the actual reload:

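    import threading
    mutex = threading.RLock()

    def verify_module(module):
        # first check: unsynchronized, so the common case never blocks
        if _needs_reload(module):
            mutex.acquire()
            try:
                # second check: another thread may have reloaded already
                if _needs_reload(module):
                    _reload_module(module)
            finally:
                mutex.release()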

In the normal case, threads call _needs_reload() without
synchronization and return, so no bottleneck occurs. When a
reload is actually required, threads line up on mutex.acquire()
and re-check once inside the critical section, so only the first
thread inside actually performs the reload; the others see that it
has already been done and skip it.
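
To make the pattern concrete, here is a minimal runnable sketch. It
assumes Python 3 (importlib.reload rather than the reload() builtin of
the mod_python era), and the bodies of _needs_reload and
_reload_module are hypothetical: they compare the source file's
modification time against the one recorded at the last reload, which
is only one of several ways to decide that a module is stale.

```python
import importlib
import os
import threading

mutex = threading.RLock()
_loaded_mtimes = {}  # module name -> source mtime seen at the last reload


def _needs_reload(module):
    # Hypothetical staleness test: the file's mtime differs from the
    # recorded one. A module never seen before counts as stale, so the
    # first verify_module() call reloads it once.
    recorded = _loaded_mtimes.get(module.__name__)
    return recorded != os.path.getmtime(module.__file__)


def _reload_module(module):
    # Read the mtime before reloading, so an edit that lands during the
    # reload still shows up as stale on the next check.
    mtime = os.path.getmtime(module.__file__)
    importlib.reload(module)
    _loaded_mtimes[module.__name__] = mtime


def verify_module(module):
    if _needs_reload(module):          # first check, no lock taken
        with mutex:
            if _needs_reload(module):  # second check, inside the lock
                _reload_module(module)
```

The unsynchronized first check relies on _needs_reload being cheap and
safe to call concurrently; here it only reads a dict entry and stats a
file.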

An additional optimization is to use a unique mutex for each
module. That way, threads that are reloading different modules
don't block each other either.

The issue of other threads executing code against the module while
it's being reloaded is not really addressed here. As a normal reload()
or imp.load_module() doesn't remove any of the old module values,
it's usually (but not always) the case that nothing really
happens, unless there are persistent references to the module's
globals lying around elsewhere, which then become outdated. I did
see a reload() implementation that loads the new module under a
temporary name, then moves its dictionary over, which attempts to be
more atomic. I haven't tried it, but it's over at
http://lists.canonical.org/pipermail/kragen-hacks/2002-January/000302.html .
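
One possible reading of that "load elsewhere, then move the
dictionary over" idea is sketched below. This is not the linked
implementation, only an illustration: swap_reload is a hypothetical
name, the new source is passed in as a string to keep the example
self-contained, and it assumes Python 3.

```python
import types


def swap_reload(module, source):
    # Run the new source in a scratch module, so a failure part-way
    # through leaves the live module untouched.
    scratch = types.ModuleType(module.__name__)
    # Names present before execution, plus the one exec() adds itself;
    # these are bookkeeping and are not copied over.
    preset = set(scratch.__dict__) | {"__builtins__"}
    filename = getattr(module, "__file__", None) or "<swap_reload>"
    exec(compile(source, filename, "exec"), scratch.__dict__)

    # Move the results across in a single update call. As with a
    # normal reload, names that only exist in the old module stay put.
    new_values = {k: v for k, v in scratch.__dict__.items()
                  if k not in preset}
    module.__dict__.update(new_values)
    return module
```

One caveat with this sketch: functions defined by the new source keep
the scratch dictionary as their globals, so a later assignment to an
attribute of the live module is not seen by them. A normal reload does
not have that problem, since it executes directly in the module's own
dictionary.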

On Jun 17, 2005, at 3:09 PM, Dan Eloff wrote:

> Well this is the biggest problem in my otherwise threadsafe code. From
> what I've read, importing/reloading modules in a threaded environment
> is just plain dangerous. There could be any number of requests
> executing code from and accessing global variables in a module when it
> changes and gets reloaded. Then you have the two level problem of what
> happens to the executing requests, as well as what happens when
> multiple incoming requests all spot that the module has changed, and
> reload it multiple times in quick succession, re-executing code in the
> module scope (this is pretty much worst case)
>
> I know Graham knows a lot more about this than I do (help!).
>
> I do have a lot of flexibility here, since I'm using my own publisher
> I do not have to use apache.import_module. In fact I would prefer to
> not have the whole importing/reloading business affect my code outside
> of the publisher, that way I can change things internally at will.
> Clearly one option is to create a simple import function that imports
> once only in a production environment, and every time in a development
> env, but I'm still interested in exploring the options.
>
> -Dan