[mod_python] Update python modules without restarting Apache

Thu Oct 14 22:10:05 EDT 2004

On 14/10/2004, at 7:32 PM, Nicolas Lehuen wrote:
>> FWIW, if using execfile() I have since found that it is
>> probably better to use:
>>
>>   import imp
>>   module = imp.new_module("z")
>>   module.__file__ = "z.py"
>>   execfile("z.py",module.__dict__)
>>
>> That way, executing type() on the module gives <type
>> 'module'> and thus things one expects to work on modules
>> will. The imp.new_module() method does not result in the
>> module being listed in sys.modules.
>
> Ok, thanks for the tip, Graham.
>
> ...
>
> So I guess those two functions do exactly the same thing. I'll modify my
> ModuleCache to use imp.new_module instead of a fake Module object.

Another thing I have done is to implement the cloning mechanism for data
that I mentioned in my prior emails, i.e.:

         module = imp.new_module(label)
         module.__file__ = file
         if hasattr(cache.module,"__clone__") and \
             callable(cache.module.__clone__):
           cache.module.__clone__(module)
         execfile(file,module.__dict__)
         cache.module = module

Thus, although the new code is loaded into a fresh module, there is the
option of copying over selected data from the existing module. This is
done by calling a __clone__() function, if present in the existing
module, to populate the fresh module before the code is loaded into it.
Because it is a function, any necessary code can be put in it, including
code which acquires locks while copying the data.

For example, a content handler might contain:

   from threading import Lock

   # This function is called prior to execfile() being run, but
   # only when the module had previously been loaded.

   def __clone__(module):
     _lock.acquire()
     module._lock = _lock
     module._data = _data
     _data["reloads"] = _data["reloads"] + 1
     _lock.release()

   # This global scope code is executed when execfile() is run.
   # Check that the data doesn't already exist, as we don't want
   # to overwrite it if it does. It will already exist when it
   # has been copied from the existing module.

   if "_lock" not in globals():
     _lock = Lock()
     _data = { "reloads" : 0 }

   # Data which should always be reset on fresh import
   # would still follow here.

   _cache = {}

Using execfile() on a fresh module and using __clone__() in this way
solves some of the problems I mentioned in my prior emails.
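
To make the interaction between the loader fragment and the handler
concrete, here is a minimal sketch that runs both together. It is not
the actual cache code: CacheEntry, load_module() and HANDLER_SOURCE are
names made up for the illustration, and because it is written for
Python 3, types.ModuleType() and exec(compile(...)) stand in for
imp.new_module() and execfile(). The lock is left out to keep it short.

```python
import os
import tempfile
import types


class CacheEntry:
    """Hypothetical cache slot holding the most recently loaded module."""

    def __init__(self):
        self.module = None


def load_module(cache, label, path):
    """Run the file at `path` in a fresh module, seeded by the old one."""
    module = types.ModuleType(label)  # stand-in for imp.new_module()
    module.__file__ = path
    # Let the previously loaded module, if any, copy data across first.
    clone = getattr(cache.module, "__clone__", None)
    if callable(clone):
        clone(module)
    with open(path) as handle:
        source = handle.read()
    # Stand-in for execfile(path, module.__dict__).
    exec(compile(source, path, "exec"), module.__dict__)
    cache.module = module
    return module


# Hypothetical handler, modelled on the example above but without locking.
HANDLER_SOURCE = '''
def __clone__(module):
    module._data = _data
    _data["reloads"] += 1

if "_data" not in globals():
    _data = {"reloads": 0}

_cache = {}
'''

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "handler.py")
    with open(path, "w") as handle:
        handle.write(HANDLER_SOURCE)
    cache = CacheEntry()
    first = load_module(cache, "handler", path)
    first._cache["page"] = "stale"
    second = load_module(cache, "handler", path)
```

After the second load, `second._data` is the same dictionary as
`first._data` with the reload count bumped, whereas `second._cache` is a
new empty dictionary and the old module still holds its own.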

First, because a fresh module is used, you don't get problems with code
being modified while another thread may be executing in the context of
the existing module, which would otherwise have been overwritten.

Second, you can selectively copy over only the data which should be
present in the context of the reloaded code, i.e., you could throw away
certain types of cached data. If the modified code has removed
functions, they will not show up in the newly loaded module.

Finally, the use of a function, rather than an export list of data to
copy across, means that locks can be acquired while data is being copied
to prevent another thread accessing it at the same time.

The intent with copying locks and data is that they are shared between
the old and new modules, at least for the period that any existing
thread may be executing in the context of the old module. Because the
old module will no longer be referenced from the cache, once the thread
executing in the context of the old module finishes, the old module
will disappear and the data will be exclusive to the new module.

This does mean, though, that any data should really be dictionaries or
class instances, i.e., things that can be shared properly, where a
change made through either reference is reflected in the other. E.g.:

   >>> a={"a":1}
   >>> b=a
   >>> a
   {'a': 1}
   >>> b
   {'a': 1}
   >>> b["c"]=2
   >>> a
   {'a': 1, 'c': 2}
   >>> b
   {'a': 1, 'c': 2}

This isn't going to work if the data were simple integers or strings,
because once a copy is made, a change from one module by way of
assignment isn't going to be reflected in the other.
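
The difference can be seen with two stand-in modules. This is only a
sketch: the names old, new, counter and stats are made up, and
types.ModuleType() is used in place of real loaded modules.

```python
import types

old = types.ModuleType("old")
old.counter = 0            # immutable value: copying gives an independent name
old.stats = {"hits": 0}    # mutable container: copying shares the object

new = types.ModuleType("new")
new.counter = old.counter
new.stats = old.stats

old.counter = old.counter + 1  # rebinds the name in the old module only
old.stats["hits"] += 1         # mutates the dictionary both modules refer to
```

Here new.counter stays at its copied value while new.stats sees the
update, so a counter that must survive reloads belongs inside a shared
dictionary or instance rather than in a module-level integer.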

You could also feasibly implement fixup code in a new module which would
be executed upon the reload to convert data in an old form to a new
form. The danger is that you don't know whether a reload will actually
occur for one fixup before you later change the module again, so the
subsequent fixup may expect data in the intermediate format and find the
original one instead. If the format of the data is going to change
drastically, you should really stop Apache, load in your new code and
restart. Otherwise, you would have to use a variable which tracks what
version the data format is in, and the fixup code would have to cope
with the various older versions of data when converting to the new
format.
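
One possible shape for such version-tracking fixup code is sketched
below. The formats are entirely hypothetical: version 1 is assumed to
have a "reloads" key, version 2 renames it to "reload_count", and
version 3 adds a "hits" dictionary. Data with no "version" key is
treated as version 1.

```python
def upgrade(data):
    """Bring `data` up to the latest (hypothetical) format, step by step."""
    version = data.get("version", 1)
    if version < 2:
        # Hypothetical v1 -> v2 change: "reloads" renamed to "reload_count".
        data["reload_count"] = data.pop("reloads", 0)
        version = 2
    if version < 3:
        # Hypothetical v2 -> v3 change: per-handler hit counts added.
        data.setdefault("hits", {})
        version = 3
    data["version"] = version
    return data


from_v1 = upgrade({"reloads": 4})
from_v2 = upgrade({"version": 2, "reload_count": 7})
```

Because each step only assumes the format immediately before it, it no
longer matters whether a reload happened for every intermediate change.
The dictionary is updated in place, so an old module still holding a
reference to it sees the new layout too, which is one more reason to
restart Apache for drastic changes.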

Anyway, interesting stuff to play with.

--
Graham Dumpleton (grahamd at dscpl.com.au)