[mod_python] Session Pickling Error II - 3.2.2b

Graham Dumpleton grahamd at dscpl.com.au
Tue Oct 4 19:12:45 EDT 2005


Terence MacDonald wrote ..
> 
> On Wed, 2005-10-05 at 07:28 +1000, Graham Dumpleton wrote:
> > The only way I can see there possibly being a problem is if the
> > apache.import_module() method were being used to import
> > pyPgSQL module and it was getting reloaded. I would expect
> > though that pyPgSQL is a standard module in site-packages
> > though so would be imported using "import". Yes/No?
> > 
> the PyPgSQL module PgSQL.py is imported using the usual "import", this
> import resides in my database module which itself is imported, by my
> processing modules, using apache.import_module(). I have PythonOption
> AutoReload set to 'on'
> 
> I have stopped using apache.import_module inline to import my database
> module when required and resorted to 'top of the file' import instead.
> The problem appears (fingers crossed!) to have gone away.
> 
> I am still not sure what the problem or the solution is/was, except that
> it is related to import_module, pickling session data that has object
> instances and a pickling helper function (in this case a function named
> _B that aids in pickling PgBoolean instances and is declared in the
> init.py file of the PyPgSQL package) not being at the same address....
> If that makes sense

In short, in mod_python 3.1.4 and earlier, pickling of anything which
depends on a function object which resides in a module imported using
apache.import_module() will not be reliable. With the changes in 3.2, it
simply will not work.

Let's start with the following tests:

# >>> import pickle
# >>> def a(): pass
# ... 
# >>> pickle.dumps(a)
# 'c__main__\na\np0\n.'
# >>> z = a
# >>> pickle.dumps(z)
# 'c__main__\na\np0\n.'

As you can see, it is possible to pickle the function object. This can
be done even through a second reference to the function object,
although in that case the pickled data still names the original
function, "__main__.a". A function is pickled as its module and name,
not as its code.

Now let's delete the original function object and pickle the copy again.

# >>> del a
# >>> pickle.dumps(z)
# Traceback (most recent call last):
# ....... deleted traceback
# pickle.PicklingError: Can't pickle <function a at 0x612b0>: it's not found as __main__.a

Because the name "a" was deleted from the module where the function was
created, even the copy can no longer be pickled.

Now let's recreate the original function object.

# >>> def a(): pass
# ... 
# >>> pickle.dumps(z)
# Traceback (most recent call last):
# ....... deleted traceback
# pickle.PicklingError: Can't pickle <function a at 0x612b0>: it's not the same object as __main__.a

Notice how the exception message has changed. It recognises that "a"
exists but realises that it is a different function object from the one
from which the "z" copy was originally made.

Where problems can start occurring in mod_python 3.1.4 and earlier is if
the function object is cached in some data object which is held outside
of the module the function object was defined in. If the module holding
the original function object is then reloaded by the automatic module
reloading mechanism implemented by the apache.import_module() function,
an attempt to pickle the data object which had cached the function
object will fail. This is because the original function object will
have been overwritten in the module by a new one when the module was
reloaded.
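The reload case can be reproduced outside mod_python. The following is
only a sketch: it uses the standard importlib.reload() as a stand-in for
the reloading done by apache.import_module(), and the module name
"reload_demo_mod" and the "cached" dictionary are made up for the
illustration.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Write a throwaway module (hypothetical name) that defines one function.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "reload_demo_mod.py"), "w") as handle:
    handle.write("def helper():\n    pass\n")

sys.path.insert(0, module_dir)
importlib.invalidate_caches()
reload_demo_mod = importlib.import_module("reload_demo_mod")

# Cache a reference to the function outside of its defining module.
cached = {"callback": reload_demo_mod.helper}
pickle.dumps(cached)  # works: the cached object is still reload_demo_mod.helper

# Reloading re-executes the module, so "helper" is rebound to a new object.
importlib.reload(reload_demo_mod)

try:
    pickle.dumps(cached)
    reload_failed = False
except pickle.PicklingError as exc:
    reload_failed = True
    print("pickling failed after reload:", exc)
```

The second pickle.dumps() call fails for the same reason as the
"not the same object" case in the interpreter session: the name still
exists in the module, but it now refers to a different function object.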

This sort of problem will not occur for an instance of a class, but it
will occur for the class object itself.

# >>> class B: pass
# ... 
# >>> b=B()
# >>> pickle.dumps(b)
# '(i__main__\nB\np0\n(dp1\nb.'
# >>> del B
# >>> pickle.dumps(b)
# '(i__main__\nB\np0\n(dp1\nb.'

# >>> class B: pass
# ... 
# >>> pickle.dumps(B)
# 'c__main__\nB\np0\n.'
# >>> C = B
# >>> pickle.dumps(C)
# 'c__main__\nB\np0\n.'
# >>> del B
# >>> pickle.dumps(C)
# Traceback (most recent call last):
# ........ deleted traceback
# pickle.PicklingError: Can't pickle <class __main__.B at 0x53ab0>: it's not found as __main__.B
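The session above uses a classic Python 2 class, where an instance is
pickled by module and class name without any check. As a hedged aside,
under Python 3 every class is new-style and pickling an instance also
looks up its class, so there the instance is affected as well. A minimal
sketch, using a made-up module name "instance_demo_mod" created in
memory:

```python
import pickle
import sys
import types

# Build a small module in memory (hypothetical name) holding class B.
demo = types.ModuleType("instance_demo_mod")
sys.modules["instance_demo_mod"] = demo
exec("class B:\n    pass\n", demo.__dict__)

b = demo.B()
pickle.dumps(b)  # works while instance_demo_mod.B is still b's class

# Remove the class from its module, as "del B" does in the session above.
del demo.B

try:
    pickle.dumps(b)
    instance_failed = False
except pickle.PicklingError as exc:
    instance_failed = True
    print("pickling the instance failed:", exc)
```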

Thus, in practice, even in mod_python 3.1.4 and earlier I would not
recommend trying to pickle function objects or class object types,
unless you are absolutely guaranteed that the module that the original
function object or class object type resides in is only imported using
"import" and is never reloaded in any way.

If you wanted to ensure that no strange problems were going to occur, I
would possibly go as far as suggesting that only basic Python types,
i.e., scalars, tuples, lists and dictionaries, be pickled along with
Session objects.
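One way to enforce that rule is to check a value before it is stored in
the session. The helper below is a sketch only: is_plain_data() is a
made-up name, not part of mod_python, and the set of types treated as
"basic" is an assumption.

```python
# Types treated as scalars here; the exact set is an assumption.
_SCALARS = (type(None), bool, int, float, complex, str, bytes)


def is_plain_data(value):
    """Return True if value is built only from scalars, tuples, lists and dicts.

    Exact type checks are used so that instances of subclasses defined in
    reloadable modules are rejected. Self-referential containers are not
    handled and would recurse without end.
    """
    kind = type(value)
    if kind in _SCALARS:
        return True
    if kind in (tuple, list):
        return all(is_plain_data(item) for item in value)
    if kind is dict:
        return all(
            is_plain_data(key) and is_plain_data(item)
            for key, item in value.items()
        )
    return False


print(is_plain_data({"user": "terence", "ids": [1, 2, (3, 4)]}))
print(is_plain_data({"callback": len}))
```

A handler could call such a check on each value before assigning it to
the session and convert anything else, such as a database wrapper type,
to a plain value first.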

One other obscure area that would worry me in respect of pickling and
mod_python is that different parts of a web site can have different
PythonPath settings. The issue here is that when unpickling certain
objects, such as function objects and class object types, the original
module containing the function or class must be importable within the
context in which the unpickling is occurring. This is so that if it
hasn't already been imported it can be automatically imported.

What happens, though, when the pickling occurs in one part of the
namespace of a web site and the unpickling occurs in another where
PythonPath is set differently, and the required module hasn't already
been imported and doesn't appear in any directory specified by
PythonPath? In that case the unpickling will fail. Also, because
mod_python 3.1.4 can overlay same-named modules over the top of each
other, the wrong module may be imported, and the mixing of the "import"
and "apache.import_module()" mechanisms complicates matters further.
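The effect of a differing module search path can be imitated in a single
process. This is a sketch under stated assumptions: it changes sys.path
directly instead of using PythonPath, and the module name
"path_demo_mod" and the class "Record" are made up.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Throwaway module (hypothetical name) defining a class to pickle.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "path_demo_mod.py"), "w") as handle:
    handle.write("class Record:\n    pass\n")

# First context: the directory is on the path, so pickling works.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
path_demo_mod = importlib.import_module("path_demo_mod")
payload = pickle.dumps(path_demo_mod.Record())

# Second context: module not yet imported, directory not on the path.
sys.path.remove(module_dir)
del sys.modules["path_demo_mod"]

try:
    pickle.loads(payload)
    unpickle_failed = False
except ImportError as exc:
    unpickle_failed = True
    print("unpickling failed:", exc)
```

The pickled data only records the module and class name, so the load
has to import "path_demo_mod" itself and cannot find it.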

I may be overly paranoid, but that is what defensive programming is all
about if you really want to ensure you are building a robust system
where you avoid any hint of trouble. :-)

Anyway, hope this might help in some way to explain your problems.

Graham





