[mod_python] PythonImport question

G. Sumner Hayes SumnerH at fool.com
Mon May 14 18:33:44 EST 2001


Brief version:
I have a patch to make PythonImport take place during server init
instead of child init--is there any chance it would be accepted?

Long version:
So the PythonImport documentation says:

: Tells the server to import the Python module module at process
: startup. This is useful for initialization tasks that could be
: time consuming and should not be done at the request processing
: time, e.g. initializing a database connection.
:
: The import takes place at child process initialization, so the
: module will actually be imported once for every child process
: spawned.

Having the import take place at every child process
initialization seems to run counter to the purpose of the
directive (doing time-consuming server startup work).  It
also means different children can end up with different
versions of a PythonImport'd module if the module(s) imported
that way are edited in place.

I'm wondering if anyone actually wants that behavior; it seems
like it'd be better to do the PythonImport exactly once at server
startup.
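To illustrate why a single import at server startup can be enough,
here is a minimal sketch in plain Python (POSIX only, and not
mod_python or apache code).  It runs a stand-in initialization once
in a parent process, fork()s, and checks that the child already sees
the result.  The names STATE, expensive_init and
child_inherits_parent_init are made up for this example.

```python
import os

# Hypothetical stand-in for whatever a PythonImport'd module sets up.
STATE = {}


def expensive_init():
    """Pretend to do slow startup work; record which process did it."""
    STATE["init_pid"] = os.getpid()


def child_inherits_parent_init():
    """Run init once in the parent, fork, and check the child sees the result."""
    expensive_init()
    parent_pid = os.getpid()
    pid = os.fork()
    if pid == 0:
        # Child: STATE was copied at fork time, so no second init is needed.
        inherited = STATE.get("init_pid") == parent_pid
        os._exit(0 if inherited else 1)
    # Parent: read the child's verdict from its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

Calling child_inherits_parent_init() returns True when the child
found the parent's initialization already in place.  Whether a
particular resource (say, an open database connection) can safely
be shared across fork() is a separate question this sketch does not
address.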

Rationale 1:
Suppose I have a ton of expensive startup work to do in my
PythonImport lines.  Furthermore, I need to be able to do
apache graceful restarts even while under heavy load.  Here's
a sketch of how the apache graceful restarts work--I'll pretend
there are 3 different types of apache process, a "master control
program" or MCP which coordinates startup, a "pool master" or
PM which spawns children as needed, and the child processes
or CPs.

1.  A user sends MCP a SIGUSR1 signal indicating a graceful
restart should take place.  The old PM and CPs continue
handling incoming requests.
2.  MCP starts up a new PM, which reads all startup files and
processes them; this includes running the server init functions
for apache modules (in particular mod_python).
3.  New PM finishes reading the startup files and fork()s off new
CPs (as per the MinSpareServers and related apache directives).
After forking new CPs, the new PM tells the MCP that it is ready
to go.
4.  MCP tells old PM to shut down and routes all new incoming
requests to the new PM.
5.  Old PM tells old CPs to exit; if they're idle, they exit
immediately.  If they're serving a request, they finish doing
so and then exit.

Pretty elegant; you can change the entire apache config on the
fly without dropping any connections.  You can even update shared
objects this way without dropping connections; you can e.g.
migrate from python 1.5.2 to Stackless Python 2.0 (potentially
with a new version of mod_python) seamlessly.

_However_, step (3) gets us in trouble if PythonImports happen
in the child process--the old PM will be exiting while the new
CPs are still busy doing whatever (potentially many-second or
even minute-long) work the PythonImports call for, so the new
CPs aren't yet ready to serve the requests being routed to them.

If PythonImports took place in the server init, they'd get done
before the new PM/CPs took over; the old ones would continue
serving requests until the new ones were ready.
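The difference can be put in numbers with a toy timing model of
steps 3-5 above.  This is only a sketch under stated assumptions:
time is measured from the moment the new PM has finished reading
its config, the handoff itself is treated as instantaneous, and all
new CPs run their imports in parallel.  The function name
restart_timeline and the 30-second figure are hypothetical.

```python
def restart_timeline(import_seconds, imports_in_server_init):
    """Model when the handoff happens and when the new children can serve.

    Returns times in seconds, plus the gap during which requests are
    routed to the new PM but its children are still importing.
    """
    if imports_in_server_init:
        # New PM does the imports itself, then forks and reports ready.
        handoff_at = float(import_seconds)
        # Children inherit the imported modules, so they are ready at once.
        children_ready_at = float(import_seconds)
    else:
        # New PM forks and reports ready right away ...
        handoff_at = 0.0
        # ... but each new child still has to run the imports.
        children_ready_at = float(import_seconds)
    return {
        "handoff_at": handoff_at,
        "children_ready_at": children_ready_at,
        "gap": children_ready_at - handoff_at,
    }


child_init = restart_timeline(30, imports_in_server_init=False)
server_init = restart_timeline(30, imports_in_server_init=True)
```

In this model the child-init case has a gap equal to the full import
time, while the server-init case has none: the handoff simply happens
later, and the old PM and CPs keep serving until then.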

(Yes, the setup involved will have high-availability measures
in place, but it'd be really nice to have this feature for a
number of reasons).

Rationale 2:
It's far more "apache-like" for configuration data to be read
exactly 1 time at startup and then re-read when explicitly asked
to do so (via a graceful restart).  As it stands now, if I'm
importing mystuff.py via PythonImport and I edit that file while
the server is running, some apache children (which started before
my edits) will have the old mystuff module and others (which
happened to start after my edits) will have the new one.  That's
confusing and potentially dangerous, and if I'm editing files
in place (groan, I know) it's full of nasty races.
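The mixed-version problem is easy to reproduce outside apache.  The
sketch below is plain Python, not mod_python: it stands in for two
children that start at different times by loading the same
mystuff.py from disk before and after an in-place edit.  The helper
names import_like_new_child and demo, and the VERSION variable, are
invented for the example.

```python
import os
import tempfile
import types


def import_like_new_child(path):
    """Load the file the way a freshly started process would see it:
    from whatever the file contains on disk right now."""
    module = types.ModuleType("mystuff")
    with open(path) as f:
        exec(compile(f.read(), path, "exec"), module.__dict__)
    return module


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mystuff.py")
        with open(path, "w") as f:
            f.write("VERSION = 'old'\n")
        # A child that started before the edit.
        early_child = import_like_new_child(path)
        # The file is edited in place while the server keeps running.
        with open(path, "w") as f:
            f.write("VERSION = 'new'\n")
        # A child that happened to start after the edit.
        late_child = import_like_new_child(path)
        return early_child.VERSION, late_child.VERSION
```

Here demo() returns one value per simulated child, and the two
differ even though both "children" belong to the same running
server.  Importing once at server init would give every child the
same copy until the next explicit restart.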

Let me know and I'll send a patch.

  Sumner

-- 
rage, rage against the dying of the light


