G. Sumner Hayes
SumnerH at fool.com
Mon May 14 18:33:44 EST 2001
Brief version: I have a patch to make PythonImport take place during
server init instead of child init--is there any chance it would be
accepted?

Long version:

So the PythonImport documentation says:

: Tells the server to import the Python module module at process
: startup. This is useful for initialization tasks that could be
: time consuming and should not be done at the request processing
: time, e.g. initializing a database connection.
:
: The import takes place at child process initialization, so the
: module will actually be imported once for every child process
: spawned.

Having the import take place at every child process initialization
seems to run counter to the purpose of the statement (doing
time-consuming server startup stuff). It also means the various
children can end up with different PythonImport'd contents if the
module(s) imported that way are edited in place.

I'm wondering if anyone actually wants that behavior; it seems like
it'd be better to do the PythonImport exactly once at server startup.

Rationale 1:

Suppose I have a ton of expensive startup work to do in my PythonImport
lines. Furthermore, I need to be able to do apache graceful restarts
even while under heavy load.

Here's a sketch of how apache graceful restarts work--I'll pretend
there are 3 different types of apache process: a "master control
program" or MCP which coordinates startup, a "pool master" or PM which
spawns children as needed, and the child processes or CPs.

1. A user sends MCP a SIGUSR1 signal indicating a graceful restart
   should take place. The old PM and CPs continue handling incoming
   requests.
2. MCP starts up a new PM, which reads all startup files and processes
   them; this includes running the server init functions for apache
   modules (in particular mod_python).
3. New PM finishes reading startup and fork()s off new CPs (as per the
   MinSpareServers and related apache directives). After forking new
   CPs, the new PM tells the MCP that it is ready to go.
4. MCP tells old PM to shut down and routes all new incoming requests
   to the new PM.
5. Old PM tells old CPs to exit; if they're idle, they exit
   immediately. If they're serving a request, they finish doing so and
   then exit.

Pretty elegant; you can change the entire apache config on the fly
without dropping any connections. You can even update shared objects
this way without dropping connections; you can e.g. migrate from
python 1.5.2 to Stackless Python 2.0 (potentially with a new version of
mod_python) seamlessly.

_However_, step (3) gets us in trouble if PythonImports happen in the
child process--the old PM will be exiting while the new CPs are still
busy doing whatever (potentially many-second or even minute-long) work
the PythonImports say to do, so nothing is ready to serve requests in
the meantime. If PythonImports took place in the server init, they'd
get done before the new PM/CPs took over; the old ones would continue
serving requests until the new ones were ready.

(Yes, the setup involved will have high-availability measures in
place, but it'd be really nice to have this feature for a number of
reasons.)

Rationale 2:

It's far more "apache-like" for configuration data to be read exactly
once at startup and then re-read only when explicitly asked to do so
(via a graceful restart). As it stands now, if I'm importing mystuff.py
via PythonImport and I edit that file while the server is running, some
apache children (which started before my edits) will have the old
mystuff module and others (which happened to start after my edits) will
have the new one. That's confusing and potentially dangerous, and if
I'm editing files in place (groan, I know) it's full of nasty races.

Let me know and I'll send a patch.

Sumner
--
rage, rage against the dying of the light
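
The difference between importing at server init and at child init can
be illustrated with a minimal Python sketch built on os.fork(). This is
not mod_python or apache code and it is not the patch mentioned above;
expensive_init(), spawn_children() and the two wrapper functions are
hypothetical names made up for the illustration, and the children are
forked one at a time purely to keep the example deterministic. It
assumes a Unix-like system where os.fork() is available.

```python
import os
import time


def expensive_init():
    """Hypothetical stand-in for a slow PythonImport'd module."""
    time.sleep(0.05)  # pretend this is seconds or minutes of work
    return {"initialized_in_pid": os.getpid()}


def spawn_children(n, state=None):
    """Fork n children and return the pid in which each one's state was built.

    state is None  -> every child runs expensive_init() itself
                      (child-init behaviour).
    state is given -> children inherit the parent's already-built state
                      through fork() (server-init behaviour).
    """
    results = []
    for _ in range(n):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child process: report where the state came from, then exit.
            os.close(read_fd)
            child_state = state if state is not None else expensive_init()
            os.write(write_fd, str(child_state["initialized_in_pid"]).encode())
            os.close(write_fd)
            os._exit(0)
        # Parent process: read the child's report and reap the child.
        os.close(write_fd)
        with os.fdopen(read_fd) as pipe:
            results.append(int(pipe.read()))
        os.waitpid(pid, 0)
    return results


def init_pids_child_init(n):
    """Each child pays the startup cost on its own."""
    return spawn_children(n)


def init_pids_server_init(n):
    """The parent pays the startup cost once, before any fork()."""
    return spawn_children(n, expensive_init())
```

With init_pids_server_init(), every child reports the parent's pid,
since the work was done once before forking and all children share the
same snapshot. With init_pids_child_init(), each child reports its own
pid, meaning the slow work was repeated per child and could pick up
different file contents each time.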