Mod_python is an Apache server module that embeds the Python interpreter within the server and provides an interface to Apache server internals as well as a basic framework for simple application development in this environment. The advantages of mod_python are versatility and speed.
This paper describes mod_python with the focus on the implementation, its philosophy and challenges.
It is intended for an audience already familiar with web application development in general and Apache in particular, as well as preferably mod_python itself. Knowledge of C and some understanding of Python internals is helpful as well.
Quite simply, mod_python is the integration of Python and Apache. Apache is a sort of Swiss Army knife of web serving, especially the upcoming 2.0 version, which does not limit itself to HTTP but can serve any protocol for which a module exists. Mod_python aims to give Python developers direct access to the riches of this functionality.
While speed is definitely a key benefit of mod_python and is taken very seriously during design decisions, it would be wrong to identify it as the sole reason for mod_python's existence.
At least for now, providing "inline Python" functionality à la PHP is not a goal of this project. This is because the integration with Apache can still use a lot of improvement, and there does not seem to be a clear consensus within the Python community on how to embed Python code in HTML, with quite a few modules floating around, each doing it its own way.
Mod_python was initially released in April 2000 as a replacement for an earlier project called Httpdapy (1998), which in turn was a port to Apache of Nsapy (1997). Nsapy was based on an embedding example by Aaron Watters in the Internet Programming with Python book.
Mod_python is stable enough to be used in production. The latest stable version at the time of this writing is 2.7.6. This version is written for version 1.3 of the Apache server. All of the development effort these days is focused on the next major version of mod_python, 3.0, which will support the upcoming Apache 2.0.
Mod_python consists of two components: an Apache dynamically loadable module, mod_python.so (this module can also be statically linked into Apache), and a Python package, mod_python.
Assuming that mod_python is loaded into Apache, consider this configuration excerpt:
    DocumentRoot /foo/bar
    <Directory /foo/bar>
        AddHandler python-program .py
        PythonHandler hello
    </Directory>
The following script named hello.py resides in the /foo/bar directory:
    from mod_python import apache

    def handler(req):
        # send the HTTP headers, then the body
        req.send_http_header()
        req.write("hello %s" % req.remote_host)
        # tell Apache that this phase has been handled
        return apache.OK
A request to http://yourdomain/somefile.py would result in a page showing "hello 188.8.131.52", where 188.8.131.52 is the IP address of the client.
Just about every mod_python script begins with "from mod_python import apache". apache is a module inside the mod_python package that provides the interface to Apache constants (such as OK) and many useful functions. Note also the req argument, a Request object, which provides information about the current request and the connection, as well as an interface to more internal Apache functions: in this example, the send_http_header() method to send HTTP headers and the write() method to send data back to the client.
Apache processes incoming requests in phases. A phase is one of a series of small tasks that each need to take place to service a request. For example, there is a phase during which a URI is mapped to a file on disk, a phase during which authentication happens, a phase to generate the content, etc. Altogether, Apache 1.3 has 10 phases (11 if you consider clean-ups a phase).
The key architectural feature of the Apache server is that it can allow a module to process any phase of a request. This way a module can augment the server behavior in any way whatsoever. (module in this context does not refer to a Python module; an Apache module is usually a shared library or DLL that gets loaded at server startup, though modules can also be statically linked with the server).
Mod_python is an Apache module. What makes it different from most other Apache modules is that it does nothing by itself; instead, it makes it possible to do in Python what Apache modules written in C do. To put it another way, it delegates phase processing to user-written Python code.
This figure shows a diagram of Apache request processing.
Each Apache module can provide a handler function for any of the request processing phases. There are 4 types of return values possible for every handler.
DECLINED means the module declined to handle this phase; Apache moves to the next module in the module list.
OK means that this phase has been processed; Apache will move on to the next phase without giving any more modules an opportunity to handle this phase.
An error return (which is any HTTP error constant) will cause Apache to produce an error page and jump to the Logging phase.
A special value of DONE means the whole request has been serviced; Apache will jump to the Logging phase.
The DECLINED return is somewhat deceiving, because many modules actually perform some action and then return DECLINED to give other modules an opportunity to handle the phase. The example below illustrates how the DECLINED return can be used in a handler that inserts a silly reply header into every request:
    from mod_python import apache

    def fixup(req):
        # add the header, then let other modules handle the phase too
        req.headers_out["X-Grok-this"] = "Python-Psychobabble"
        return apache.DECLINED
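The return-value rules described above can be modelled in a few lines of plain Python. This is only a sketch of the semantics as this paper states them, not Apache's actual code: the constant values, the run_phase function and the two toy handlers are hypothetical names introduced for illustration.

```python
# Toy model of how one phase is offered to a list of module handlers.
# The constant values are illustrative; only their meaning matters here.
OK, DECLINED, DONE = "OK", "DECLINED", "DONE"
HTTP_FORBIDDEN = 403  # any HTTP error constant counts as an error return


def run_phase(handlers, req):
    """Offer the phase to each handler in turn until one stops declining."""
    for handler in handlers:
        result = handler(req)
        if result == DECLINED:
            continue  # give the next module a chance
        return result  # OK, DONE or an error ends this phase
    return DECLINED  # nobody handled the phase


def add_header(req):
    # Performs an action, then declines so that other modules still run.
    req["headers_out"]["X-Grok-this"] = "Python-Psychobabble"
    return DECLINED


def finish(req):
    return OK


request = {"headers_out": {}}
status = run_phase([add_header, finish], request)
```

Here add_header does its work and still lets finish run, which is exactly the "perform some action, then decline" pattern.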
At this point it should be a bit clearer how this functionality differs from the CGI environment. Comparing CGI with mod_python is not very meaningful, because the scope of CGI is much narrower. One difference is that CGI is intended exclusively for dynamic content generation, which is not a requirement for mod_python scripts. For example, consider a mod_python script that implements a custom logging mechanism for the entire server, which plays no role in content generation.
Apache request processing makes use of a few important C structures, access to which is available through mod_python.
request_rec - the Request Record
request_rec is probably the largest and most frequently encountered structure. It contains all the information associated with processing a request (about 50 members total).
Mod_python provides a wrapper around request_rec, the mp_request built-in type. The mp_request type is not meant to be used directly. Instead, each mod_python handler gets a reference to an instance of the Request class, a regular Python class which is a wrapper around mp_request (which is a wrapper around request_rec). This is so that mod_python users can attach their own attributes to the Request instance as a way to maintain state across different phases. The Request class provides methods for sending headers and writing data to the client.
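The reason for wrapping a built-in type in a regular Python class can be shown with a small stand-alone sketch. BuiltinRequest below is a hypothetical stand-in for the built-in type (it uses __slots__ so that, like a C type, it refuses new attributes); the field names and the user_state attribute are assumptions made for the example, not mod_python's actual implementation.

```python
class BuiltinRequest:
    """Stands in for the built-in type: fixed fields, no new attributes."""
    __slots__ = ("uri", "filename")

    def __init__(self, uri, filename):
        self.uri = uri
        self.filename = filename


class Request:
    """Regular Python class that wraps the built-in object."""

    def __init__(self, inner):
        self._inner = inner

    def __getattr__(self, name):
        # Called only when normal lookup fails: delegate to the wrapped object.
        return getattr(self._inner, name)


req = Request(BuiltinRequest("/somefile.py", "/foo/bar/somefile.py"))
req.user_state = {"authenticated": True}  # set by a handler in an early phase
```

A handler for a later phase that receives the same req can read req.user_state back, while req.uri still comes from the wrapped object.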
conn_rec - the Connection Record
conn_rec keeps all the information associated with the connection. It is a separate structure from request_rec because HTTP allows for multiple requests to be serviced over the same connection.
The connection record is accessible in mod_python through the mp_conn built-in type, a reference to which is always available via the connection member of the Request object (req.connection).
server_rec - the Server Record
server_rec keeps all the information associated with the virtual server, such as the server name, its IP, port number, etc. It is available via the server member of the Request object (req.server).
ap_table - Apache table
All key/value lists (for example RFC 822 headers) in Apache are stored in tables. A table is a construct very similar to a Python dictionary, except that both keys and values must be strings, key lookups are case insensitive and a table can have duplicate keys. Internally, Apache tables differ from Python dictionaries in that lookups do not use hashing, but rather a simple sequential search (although there was a proposal to use hashing in Apache 2.0).
Mod_python provides a wrapper for tables, an mp_table object, which acts very much like a Python dictionary. If there are duplicate keys, mp_table will return a list. To allow addition of duplicate keys, mp_table provides an add() method.
Here is some code to illustrate how mp_table works:
    from mod_python import apache

    def handler(req):
        t = apache.make_table()
        t["Set-Cookie"] = "Foo: bar;"
        # add() appends a second value instead of replacing the first
        t.add("Set-Cookie", "Bar: foo;")
        s = t["Set-Cookie"]  # s is ["Foo: bar;", "Bar: foo;"]
        return apache.DECLINED
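For readers without a mod_python installation at hand, the table semantics described above (string keys and values, case-insensitive lookup, duplicate keys, sequential search) can be approximated in plain Python. The Table class below is a hypothetical illustration written for this purpose; it is not how mod_python or Apache implement tables.

```python
class Table:
    """Toy stand-in for an Apache table: string keys and values,
    case-insensitive lookup, duplicate keys, sequential search."""

    def __init__(self):
        self._items = []  # list of (key, value) pairs, in insertion order

    def add(self, key, value):
        # Append without replacing, so duplicate keys are allowed.
        self._items.append((str(key), str(value)))

    def __setitem__(self, key, value):
        # Plain assignment replaces every existing entry for the key.
        self._items = [(k, v) for k, v in self._items
                       if k.lower() != key.lower()]
        self.add(key, value)

    def __getitem__(self, key):
        # Sequential scan over all pairs, no hashing.
        found = [v for k, v in self._items if k.lower() == key.lower()]
        if not found:
            raise KeyError(key)
        return found[0] if len(found) == 1 else found


t = Table()
t["Set-Cookie"] = "Foo: bar;"
t.add("set-cookie", "Bar: foo;")
t["Content-Type"] = "text/html"
```

A key with a single value yields a string and a duplicated key yields a list, mirroring the behaviour described for the wrapper.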
The Python C API has a function to initialize a sub-interpreter, Py_NewInterpreter(). Here is an excerpt from the Python/C API Reference Manual documenting this function:
Create a new sub-interpreter. This is an (almost) totally separate environment for the execution of Python code. In particular, the new interpreter has separate, independent versions of all imported modules, including the fundamental modules __builtin__, __main__ and sys. The table of loaded modules (sys.modules) and the module search path (sys.path) are also separate. The new environment has no sys.argv variable. It has new standard I/O stream file objects sys.stdin, sys.stdout and sys.stderr (however these refer to the same underlying FILE structures in the C library).
This valuable feature of Python is not available from within Python itself, so most Python users are not even aware of it. But it makes good sense to take advantage of this functionality for mod_python, where one Apache process can be responsible for any number of unrelated applications at the same time. By default, mod_python creates a subinterpreter for each virtual server, but this behavior can be altered.
When a subinterpreter is created, a reference to it is saved in a Python dictionary keyed by subinterpreter names, which are always strings. This dictionary is internal to mod_python.
During phase processing, prior to executing the user Python code, mod_python has to decide which interpreter to use. By default, the interpreter name will be the name of the virtual server, which is taken from the server record. If PythonInterpPerDirectory is On, then the name of the interpreter will be the directory being accessed (derived from req->filename), and with PythonInterpPerDirective On, it will be the directory where the Python*Handler directive currently in effect is specified (which can be some parent directory). The interpreter name can also be forced using the PythonInterpreter directive.
Once mod_python has a name for the interpreter, it checks the dictionary of subinterpreters for this name. If the name exists, mod_python switches to that subinterpreter; otherwise a new subinterpreter is created.
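The naming and lookup logic can be sketched in plain Python. This is a simplified model under stated assumptions: the real work happens in C, subinterpreters are represented here by ordinary dictionaries, the config argument is a hypothetical mapping of directive names to values, and only two of the directives mentioned above are modelled.

```python
import posixpath

interpreters = {}  # interpreter name -> interpreter (a plain dict here)


def interpreter_name(server_name, filename, config):
    """Pick an interpreter name following the rules above (simplified)."""
    if "PythonInterpreter" in config:
        return config["PythonInterpreter"]  # name forced explicitly
    if config.get("PythonInterpPerDirectory") == "On":
        return posixpath.dirname(filename) + "/"  # directory being accessed
    return server_name  # default: one interpreter per virtual server


def get_interpreter(name):
    """Return the interpreter saved under this name, creating it if needed."""
    if name not in interpreters:
        interpreters[name] = {"name": name, "modules": {}}
    return interpreters[name]


first = get_interpreter(interpreter_name("www.example.com", "/foo/bar/a.py", {}))
second = get_interpreter(interpreter_name("www.example.com", "/foo/baz/b.py", {}))
```

With the default settings both requests land in the same interpreter, because they share a virtual server name.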
After mod_python has been given control by Apache to process a phase of a request, it steps through the following actions. (This is a simplified list.)
Determine the interpreter to use by looking at directives currently in effect, possibly the server name and the directory.
Get/Create a subinterpreter.
Get/Create a CallBack object. The CallBack object is a Python object whose methods provide all the functionality implemented in Python.
Create an mp_request object. (For performance reasons, mp_conn and mp_server objects are created on demand, so if the user code never refers to them they are never created.)
Call CallBack.Dispatch(), passing it a reference to mp_request and the name of the phase being processed.
(From here on all the processing is done in Python rather than C)
Create a Request object, a wrapper around mp_request.
Adjust sys.path by prepending (if not already there) the directory being accessed.
Import (or if modification date is later than the last import, reload) the Python module specified in the configuration.
Locate the handler function/object inside the module.
Call the user function/object, passing it a reference to the Request object.
Return the return value to mod_python.
(At this point execution moves back from Python to C)
Mod_python returns the return value and control to Apache.
Memory management is always a challenge for long running processes. One has to be very careful to always remember to free all memory allocated during request processing, no matter what errors take place.
To combat this problem, Apache provides memory pools. The
Apache API has a rich set of functions for allocating memory,
manipulating strings, lists, etc., and each of these functions always
takes a pool pointer. For example, instead of allocating memory using
malloc() et al, Apache modules allocate memory using
ap_palloc() and passing it a pool pointer. All memory
allocated in such a way can then be freed at once by destroying the
pool. Apache creates several pools with varying lifetimes, and
modules can create their own pools as well. The pool probably used
the most is the request pool, which is created for every request and
is destroyed at the end of the request.
Unfortunately, the Python interpreter cannot use Apache pools. So for the most part, the mod_python programmer is at the mercy of Python's reference counting and garbage collection mechanism (or lack thereof). In most cases it works just fine. In those cases where you do see the Apache process growing, the simplest solution is to configure the server to recycle itself every few thousand requests using the MaxRequestsPerChild directive.
Apache provides APIs to execute cleanup functions just before a
pool is destroyed. A cleanup is registered by calling the
ap_register_cleanup() C function which takes three
arguments: a pool pointer, a function pointer, and a void pointer to
some arbitrary data. Just before the pool is destroyed, the function
will be called and passed the pointer as the only argument.
Mod_python uses cleanups internally to destroy its own wrapper objects. Cleanups are available to mod_python users via request.register_cleanup() and request.server.register_cleanup(). The former runs after every request, the latter runs when the server exits.
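The relationship between a pool and its cleanups can be illustrated with a small Python model. The Pool class below is hypothetical and only mimics the behaviour described above: a cleanup is a function plus one piece of data, and it runs when the pool is destroyed. Running the most recently registered cleanup first is an assumption of this sketch.

```python
class Pool:
    """Toy model of a pool: cleanups run when the pool is destroyed."""

    def __init__(self):
        self._cleanups = []

    def register_cleanup(self, function, data=None):
        # Remember the function and the single argument it will receive.
        self._cleanups.append((function, data))

    def destroy(self):
        # Run the most recently registered cleanup first, each exactly once.
        while self._cleanups:
            function, data = self._cleanups.pop()
            function(data)


log = []
request_pool = Pool()
request_pool.register_cleanup(log.append, "close database cursor")
request_pool.register_cleanup(log.append, "remove temporary file")
request_pool.destroy()  # in Apache this happens at the end of the request
```

The appeal of the design is that the code registering a cleanup does not need to know when, or on which error path, the request ends.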
As an astute reader probably noticed, mod_python (or rather
Apache) associates a handler with a directory (SetHandler) or
a file type (AddHandler), but not a specific file. In the
quick example in the beginning of this paper it really doesn't matter
what file is being accessed in the "/foo/bar" directory.
As long as it ends with .py, the same hello handler will be invoked, always yielding the same result. In fact the file referred to in the URI doesn't even need to exist.
A natural question would then be "Why can't I access multiple mod_python scripts in one directory?" (or "This isn't very useful!"). The answer is that mod_python expects there to be an intermediate layer between it and the application. This layer (handler) is up to the user's imagination, but a couple of functional handlers (standard handlers) are bundled with mod_python.
The cgihandler is for users who want to use their existing CGI code with mod_python. It sets up a fake CGI environment and runs the user program. A couple of interesting implementation challenges were encountered here.
At first, this handler set up the CGI environment through the
os.environ object. For whatever reason
(Python bug?) this frequent environment manipulation introduced a
memory leak (about a kilobyte per request), so as a quick hack,
os.environ was replaced with a regular dictionary
object. This works fine for the most part, but is a problem for
scripts that use environment as a way to communicate with
subsequently called programs, notably some database interfaces which
expect database server information in an environment variable.
Another problem was that since cgihandler uses import/reload to
run a module, "indirect" module imports by the "main"
module would become no-ops after the first hit. This became a problem
for users who expected the top level code in those indirectly
imported modules to be executed for every hit. To solve this problem,
cgihandler now examines the
sys.modules variable before
and after importing the user scripts, and in the end, deletes any
newly appeared modules from
sys.modules, causing those
modules to be imported again next time.
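The sys.modules technique described above is easy to reproduce in isolation. The following sketch assumes a hypothetical helper named run_fresh that takes a callable performing the import; it is an illustration of the idea, not the cgihandler source.

```python
import sys


def run_fresh(import_script):
    """Run import_script(), then forget every module it newly imported."""
    before = set(sys.modules)  # names loaded before the script ran
    try:
        return import_script()
    finally:
        # Anything that appeared during the run is dropped, so its
        # top-level code executes again on the next hit.
        for name in set(sys.modules) - before:
            sys.modules.pop(name, None)
```

Modules that were already loaded before the call are left alone, so only the script and its indirect imports pay the cost of being imported again.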
Last but not least, the CGI specification strongly recommends that the server set the current directory to the directory in which the script is located. There is no thread-safe way of changing the current directory, so the cgihandler uses a thread lock in multithreaded environments (e.g. Win32). The lock is held for as long as the script runs, essentially forcing the server to process one cgihandler request at a time.
Given all of the above problems, the cgihandler is not a recommended development environment, but is regarded as a stop gap measure for users who have a lot of legacy CGI code, and should be used with caution and only if really necessary.
The publisher handler is probably the best way to start writing web applications using mod_python. The functionality of the publisher handler was inspired by the ZPublisher, a component of Zope.
The idea is that a URI is mapped to some object inside a module, the "/" in the URI having the same meaning as a "." in Python. So http://somedomain/somedir/module/object/method would invoke the method named method of the object named object inside the module named module in the directory somedir, and the return value of the method would be sent to the client.
Here is a "hello world" example:
    def hello(req, who="nobody"):
        return "Hello, %s!" % who
If the file containing this code is called myapp.py, then the hello function can be accessed via http://somedomain/somedir/myapp/hello which should result in a page showing "Hello, nobody!", whereas http://somedomain/somedir/myapp/hello?who=John should result in "Hello, John!".
Note that the first argument is a Request object, which means all the advanced mod_python functionality is still available when using the publisher handler.
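The URI-to-object mapping at the heart of the publisher idea can be sketched in a few lines. Everything here is a hypothetical simplification: publish is not the real handler, form stands in for the parsed query string, a SimpleNamespace stands in for an imported module, and refusing names that start with an underscore is a rule chosen for this sketch.

```python
import types


def publish(req, root, path, form):
    """Walk 'module/object/method' from root, then call what was found."""
    target = root
    for part in path.strip("/").split("/"):
        if part.startswith("_"):
            raise LookupError(part)  # sketch's own rule: hide private names
        target = getattr(target, part)  # "/" in the URI acts like "."
    return target(req, **form)  # query fields become keyword arguments


def hello(req, who="nobody"):
    return "Hello, %s!" % who


myapp = types.SimpleNamespace(hello=hello)  # stands in for module myapp
somedir = types.SimpleNamespace(myapp=myapp)  # stands in for the directory
```

Calling publish(None, somedir, "/myapp/hello", {"who": "John"}) follows the same path as the second URL in the example above.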
Debugging mod_python applications can be difficult. Mod_python provides support for the Python debugger (pdb) via the PythonEnablePdb configuration directive, but its usability is limited because the debugger is an interactive tool that uses standard input and output and therefore can only be used when Apache is running in foreground mode (-X switch in Apache 1.3 or -DONE_PROCESS in 2.0).
Mod_python sends any traceback information to the server log, and with the PythonDebug directive set to On (default is Off), the traceback information is also sent to the client.
For programmers who like to use the print statement as a debugging tool, a similar effect can be had by raising a variable, optionally surrounded by "`" (back quotes), from any point in the code with the PythonDebug directive On. This will make the value of the variable appear in the browser and is as effective as print.
Mod_python is thread-safe and runs fine on Win32, where Apache is multithreaded.
One should be careful to make sure that any extension modules that an application uses are thread-safe as well. For example, many database access drivers on Windows are not thread safe, and some kind of a thread lock needs to be used to make sure no two threads try to run the driver code in parallel.
Interestingly, the Python interpreter itself isn't completely thread safe, and to run multiple threads it maintains a thread lock that is released every 10 Python bytecode instructions to let other threads run. The negative impact of that, if any, is most likely negligible.
Those familiar with mod_perl will notice that some functionality of mod_python is remarkably similar to mod_perl; for example, the names of the Apache configuration directives are exactly the same except that the word Perl is replaced with Python.
It would be wrong not to say that much of mod_python functionality, especially in the area of Apache configuration, was intentionally made functionally similar to mod_perl. Under the hood they have next to nothing in common, mainly because Perl and Python interpreters are quite different.
There were good reasons for the similarities though. First, there is no sense in reinventing the wheel: mod_perl has encountered and solved many problems just as applicable to mod_python. Second, since both projects had similar goals, except that the language of choice was different, it made sense to keep the outside look consistent, especially the Apache configuration. Oftentimes the person who has to deal with the Apache config is a system administrator, not a programmer, and consistency makes the system administrator's job easier.
In a web application environment, speed and low overhead are extremely important. Many people don't appreciate how important they are until their site gets featured on another high-volume site (the so-called "/. effect") and, instead of getting lots of hard-earned publicity, they get a bunch of frustrated web surfers trying to reach a site so overloaded that no one can access it.
Considering this angle, C always wins over Python. If the author of mod_python had more time, a much larger percentage of mod_python would be implemented in C. But given the length of time it takes to write quality C code, initially a decision was made to implement in C only those parts which cannot be done in Python.
SWIG was given some consideration as a tool to provide the mapping to Apache C structures (such as request_rec). The main advantages of SWIG are the speed and ease with which an interface to a C library can be created. There are a few problems with it, though: the resulting C code is not necessarily meant to be easy to read, and SWIG itself becomes yet another tool that is required for compilation in an already pretty complicated build environment. Altogether, for a long-term project like mod_python, where quality is more important than the timeline, SWIG does not seem to be the right choice.
As has been mentioned before, the main focus of development today
is compatibility with Apache 2.0. Apache 2.0 is architecturally quite
a bit different from its predecessor (1.3), so much so that it would
not be very easy or practical to try to write code that works with
both 1.3 and 2.0. It is possible, but the code becomes a tangle of
#ifdef statements because the majority of the API
functions have been renamed. So the next major version of mod_python
will support Apache 2.0 only.
Apache 2.0 is actually a combination of two software packages. One is the server itself, the other is the underlying library, the Apache Portable Runtime (APR). The APR is a general purpose library designed to provide functionality common in daemons of all kinds and to abstract the OS specifics (thus "Portable"). Future versions of mod_python will eventually provide an interface to a large part or perhaps all of the APR.
Another big improvement in 2.0 is the introduction of filters and connection handlers. The alpha version of mod_python 3.0 already supports filters. (A filter would be the right place to implement inline Python). A connection handler is a handler at a level below HTTP. Using a connection handler one could implement an entirely different protocol, e.g. FTP. At the time of this writing mod_python 3.0 alpha does not support connection handlers, but such support is in the plans.
Mod_python. http://www.modpython.org/
Apache HTTP Server. http://httpd.apache.org/
Httpdapy. http://www.ispol.com/home/grisha/httpdapy
Nsapy. http://www.ispol.com/home/grisha/nsapy
Aaron Watters, Guido van Rossum, James C. Ahlstrom, Internet Programming with Python, M&T Books, 1996.
Guido van Rossum, Fred L. Drake, Jr., Python/C API Reference Manual, PythonLabs. http://www.python.org/doc/current/api/
R. Fielding, UC Irvine, J. Gettys, J. Mogul, DEC, H. Frystyk, T. Berners-Lee, MIT/LCS, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2068, IETF, January 1997. http://www.ietf.org/rfc/rfc2068.txt
Crocker, D., "Standard for the Format of ARPA Internet Text Messages", STD 11, RFC 822, UDEL, August 1982. http://www.ietf.org/rfc/rfc822.txt
Zope. http://www.zope.org/
Mod_perl, Apache/Perl Integration. http://perl.apache.org/
Apache Portable Runtime. http://apr.apache.org/
Simplified Wrapper and Interface Generator. http://www.swig.org/
Ken A L Coar, The WWW Common Gateway Interface Version 1.1. http://cgi-spec.golux.com/draft-coar-cgi-v11-03.txt
PHP. http://www.php.net/