When working with mod_python, it is important to be aware of a feature of Python that is not normally used when writing scripts to be run from the command line. The Python C API provides the ability to create subinterpreters. A more detailed description of a subinterpreter is given in the documentation for the Py_NewInterpreter function. For this discussion, it will suffice to say that each subinterpreter has its own separate namespace, not accessible from other subinterpreters.
At server start-up or mod_python initialization time, mod_python initializes the global interpreter. The global interpreter contains a dictionary of subinterpreters. Initially, this dictionary is empty. With every hit, as needed, subinterpreters are created, and references to them are stored in this dictionary. The key, also known as the interpreter name, is a string representing the path where the Handler directive was encountered, or where the actual file is (this depends on whether the PythonInterpPerDirectory directive is in effect).
Once created, a subinterpreter is reused for subsequent requests and is not destroyed until the Apache child process dies.
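The bookkeeping described above can be pictured with a small pure-Python model. It is only a sketch: real subinterpreters are created through the C API, and the names SubInterpreter, interpreters and get_interpreter, as well as the directory paths, are hypothetical and do not exist in mod_python.

```python
# A model of the interpreter dictionary, not mod_python code.
# SubInterpreter, interpreters and get_interpreter are hypothetical names.

class SubInterpreter:
    """Stand-in for a subinterpreter: just a private namespace."""

    def __init__(self, name):
        self.name = name
        self.namespace = {}  # not shared with any other subinterpreter


# Empty at start-up; filled lazily as requests arrive.
interpreters = {}


def get_interpreter(name):
    """Return the subinterpreter for `name`, creating it on first use."""
    if name not in interpreters:
        interpreters[name] = SubInterpreter(name)
    return interpreters[name]


# Two hits with the same interpreter name reuse one subinterpreter...
first = get_interpreter("/home/httpd/app/")      # hypothetical path
second = get_interpreter("/home/httpd/app/")
# ...while a different name gets its own, with a separate namespace.
other = get_interpreter("/home/httpd/other/")    # hypothetical path

first.namespace["counter"] = 1
```

In this model, a value stored in one namespace is invisible from the other, which is the property that matters for handlers that keep module-level state.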
A handler is a function that processes a particular stage of a request. Apache processes requests in stages: read the request, process headers, provide content, etc. For every stage, it will call handlers, provided by either the Apache core or one of its modules, such as mod_python, which passes control to functions provided by the user and written in Python. A handler written in Python is no different from a handler written in C, and follows these rules:
- A handler function will always be passed a reference to a request object.
- Every handler can return:
  - apache.OK, meaning this stage of the request was handled by this handler and no errors occurred.
  - apache.DECLINED, meaning this handler refused to handle this stage of the request and Apache needs to look for another handler.
  - apache.HTTP_ERROR, meaning an HTTP error occurred. HTTP_ERROR can be:
      HTTP_CONTINUE                      = 100
      HTTP_SWITCHING_PROTOCOLS           = 101
      HTTP_PROCESSING                    = 102
      HTTP_OK                            = 200
      HTTP_CREATED                       = 201
      HTTP_ACCEPTED                      = 202
      HTTP_NON_AUTHORITATIVE             = 203
      HTTP_NO_CONTENT                    = 204
      HTTP_RESET_CONTENT                 = 205
      HTTP_PARTIAL_CONTENT               = 206
      HTTP_MULTI_STATUS                  = 207
      HTTP_MULTIPLE_CHOICES              = 300
      HTTP_MOVED_PERMANENTLY             = 301
      HTTP_MOVED_TEMPORARILY             = 302
      HTTP_SEE_OTHER                     = 303
      HTTP_NOT_MODIFIED                  = 304
      HTTP_USE_PROXY                     = 305
      HTTP_TEMPORARY_REDIRECT            = 307
      HTTP_BAD_REQUEST                   = 400
      HTTP_UNAUTHORIZED                  = 401
      HTTP_PAYMENT_REQUIRED              = 402
      HTTP_FORBIDDEN                     = 403
      HTTP_NOT_FOUND                     = 404
      HTTP_METHOD_NOT_ALLOWED            = 405
      HTTP_NOT_ACCEPTABLE                = 406
      HTTP_PROXY_AUTHENTICATION_REQUIRED = 407
      HTTP_REQUEST_TIME_OUT              = 408
      HTTP_CONFLICT                      = 409
      HTTP_GONE                          = 410
      HTTP_LENGTH_REQUIRED               = 411
      HTTP_PRECONDITION_FAILED           = 412
      HTTP_REQUEST_ENTITY_TOO_LARGE      = 413
      HTTP_REQUEST_URI_TOO_LARGE         = 414
      HTTP_UNSUPPORTED_MEDIA_TYPE        = 415
      HTTP_RANGE_NOT_SATISFIABLE         = 416
      HTTP_EXPECTATION_FAILED            = 417
      HTTP_UNPROCESSABLE_ENTITY          = 422
      HTTP_LOCKED                        = 423
      HTTP_FAILED_DEPENDENCY             = 424
      HTTP_INTERNAL_SERVER_ERROR         = 500
      HTTP_NOT_IMPLEMENTED               = 501
      HTTP_BAD_GATEWAY                   = 502
      HTTP_SERVICE_UNAVAILABLE           = 503
      HTTP_GATEWAY_TIME_OUT              = 504
      HTTP_VERSION_NOT_SUPPORTED         = 505
      HTTP_VARIANT_ALSO_VARIES           = 506
      HTTP_INSUFFICIENT_STORAGE          = 507
      HTTP_NOT_EXTENDED                  = 510
- As an alternative to returning an HTTP error code, handlers can signal an error by raising the apache.SERVER_RETURN exception, and providing an HTTP error code as the exception value, e.g.
      raise apache.SERVER_RETURN, apache.HTTP_FORBIDDEN
- Handlers can send content to the client using the request.write() function. Before sending the body of the response, headers must be sent using the request.send_http_header() function.
- Client data, such as POST requests, can be read by using the request.read() function.
NOTE: The directory of the Apache Python*Handler in effect is prepended to the Python path. If the directive was specified in a server config file outside any <Directory>, then the directory is unknown and not prepended.
An example of a minimalistic handler might be:
    from mod_python import apache

    def requesthandler(req):
        req.content_type = "text/plain"
        req.send_http_header()
        req.write("Hello World!")
        return apache.OK
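A handler like this only runs inside Apache, so one way to follow its call sequence is to drive the same logic with stand-ins. In the sketch below, FakeApache and FakeRequest are hypothetical test doubles, not part of mod_python. The value 0 for OK follows Apache's own OK constant, and the error raised when the body is written before the headers is an assumption of the stand-in, not documented mod_python behavior.

```python
# Stand-ins so the handler logic can run outside Apache.
# FakeApache and FakeRequest are hypothetical; they are not part of mod_python.

class FakeApache:
    OK = 0  # Apache's OK status value


class FakeRequest:
    """Records what a handler does to the request."""

    def __init__(self):
        self.content_type = None
        self.headers_sent = False
        self.body = []

    def send_http_header(self):
        self.headers_sent = True

    def write(self, data):
        # Enforce "headers before body" in the stand-in; how mod_python
        # itself reacts to a premature write is not modelled here.
        if not self.headers_sent:
            raise RuntimeError("send_http_header() must be called first")
        self.body.append(data)


apache = FakeApache


def requesthandler(req):
    req.content_type = "text/plain"
    req.send_http_header()
    req.write("Hello World!")
    return apache.OK


req = FakeRequest()
status = requesthandler(req)
```

After the call, the stand-in request holds the content type, the header flag and the body text, which makes the order of operations easy to inspect.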
The Python Application Programmer Interface to Apache internals is contained in a module appropriately named apache, located inside the mod_python package. This module provides some important objects that map to Apache internal structures, as well as some useful functions, all documented below.
The apache module can only be imported by a script running under mod_python. This is because it depends on a built-in module _apache provided by mod_python. It is best imported like this:
    from mod_python import apache
Mod_python's apache module defines the following objects and functions. For a more in-depth look at Apache internals, see the Shambhala API Notes.
- log_error(message, [level=level], [server=server]): An interface to the Apache ap_log_error function. message is a string with the error message, level is one of the following constants:
      APLOG_EMERG, APLOG_ALERT, APLOG_CRIT, APLOG_ERR, APLOG_WARNING, APLOG_NOTICE, APLOG_INFO, APLOG_DEBUG, APLOG_NOERRNO
  server is a reference to a server object which is passed as a member of the request, request.server. If server is not specified, then the error will be logged to the default error log, otherwise it will be written to the error log for the appropriate virtual server.
- make_table(): Returns a new empty table object.
Table Object
The table object is a Python mapping to the Apache table. The table object performs just like a dictionary, with the only difference that key lookups are case insensitive.
Much of the information that Apache uses is stored in tables, for example request.headers_in and request.headers_out.
All the tables that mod_python provides inside the request object are actual mappings to the Apache structures, so changing the Python table also changes the underlying Apache table.
In addition to normal dictionary-like behavior, the table object also has an add(string key, string val) method. add() allows for creating duplicate keys, which is useful when multiple headers, such as Set-Cookie, are required.
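The two properties that set a table apart from a dictionary, case-insensitive lookup and duplicate keys through add(), can be modelled in plain Python. TableModel below is a hypothetical illustration and not the mod_python table type. In particular, returning the first value for a duplicated key is a choice made for this sketch, not documented behavior.

```python
# A pure-Python model of the behaviour described above; it is not the
# mod_python table type. TableModel is a hypothetical name.

class TableModel:
    def __init__(self):
        self._pairs = []  # (key, value) pairs, in insertion order

    def __setitem__(self, key, value):
        # Plain assignment replaces every existing entry for the key.
        lowered = key.lower()
        self._pairs = [(k, v) for k, v in self._pairs if k.lower() != lowered]
        self._pairs.append((key, value))

    def __getitem__(self, key):
        # Case-insensitive lookup. Returning the first match for a
        # duplicated key is an assumption of this sketch.
        lowered = key.lower()
        for k, v in self._pairs:
            if k.lower() == lowered:
                return v
        raise KeyError(key)

    def add(self, key, value):
        # Unlike assignment, add() keeps existing entries for the key.
        self._pairs.append((key, value))

    def items(self):
        return list(self._pairs)


headers_out = TableModel()
headers_out["Content-Type"] = "text/plain"
headers_out.add("Set-Cookie", "a=1")
headers_out.add("Set-Cookie", "b=2")
```

Looking up "content-type" or "CONTENT-TYPE" finds the same entry, and both Set-Cookie values survive because add() appends rather than replaces.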
Request Object
The request object is a Python mapping to the Apache request_rec structure.
When a handler is invoked, it is always passed a single argument: the request object. Here are the attributes of the request object:
Functions
- add_handler(string htype, string handler [, string dir]): Allows dynamic handler registration. htype is the name of any of the Apache Python*Handler directives, e.g. "PythonHandler". handler is the name of the module and the handler function. Optional dir is the name of the directory to be added to the Python path. If no directory is specified, then, if there is already a handler of the same type specified, its directory is inherited, otherwise the directory of the presently executing handler is used.
  A handler added this way only persists throughout the life of the request. It is possible to register more handlers while inside a handler of the same type. Be careful not to create an infinite loop this way.
  Dynamic handler registration is a useful technique that allows the code to decide what will happen next. A typical example might be a PythonAuthenHandler that assigns different PythonHandlers based on the authentication level, something like:
      if manager:
          req.add_handler("PythonHandler", "menu::admin")
      else:
          req.add_handler("PythonHandler", "menu::basic")
  Note: at this point there is no checking being done on the validity of the handler name. If you pass this function an invalid handler it will simply be ignored.
- add_common_vars(): Calls the Apache ap_add_common_vars function. After a call to this function, request.subprocess_env will contain a lot of CGI information.
- child_terminate(): Terminates a child process. This should terminate the current child process in a nice fashion.
- get_basic_auth_pw(): Returns a string containing the password when basic authentication is used.
- get_config(): Returns a reference to the table object containing the configuration in effect for this request. The table has directives as keys, and their values, if any, as values.
- get_dirs(): Returns a reference to the table object keyed by the directives currently in effect, whose values are the names of the directories where each directive was last encountered. For every key in the table returned by get_config(), there will be a key in this table. If the directive was in one of the server config files outside of any <Directory>, then the value will be an empty string.
- get_remote_host(int type = apache.REMOTE_NAME): Returns a string with the remote client's DNS name or IP, or None on failure. The first call to this function may entail a DNS lookup, but subsequent calls will use the cached result from the first call.
  The optional type argument can specify the following:
  - apache.REMOTE_HOST: Look up the DNS name. Fail if the Apache directive HostNameLookups is off or the hostname cannot be determined.
  - apache.REMOTE_NAME: (Default) Return the DNS name if possible, or the IP (as a string in dotted decimal notation) otherwise.
  - apache.REMOTE_NOLOOKUP: Don't perform a DNS lookup, return an IP. Note: if a lookup was performed prior to this call, then the cached host name is returned.
  - apache.REMOTE_DOUBLE_REV: Force a double-reverse lookup. On failure, return None.
- get_options(): Returns a reference to the table object containing the options set by the PythonOption directives.
- read(int len): Reads len bytes directly from the client, returning a string with the data read. When there is nothing more to read, None is returned. To find out how much there is to read, use the Content-length header sent by the client, for example:
      length = int(req.headers_in["content-length"])
      form_data = req.read(length)
  This function is affected by the Timeout Apache configuration directive. The read will be aborted and an IOError raised if the Timeout is reached while reading client data.
- register_cleanup(callable function, data=None): Registers a cleanup. The argument function can be any callable object; the optional argument data can be any object. At the very end of the request, just before the actual request record is destroyed by Apache, function will be called with one argument, data.
- send_http_header(): Starts the output from the request by sending the HTTP headers. This function has no effect when called more than once within the same request. Any manipulation of request.headers_out after this function has been called is pointless since the headers have already been sent to the client.
- write(string): Writes string directly to the client, then flushes the buffer.
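To show how read(), register_cleanup(), send_http_header() and write() fit together in one handler, the sketch below runs a handler against a stand-in request. FakeRequest and its finish() method are hypothetical and modelled only on the descriptions above. The stand-in uses a plain dictionary for headers_in, so it does not reproduce case-insensitive lookup, and the return value 0 stands in for apache.OK.

```python
# Stand-in request used only to illustrate the call order described above.
# FakeRequest is hypothetical; its behaviour is modelled on the text, not
# taken from mod_python itself.

class FakeRequest:
    def __init__(self, body):
        self.headers_in = {"content-length": str(len(body))}
        self._body = body
        self._cleanups = []
        self.sent = []

    def read(self, length):
        # Hand out up to `length` characters; None once nothing is left.
        if not self._body:
            return None
        chunk, self._body = self._body[:length], self._body[length:]
        return chunk

    def register_cleanup(self, function, data=None):
        self._cleanups.append((function, data))

    def send_http_header(self):
        pass  # nothing to send in the stand-in

    def write(self, string):
        self.sent.append(string)

    def finish(self):
        # Models the end of the request: each cleanup gets its data argument.
        for function, data in self._cleanups:
            function(data)


log = []


def handler(req):
    req.register_cleanup(log.append, "request finished")
    length = int(req.headers_in["content-length"])
    form_data = req.read(length)
    req.send_http_header()
    req.write("received %d bytes" % len(form_data))
    return 0  # stands in for apache.OK


req = FakeRequest("name=guido&lang=python")
status = handler(req)
req.finish()
```

The cleanup runs only when finish() is called, after the handler has returned, mirroring the statement that cleanups run at the very end of the request.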
Other Members
The request object contains most of the members of the underlying request_rec.
- connection (connection object, RO): A connection object associated with this request. See Connection Object below for details.
- server (server object, RO): A server object associated with this request. See Server Object below for details.
- next (request object, RO): If this is an internal redirect, the request object we redirect to.
- prev (request object, RO): If this is an internal redirect, the request object we redirect from.
- main (request object, RO): If this is a sub-request, pointer to the main request.
- the_request (string, RO): First line of the request.
- assbackwards (int, RO): Is this an HTTP/0.9 "simple" request?
- header_only (int, RO): HEAD request, as opposed to GET.
- protocol (string, RO): Protocol, as given by the client, or "HTTP/0.9".
- proto_num (int, RO): Number version of protocol; 1.1 = 1001.
- hostname (string, RO): Host, as set by full URI or Host:
- request_time (long, RO): When request started.
- status_line (string, RO): Status line. E.g. "200 OK".
- status (int, RW): An integer, whose value will be used in building the status line of the HTTP reply headers. Normally, there is no reason to change this. The correct way to provide status is to return the status code from the handler.
- method (string, RO): Method - GET, HEAD, POST, etc.
- method_number (int, RO): Method number.
- allowed (int, RO): A bitvector of the allowed methods. Used in relation with METHOD_NOT_ALLOWED.
- sent_body (int, RO): Byte count in stream is for body. (?)
- bytes_sent (long, RO): Bytes sent.
- mtime (long, RO): Time the resource was last modified.
- boundary (string, RO): Multipart/byteranges boundary.
- range (string, RO): The Range: header.
- clength (long, RO): The "real" content length. (I.e. can only be used after the content's been read?)
- remaining (long, RO): Bytes left to read. (Only makes sense inside a read operation.)
- read_length (long, RO): Bytes that have been read.
- read_body (int, RO): How the request body should be read. (?)
- read_chunked (int, RO): Read chunked transfer coding.
- headers_in: A table object containing the headers sent by the client.
- headers_out: A table object representing the headers to be sent to the client. Note that manipulating this table after request.send_http_header() has been called is meaningless, since the headers have already gone out to the client.
- err_headers_out (table): These headers get sent with the error response, instead of headers_out.
- subprocess_env (table): A table representing the subprocess environment. See also request.add_common_vars().
- notes (table): A place-holder for request-specific information to be used by modules.
- content_type (string, RW): A string representing the response content type.
- headers_out (table): Headers going out to the client.
- handler (string, RO): The name of the handler currently being processed. In all cases with mod_python, this should be "python-program".
- content_encoding (string, RO): Content encoding.
- vlist_validator (string, RO): Variant list validator (if negotiated).
- no_cache (int, RO): No cache.
- no_local_copy (int, RO): No local copy exists.
- unparsed_uri (string, RO): The URI without any parsing performed.
- uri (string, RO): The path portion of the URI.
- filename (string, RO): The file name being requested.
- path_info (string, RO): What follows after the file name.
- args (string, RO): QUERY_ARGS, if any.
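The proto_num member listed above gives 1.1 = 1001 as its only example. That value is consistent with the encoding 1000 * major + minor, which the two helpers below assume. Both helper names are hypothetical and are not part of mod_python.

```python
# Hypothetical helpers; they assume proto_num = 1000 * major + minor,
# which is consistent with the "1.1 = 1001" example above.

def proto_num_from_protocol(protocol):
    """Convert a string such as "HTTP/1.1" to its numeric form."""
    version = protocol.split("/", 1)[1]
    major, minor = version.split(".", 1)
    return 1000 * int(major) + int(minor)


def protocol_from_proto_num(proto_num):
    """Convert a numeric version back to a string such as "HTTP/1.1"."""
    major, minor = divmod(proto_num, 1000)
    return "HTTP/%d.%d" % (major, minor)
```

Under this assumption, numeric comparison orders protocol versions correctly, for example checking proto_num >= 1001 before relying on HTTP/1.1 features.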
Connection Object
The connection object is a Python mapping to the Apache conn_rec structure.
- server (server object, RO): A server object associated with this connection. See Server Object below for details.
- base_server (server object, RO): A server object for the physical vhost that this connection came in through.
- child_num (int, RO): The number of the child handling the request.
- local_addr (tuple, RO): The (address, port) tuple for the server.
- remote_addr (tuple, RO): The (address, port) tuple for the client.
- remote_ip (string, RO): The IP of the client.
- remote_host (string, RO): The DNS name of the remote client. None if DNS has not been checked, "" (empty string) if no name found.
- remote_logname (string, RO): Remote name if using RFC1413 (ident).
- user (string, RO): If an authentication check is made, this will hold the user name. NOTE: You must call get_basic_auth_pw() before using this value.
- ap_auth_type (string, RO): Authentication type. (None == basic?)
- keepalives (int, RO): The number of times this connection has been used. (?)
- local_ip (string, RO): The IP of the server.
- local_host (string, RO): The DNS name of the server.
Server Object
The server object is a Python mapping to the Apache server_rec structure. The server structure describes the server (possibly virtual server) serving the request.
- defn_name (string, RO): The name of the configuration file where the server definition was found.
- defn_line_number (int, RO): Line number in the config file where the server definition is found.
- srm_confname (string, RO): Location of the srm config file.
- server_admin (string, RO): Value of the ServerAdmin directive.
- server_hostname (string, RO): Value of the ServerName directive.
- port (int, RO): TCP/IP port number.
- error_fname (string, RO): The name of the error log file for this server, if any.
- loglevel (int, RO): Logging level.
- is_virtual (int, RO): 1 if this is a virtual server.
- timeout (int, RO): Timeout before we give up.
- keep_alive_timeout (int, RO): Keep-Alive timeout.
- keep_alive_max (int, RO): Maximum number of requests per Keep-Alive.
- keep_alive (int, RO): 1 if keep-alive is on.
- send_buffer_size (int, RO): Size of the TCP send buffer.
- path (string, RO): Path for ServerPath.
- pathlen (int, RO): Path length.
- server_uid (int, RO): UID under which the server is running.
- server_gid (int, RO): GID under which the server is running.
Debugging
Mod_python supports the ability to execute handlers within the Python debugger (pdb) via the PythonEnablePdb Apache directive. Since the debugger is an interactive tool, httpd must be invoked with the -X option. (NB: When pdb starts, you will not see the usual ">>>" prompt. Just type in the pdb commands as you would if there were one.)
Internal Callback Object
The Apache server interfaces with the Python interpreter via a callback object obCallBack. When a subinterpreter is created, an instance of obCallBack is created in this subinterpreter. Interestingly, obCallBack is not written in C, it is written in Python and the code for it is in the apache module. Mod_python only uses the C API to import apache and then instantiate obCallBack, storing a reference to the instance in the interpreter dictionary described above. Thus, the values in the interpreter dictionary are callback object instances.
When a request handler is invoked by Apache, mod_python uses the obCallBack reference to call its method Dispatch, passing it the name of the handler being invoked as a string.
The Dispatch method then does the rest of the work of importing the user module, resolving the callable object in it and calling it, passing it a request object.
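The three steps attributed to Dispatch (import the user module, resolve the callable, call it with the request) can be sketched in a few lines. This is not the real obCallBack code: CallBackModel, its dispatch() signature and the default function name "handler" are assumptions made for illustration. How the real method gets from the handler name to the module is not described here, so the sketch takes a "module::function" string directly.

```python
# A sketch of the dispatch steps, not the real obCallBack code.
# CallBackModel, its dispatch() signature and the "module::function"
# parsing are assumptions made for illustration.
import importlib
import sys
import types


class CallBackModel:
    def dispatch(self, req, handler_spec, default_name="handler"):
        # Split "module::function"; fall back to a default function name.
        module_name, _, func_name = handler_spec.partition("::")
        # Step 1: import the user module.
        module = importlib.import_module(module_name)
        # Step 2: resolve the callable object in it.
        handler = getattr(module, func_name or default_name)
        # Step 3: call it, passing the request object.
        return handler(req)


# Register a throwaway module so the example is self-contained.
menu = types.ModuleType("menu")
menu.admin = lambda req: "admin menu for %s" % req
sys.modules["menu"] = menu

result = CallBackModel().dispatch("request-object", "menu::admin")
```

Because the callback object lives inside a subinterpreter, the imported user module and any state it keeps stay private to that subinterpreter.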
Last modified: Mon Sep 4 14:42:13 EDT 2000