[mod_python] AddHandler / SetHandler (black magic)

Graham Dumpleton grahamd at dscpl.com.au
Sat Oct 7 22:37:39 EDT 2006


On 07/10/2006, at 9:55 PM, Norman Tindall wrote:

> GD> Using AddHandler and other Apache directives is still the better
> GD> way. In mod_python 3.3 it will be easier to do things like above,
> GD> but it shouldn't be done in the response handler phase but an
> GD> earlier phase such as the fixup handler phase. I'll be writing an
> GD> article about this specific ability of mod_python 3.3 some time in
> GD> the near future hopefully.
>
> GD> Graham
>
>     The reason I am using this is that I am on mod_python 3.1
>     for now. I will upgrade to 3.3 as soon as possible.
>
>
>      I have a question regarding your last reply to my letter.
>      Suppose I want to process things like this:
>
>                     +-------------------+
>                     |   start of req    |
>                     +-------------------+
>                       /            \
>                      /              \
>      +-----------------------+  +------------------------+
>      | extension is:         |  |  All other extension   |
>      | ['py','html','css']   |  |                        |
>      | and without extension |  +------------------------+
>      | but there are no real |              |
>      |  files at all, they   |              |
>      | all redirected to say |              |
>      | engine.py             |              |
>      | so I don't want Apache|              |
>      | to check for existence|              |
>      | of file               |              |
>      +-----------------------+              |
>                |                     Apache standard
>          Python handler                  Handler
>                |                            |
>      ------------------------     ------------------------
>      In some cases script           Apache generates all
>      generates error pages          error pages
>      like 404,403,500
>      But in other cases
>      i want apache standard
>      error pages.
>
>
> What is the best, simplest and fastest way?
> Can I do this with just Apache directives, or do I have to parse the URI
> and extension myself in a fixuphandler as you said?
>
>    import posixpath
>    from mod_python import apache, psp
>
>    def fixuphandler(req):
>        # splitext() returns the extension with its leading dot.
>        extension = posixpath.splitext(req.filename)[1]
>        if extension in ['.py', '.html', '.css']:
>            req.add_handler('PythonHandler', my_handler)
>            req.handler = 'mod_python'
>        return apache.OK
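
One detail worth checking if you do go the fixup handler route:
posixpath.splitext() returns the extension with its leading dot, and an
empty string when there is none. A minimal sketch using only the standard
library (the helper name wanted() and the sample paths are made up for
illustration):

```python
import posixpath

# Extensions as splitext() reports them; '' covers names with no extension.
WANTED = ('', '.py', '.html', '.css')

def wanted(filename):
    """Return True if the last path component has an extension of interest."""
    return posixpath.splitext(filename)[1] in WANTED

print(wanted('/docs/page.html'))   # extension is '.html'
print(wanted('/docs/page'))        # extension is ''
print(wanted('/docs/logo.png'))    # extension is '.png'
```

Comparing against 'py', 'html' and 'css' without the dot would never match.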

For mod_python 3.2 and earlier use Apache configuration of:

   # Files which have no extension.

   <Files ~ '^[^.]*$'>
   SetHandler mod_python
   </Files>

   # Files with extension of interest.

   <Files ~ '^.*\.(css|py|html)$'>
   SetHandler mod_python
   </Files>

   PythonHandler _handlers::handler
   PythonHandler _handlers::handler_css | .css
   PythonHandler _handlers::handler_py | .py
   PythonHandler _handlers::handler_html | .html

Note this is for a .htaccess file, but it can also go inside a Directory
directive.
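
To see which names the two Files patterns pick up, the same regular
expressions can be tried out in plain Python. This is only a rough sketch:
Apache applies a Files pattern to the last component of the filename using
its own regex library, and pick_handler() is a name made up here, but for
simple patterns like these Python's re module should give the same answers.

```python
import re

NO_EXTENSION = re.compile(r'^[^.]*$')
WITH_EXTENSION = re.compile(r'^.*\.(css|py|html)$')

# Mirrors the per-extension handlers named in the configuration above.
HANDLERS = {'css': 'handler_css', 'py': 'handler_py', 'html': 'handler_html'}

def pick_handler(basename):
    """Return the handler name for a file name, or None if nothing matches."""
    if NO_EXTENSION.search(basename):
        return 'handler'
    match = WITH_EXTENSION.search(basename)
    if match:
        return HANDLERS[match.group(1)]
    return None

for name in ('page', 'page.css', 'page.py', 'page.html', 'page.gif'):
    print('%r -> %s' % (name, pick_handler(name)))
```

A name such as 'page.gif' matches neither pattern, so mod_python is never
involved and Apache's default handling applies.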

If using mod_python 3.3 I might have made it more constrained by using:

   <Files ~ '^[^.]*$'>
   SetHandler mod_python
   PythonHandler _handlers::handler
   </Files>

   <Files ~ '^.*\.(css|py|html)$'>
   SetHandler mod_python
   PythonHandler _handlers::handler_css | .css
   PythonHandler _handlers::handler_py | .py
   PythonHandler _handlers::handler_html | .html
   </Files>

I.e., put the PythonHandler directives inside the Files directive. This
will not work on mod_python 3.2 and earlier, though, if the Python handler
module resides in the document tree. If the Python handler module is
elsewhere on sys.path, then you can do this for 3.2 and earlier. The
reasons for the problems are documented at:

   http://issues.apache.org/jira/browse/MODPYTHON-126

This is fixed in 3.3 though.

With a Python handler module of:

   from mod_python import apache

   def _dump(req, extension):
     req.content_type = 'text/plain'
     print >> req, 'uri = %s' % req.uri
     print >> req, 'filename = %s' % req.filename
     print >> req, 'path_info = %s' % req.path_info
     print >> req, 'extension = %s' % extension
     return apache.OK

   def handler(req):
     return _dump(req, '')

   def handler_css(req):
     return _dump(req, '.css')

   def handler_py(req):
     return _dump(req, '.py')

   def handler_html(req):
     return _dump(req, '.html')

If this directory is accessed as '/~grahamd/handlers/' and I use that as
the URL I get:

   uri = /~grahamd/handlers/
   filename = /Users/grahamd/public_html/handlers/
   path_info =
   extension =

Thus handler() was called. The same happens if I use '/~grahamd/handlers/page':

   uri = /~grahamd/handlers/page
   filename = /Users/grahamd/public_html/handlers/page
   path_info =
   extension =

If I use any of the extensions of interest, I get something like:

   uri = /~grahamd/handlers/page.css
   filename = /Users/grahamd/public_html/handlers/page.css
   path_info =
   extension = .css

   uri = /~grahamd/handlers/page.py
   filename = /Users/grahamd/public_html/handlers/page.py
   path_info =
   extension = .py

   uri = /~grahamd/handlers/page.html
   filename = /Users/grahamd/public_html/handlers/page.html
   path_info =
   extension = .html

Thus a different handler gets called for each extension type.

If I use any other extension, I get a 404 Not Found, unless the file
actually exists as a static file, in which case it is returned.

Now, there is one issue with doing it as above. That is that the Files
directive only matches the first component of the path in req.filename
appearing after the actual physical directory. Thus, if I use a URL of
'/~grahamd/handlers/subdir/page.html', I will get:

   uri = /~grahamd/handlers/subdir/page.html
   filename = /Users/grahamd/public_html/handlers/subdir
   path_info = /page.html
   extension =

and the handler for the '.html' extension will not be called as you might
expect.

This is all to do with how Apache matches URLs against the physical
directory hierarchy and how it determines what actually constitutes the
path info for a request.

At some point you would have to deal with this yourself if you were to
take total control of mapping the URL to some target.

Anyway, to ensure that Apache does what is required in this case, all that
is required is to create the physical directories in the file system which
correspond to those directories which you want to notionally hold what
appear to be static files. Thus:

   mkdir /Users/grahamd/public_html/handlers/subdir

Now I get:

   uri = /~grahamd/handlers/subdir/page.html
   filename = /Users/grahamd/public_html/handlers/subdir/page.html
   path_info =
   extension = .html
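
The difference between the two results can be mimicked with a small model.
This is a deliberate simplification, not Apache's actual directory walk: it
only knows about a set of directories assumed to exist (split_request() and
existing_dirs are names invented for the example), keeps the first component
that is not an existing directory as part of the filename, and treats
everything after it as path info.

```python
def split_request(base_dir, relative_url, existing_dirs):
    """Return (filename, path_info) for a URL path relative to base_dir.

    existing_dirs is the set of directories assumed to exist on disk.
    """
    filename = base_dir.rstrip('/')
    parts = [p for p in relative_url.split('/') if p]
    for index, part in enumerate(parts):
        filename = filename + '/' + part
        if filename not in existing_dirs:
            # First component that is not a directory: the walk stops here
            # and whatever is left over becomes the path info.
            rest = parts[index + 1:]
            path_info = ('/' + '/'.join(rest)) if rest else ''
            return filename, path_info
    return filename, ''

base = '/Users/grahamd/public_html/handlers'

# Only the handlers directory exists.
print(split_request(base, 'subdir/page.html', {base}))

# The subdirectory exists as well.
print(split_request(base, 'subdir/page.html', {base, base + '/subdir'}))
```

In the first case the extension of the filename is empty, which is why the
'.html' handler is not selected until the subdirectory is created.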

A benefit of still relying on Apache to do this mapping, and of creating
the subdirectories, is that you could use a modified handler of:

   import os

   def handler_html(req):
     # Let Apache serve the file itself if it really exists on disk.
     if os.path.exists(req.filename):
       return apache.DECLINED
     return _dump(req, '.html')

That is, because req.filename is valid, you can simply check for the
existence of the file and, if it exists, return apache.DECLINED so that
Apache serves it up as a static file instead. Thus you can mix static and
dynamic files, with the static files taking precedence.

Note that using os.path.exists() here results in an extra stat() call, but
you have to do this with mod_python 3.2 and earlier as some information is
missing from req.finfo. In mod_python 3.3, you would instead use:

   def handler_html(req):
     if req.finfo.filetype == apache.APR_REG:
       return apache.DECLINED
     return _dump(req, '.html')
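
Outside of mod_python the same kind of test can be tried with the standard
library. The sketch below is only an analogy for the req.finfo check, and
is_regular_file() is a name made up for the example. It makes a single
stat() call and checks for a regular file, which is slightly stricter than
os.path.exists(), since that also returns true for directories.

```python
import os
import shutil
import stat
import tempfile

def is_regular_file(path):
    """Return True only if path exists and is a regular file (one stat() call)."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False
    return stat.S_ISREG(mode)

# Try it on a scratch directory holding one empty file.
directory = tempfile.mkdtemp()
page = os.path.join(directory, 'page.html')
with open(page, 'w'):
    pass

print(is_regular_file(page))
print(is_regular_file(directory))
print(is_regular_file(os.path.join(directory, 'missing.html')))

shutil.rmtree(directory)
```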

BTW, rather than modify the handler, I could have had a separate handler
whose only purpose was to check for static files:

   def check_exists(req):
     if req.finfo.filetype == apache.APR_REG:
       return apache.DECLINED
     return apache.OK

Then in the Apache configuration, use:

   PythonHandler _handlers::check_exists _handlers::handler_html | .html

or:

   PythonHandler _handlers::check_exists | .html
   PythonHandler _handlers::handler_html | .html

Both are equivalent.

I.e., use a stacked handler. When the first handler returns apache.DECLINED,
the latter is not invoked. This way you have little handlers doing specific
jobs and aren't creating one huge handler that tries to do everything.
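
The stacking behaviour described here can be modelled in a few lines of
plain Python. This is a toy dispatcher, not mod_python's own code: OK and
DECLINED are stand-ins for the apache constants, the request is a plain
dict, and run_stack() is a made-up name. It simply runs each handler in
turn and stops at the first one that returns something other than OK.

```python
OK = 0          # stand-in for apache.OK
DECLINED = -1   # stand-in for apache.DECLINED

def check_exists(req):
    # Decline when a static file is present so the server can deliver it.
    if req['is_static_file']:
        return DECLINED
    return OK

def handler_html(req):
    req['output'].append('dynamic page')
    return OK

def run_stack(handlers, req):
    """Call handlers in order; stop at the first result other than OK."""
    result = OK
    for handler in handlers:
        result = handler(req)
        if result != OK:
            break
    return result

stack = [check_exists, handler_html]

static_req = {'is_static_file': True, 'output': []}
dynamic_req = {'is_static_file': False, 'output': []}

print(run_stack(stack, static_req), static_req['output'])
print(run_stack(stack, dynamic_req), dynamic_req['output'])
```

For the static request handler_html() is never reached, so nothing is
written and the server is left to deliver the file.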

Hope this is interesting.

Graham

