[mod_python] Serving files via mod_python

Tue May 21 16:01:49 EST 2002

On Mon, May 20, 2002 at 09:05:31PM +0200, Hugo van der Merwe wrote:
> Hello,
> 
> I want to use mod_python to control what files can be downloaded by
> whom, and to log such file transfers (with my own code). One possibility
> is also to convert some files' contents on the fly.
> 
> I am currently using:
> 
> f = open(filename, "rb")
> while 1:
>   data = f.read(1048576)
>   if not data: break
>   req.write(data)
> 
> It seems to work fine. I just wonder if this is bad (performance wise, I
> wonder what the best "size" would be, instead of 1M). These are
> potentially *large* files I'm dealing with, things like cd images.

Transferring 1MB at a time is probably about as efficient as you can 
get; of course, you'd need to do some proper benchmarking to be sure. I 
believe that Apache defaults to chunked transfer-encoding for HTTP/1.1 
clients when no Content-Length is set, so every time you call 
req.write(), the data goes out immediately, with about 10 bytes of 
overhead (the chunk length in hex plus two CRLFs, roughly log16(n)+5 
bytes for an n-byte chunk).
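
To put a number on that, here is a small sketch of the per-chunk framing cost. It assumes plain HTTP/1.1 chunked encoding with no chunk extensions; chunk_overhead is just an illustrative name, not anything from mod_python.

```python
def chunk_overhead(n):
    """Framing bytes added to one n-byte chunk in HTTP/1.1 chunked encoding."""
    if n <= 0:
        raise ValueError("chunk size must be positive")
    hex_digits = len("%x" % n)  # the chunk-size line is the length in hex
    return hex_digits + 2 + 2   # CRLF after the size, CRLF after the data


for size in (4096, 65536, 1048576):
    print(size, chunk_overhead(size))
```

Even at 4KB per write the framing is only 8 bytes, so the chunk size matters far more for the number of calls than for bytes on the wire.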

Actually, most of your overhead will be in the TCP and IP headers, as 
each call to write() will probably result in roughly 700 IP packets 
being sent (and acknowledged) at a typical 1500-byte Ethernet MTU, and 
there's not much you can do about that.
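
The packet count depends on the segment size, so treat any figure as an estimate. A rough sketch, assuming a 1500-byte MTU and 20-byte IP and TCP headers with no options (packets_per_write is a made-up helper for illustration):

```python
def packets_per_write(nbytes, mtu=1500, ip_header=20, tcp_header=20):
    """Estimate how many IP packets carry nbytes of TCP payload."""
    payload = mtu - ip_header - tcp_header  # bytes of data per packet
    return -(-nbytes // payload)            # ceiling division


packets = packets_per_write(1048576)
header_bytes = packets * (20 + 20)
print(packets, header_bytes)
```

With these assumptions the headers cost tens of kilobytes per megabyte written, which dwarfs the chunked-encoding framing.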

> At a
> later date I guess I'll have to figure out how to do download resuming,
> etc. I'm sure Apache does a better job of this, maybe I should rather
> let Apache handle it,

Apache will only handle Range requests (which are what you need for 
download resuming) for static files. If you are using mod_python to 
serve the content, then you will have to handle the Range headers 
yourself.
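
As a starting point, here is a minimal sketch of parsing a single-range header such as "bytes=500-" or "bytes=-200". It deliberately ignores multi-range requests, and parse_range is a hypothetical helper, not part of mod_python.

```python
import re

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range(header, file_size):
    """Return (start, end), both inclusive, or None if the range is unusable."""
    m = _RANGE_RE.match(header.strip())
    if not m:
        return None  # malformed, or a multi-range request
    first, last = m.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the file.
        length = int(last)
        if length == 0:
            return None
        start = max(file_size - length, 0)
        end = file_size - 1
    else:
        start = int(first)
        end = int(last) if last else file_size - 1
        end = min(end, file_size - 1)  # clamp to the end of the file
    if start > end or start >= file_size:
        return None
    return start, end
```

When this returns a range, the handler would seek to start, send end - start + 1 bytes, and answer with status 206 and a "Content-Range: bytes start-end/size" header; on None it could fall back to sending the whole file.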

> how can I do this while controlling it all from
> modPython? I guess I could just define a PythonAuthenHandler, and let
> Apache do the actual serving? Does the PythonAuthenHandler have full
> access to just about everything a normal handler would? (URL, etc.)

That's a good way to do it, assuming that the files are already on the 
filesystem, and not being stored in a database or ripped in realtime. 
What you are looking for, though, is a PythonAuthzHandler 
(authorization handler). This is the handler which gets called *after* 
the user's name and password have been verified (by a .htpasswd file 
or some other method), and decides whether that user is allowed to 
download that file.

> 
> If there was some way to give Apache a file handle or something like
> that, from which it should serve the data...

You don't even need to do that; all you have to do is install the 
PythonAuthzHandler, and let Apache handle serving the actual content. 
The AuthzHandler will act like a 'filter' on an otherwise normal 
request/response.
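
A rough sketch of what such a handler could look like. The ALLOWED table, its paths and user names, and is_allowed are all made up for illustration, and it assumes a mod_python version where the authenticated name is available as req.user.

```python
# Hypothetical access table: URI prefix -> users allowed to fetch below it.
ALLOWED = {
    "/downloads/public/": {"alice", "bob"},
    "/downloads/images/": {"alice"},
}


def is_allowed(user, uri, table=ALLOWED):
    """Decide whether user may download uri; anything unlisted is denied."""
    for prefix, users in table.items():
        if uri.startswith(prefix):
            return user in users
    return False


def authzhandler(req):
    # Imported here so the policy function can be exercised outside Apache.
    from mod_python import apache

    if is_allowed(req.user, req.uri):
        return apache.OK  # Apache then serves the file itself
    return apache.HTTP_FORBIDDEN
```

You would point PythonAuthzHandler at the module containing this, alongside the usual AuthType and Require directives. Since Apache does the actual serving, Range requests and download resuming come for free, and your own logging can go inside the handler.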

Hope that helps,

Ian
<ian at veryfresh.com>