[mod_python] streaming zipped data with publisher

Graham Dumpleton grahamd at dscpl.com.au
Sat Jan 7 20:15:02 EST 2006


On 08/01/2006, at 8:55 AM, Sean Jamieson wrote:

> Sure it is, even with the temporary file option, all you have to do is 
> set the content-type header to application/x-gzipped or something like 
> that (application/octet-stream is always a fallback value; this forces 
> download, and doesn't allow it to recognize the file format for use 
> with an external application), then output the file (possibly setting 
> the Content-Length header first, to be polite)
>
> With or without the file is your choice; all you have to do is output 
> the binary data, and no other output (not even a single space) or you 
> will corrupt it.
>
> Daniel Nogradi wrote:
>
>> Is it possible to "stream" zipped data using the publisher handler?
>>
>> In my situation I would create the zipped archive upon client input
>> and want to return the archive immediately. What I have in mind is
>> creating a zip archive (using the zipfile module perhaps) in memory
>> and then sending that to the client without actually creating a zip
>> file on the disk of the server.
>>
>> In case this would not be possible I could live with creating a
>> temporary file and returning that to the client, but at the moment the
>> only way I can do this is through a link (such as <a
>> href='data.zip'>click here</a>)  but in this case it would be a 2-step
>> process: (1) user clicks on something which creates the archive and
>> sends back the link in html (2) user clicks on the link. Is it not
>> possible to send the zip file immediately after its creation?

I am not sure you are talking about the same thing here. The "zipfile"
module is for creating an archive containing other enclosed files.
This is different to merely compressing any output returned from a
handler.
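
To make the distinction concrete, here is a rough sketch that does both
things to the same bytes. The payload and the member names ("report.txt",
"notes.txt") are made up for illustration, and it assumes a Python version
that provides io.BytesIO; on older versions StringIO.StringIO plays the
same role.

```python
import gzip
import io
import zipfile

# Hypothetical stand-in for whatever a handler would return.
payload = b"some handler output\n" * 100

# Compressing a stream: the result is a single compressed blob. It
# carries no notion of multiple enclosed files.
gz_buffer = io.BytesIO()
gz = gzip.GzipFile(fileobj=gz_buffer, mode="wb")
gz.write(payload)
gz.close()
compressed = gz_buffer.getvalue()

# Building an archive: the result is a container of named members,
# each of which may or may not be compressed.
zip_buffer = io.BytesIO()
archive = zipfile.ZipFile(zip_buffer, "w", zipfile.ZIP_DEFLATED)
archive.writestr("report.txt", payload)
archive.writestr("notes.txt", b"a second member\n")
archive.close()
archived = zip_buffer.getvalue()

# Reading the archive back lists the enclosed member names.
members = zipfile.ZipFile(io.BytesIO(archived)).namelist()
```

The first form is what an output filter such as DEFLATE produces for a
response body. The second is what a user expects when downloading a
".zip" file.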

If this is only about compressing the output, doing it inside a
published function is possibly not a good idea anyway. A better way
would be to use the DEFLATE output filter implemented by the Apache
mod_deflate module.

This module is, however, only available from Apache 2.0 onwards and is
not compiled in by default, so you have to enable its inclusion when
"configure" is run prior to building Apache. Also, at the moment,
because mod_python doesn't allow dynamic registration of output filters
from a handler, the filter has to be enabled in the Apache
configuration. It would therefore apply to all requests in a directory,
or those returning a certain type of output, or possibly by URL if you
are tricky.

It should also be pointed out that mod_deflate will only compress output
where the client says it is willing to accept gzip'd data by having set
the "Accept-Encoding: gzip" request header. One can, though, set this
header from a handler to force compression anyway.

Anyway, the point I am trying to make is that there are perhaps better
tools for doing the job than trying to compress on the fly in Python
code running under mod_python, although technically one could write the
equivalent of mod_deflate in Python if one wanted to. If you did write
it in Python though, you would still be better off installing it as an
output filter. For it all to work nicely, you may have to wait until
mod_python 3.3, where I hope to get some improvements made to working
with filters.

   http://issues.apache.org/jira/browse/MODPYTHON-103
   http://issues.apache.org/jira/browse/MODPYTHON-104

Anyway, back to the original poster's question. If I read the code for
the "zipfile" module correctly, when you use the ZipFile class with "w"
mode to create a zip file, the first argument only needs to be a
file-like object. That is, it doesn't need to be an actual file on disk.
Instead it could be an instance of the StringIO class. I have not used
ZipFile to create an archive, but something like the following may work.

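   import zipfile
   import StringIO

   # In-memory file-like object that stands in for a file on disk.
   buffer = StringIO.StringIO()
   myzip = zipfile.ZipFile(buffer, "w")

   # ... build up zip

   # close() finishes writing the archive, so call it before reading
   # the buffer.
   myzip.close()

   data = buffer.getvalue()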

After this, your zip file as a string will be in data and you can then
return it as the response content.
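
As a rough sketch of how that might look inside a published function, and
not something I have run under mod_python: the function name "download",
the helper "build_zip_in_memory", the member name "hello.txt" and the
download name "data.zip" are all made up for illustration. It assumes the
request object offers a "content_type" attribute and a "headers_out"
table, and that the publisher sends the returned string as the response
body. It also uses io.BytesIO, which needs a Python version that provides
it; StringIO.StringIO as above would serve otherwise.

```python
import io
import zipfile


def build_zip_in_memory(files):
    """Return the bytes of a zip archive built from a dict that maps
    member names to their contents. Nothing is written to disk."""
    buffer = io.BytesIO()
    archive = zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED)
    try:
        for name, content in files.items():
            archive.writestr(name, content)
    finally:
        # close() finishes the archive; without it the result is
        # not a valid zip file.
        archive.close()
    return buffer.getvalue()


def download(req):
    # Hypothetical published function. The member below is a
    # placeholder for whatever is generated from the client input.
    data = build_zip_in_memory({"hello.txt": b"hello world\n"})

    # Assumed request attributes: content_type and headers_out.
    req.content_type = "application/zip"
    req.headers_out["Content-Disposition"] = 'attachment; filename="data.zip"'
    req.headers_out["Content-Length"] = str(len(data))

    # The returned string becomes the response content.
    return data
```

With this the archive is created and sent back in response to the one
request, so there is no need for the intermediate page containing a link.
The whole archive is held in memory though, so for large archives the
temporary file approach may still be preferable.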

Graham


