[mod_python] streaming tar and/or zip files?

Mike Looijmans nlv11281 at natlab.research.philips.com
Fri Jul 6 01:21:59 EDT 2007


Yes, you have to set the number of bytes to be written, because each tar member header must contain 
the member's size, and that header is written before the data. You can't cheat there. If you don't 
know the size in advance, I think you're stuck with either inventing your own archive format, picking 
some other method of sending the data, or explaining more of what you are trying to accomplish so 
that we can help you with that.
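
If the size can be determined before writing (for example by generating the content twice, once to 
count bytes and once to send them), something along these lines might work. This is only a sketch 
under assumptions: it uses Python 3, the names ChunkReader, generate and stream_tar are made up for 
illustration, and io.BytesIO stands in for the mod_python req object.

```python
import io
import tarfile


class ChunkReader:
    """Minimal file-like wrapper exposing read(n) over an iterator of bytes chunks."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buf = b""

    def read(self, n=-1):
        # Pull chunks until n bytes are buffered or the iterator is exhausted.
        while n < 0 or len(self._buf) < n:
            try:
                self._buf += next(self._chunks)
            except StopIteration:
                break
        if n < 0:
            data, self._buf = self._buf, b""
        else:
            data, self._buf = self._buf[:n], self._buf[n:]
        return data


def generate(name):
    # Hypothetical stand-in for dynamically generated content.
    for i in range(3):
        yield ("%s line %d\n" % (name, i)).encode("ascii")


def stream_tar(out, names):
    # "w|" is the uncompressed stream mode: sequential writes only, no seeking on `out`.
    with tarfile.open(fileobj=out, mode="w|") as tar:
        for name in names:
            # First pass: count the bytes so the header can carry the size.
            size = sum(len(chunk) for chunk in generate(name))
            info = tarfile.TarInfo(name=name)
            info.size = size
            # Second pass: regenerate the content and stream it into the archive.
            tar.addfile(info, ChunkReader(generate(name)))


out = io.BytesIO()  # assumption: stands in for the mod_python req object
stream_tar(out, ["file_1.txt", "file_2.txt"])
```

The two-pass approach only makes sense if the content is deterministic and cheap enough to generate 
twice; addfile() never touches the file system here, it only calls read() on the object it is given.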

Mike Looijmans
Philips Natlab / Topic Automation


Matthew Dennis wrote:
> Can you please give an example? I don't see a way to do it from 
> looking at the docs.  It looks like there are only two methods for 
> getting data into the stream, add() and addfile(), and both require the 
> data being sent to be in actual files; neither appears to let me stream 
> data to them.  I can't use add() because it expects the name of an 
> actual file to add.  I can't use addfile() because the fileobj that is 
> passed in requires TarInfo.size to be set to the number of bytes to be 
> read, and even if that weren't the case, it tries to make calls using the 
> name of the file (which again doesn't exist).
> 
> 
> On 7/3/07, *Mike Looijmans* < nlv11281 at natlab.research.philips.com 
> <mailto:nlv11281 at natlab.research.philips.com>> wrote:
> 
>     The "tarfile" built-in Python module can do as you ask, and also do
>     compression (gzip or bzip2). ZIP cannot be "streamed", it has a sort
>     of directory table inside to which it jumps up and down.
> 
>     You can just open a tarfile object with the req object as file output.
> 
>     Setting content-type to "application/x-tar" or something similar
>     will help your client figure out what to do with the file.
> 
>     I've used the tarfile module with archives over 800GB in size,
>     without problems.
> 
> 
>     Mike Looijmans
>     Philips Natlab / Topic Automation
> 
> 
>     Matthew Dennis wrote:
>      > This may not be the appropriate list, my apologies if that is indeed
>      > the case...
>      >
>      > I need to push from mod_python a file archive of sorts.  Any of tar,
>      > tar.gz or zip will work (I'm also open to other suggestions).  The
>      > requirement is that I can package several other "files" into it,
>      > compressed or not.  I have no disk to write to and finite memory to
>      > deal with but the "files" I'm outputting are quite large (much larger
>      > than can fit in memory), thus I need to stream them directly out.  The
>      > reason I put "files" in quotes is because they are not really files,
>      > but dynamic content that is generated on the fly.  ZipOutputStream
>      > from Java is an example of what I'm talking about if anyone is
>      > familiar with it, as are other such streams.  I don't know the size of
>      > the content until after it is all generated so there is no way to
>      > populate a TarInfo object with the size et cetera and make my dynamic
>      > content look like a file (which it doesn't seem I can do anyway as
>      > gettarinfo() tries to call .fileno() which doesn't exist on any
>      > file-like object I would create).
>      >
>      > In other words, something like:
>      >
>      > output = TarFileStream(someFileObj)
>      >
>      > output.addEntry("file_1.txt")
>      > output.write( /*some dynamically generated content */ )
>      > output.write( /*some dynamically generated content */ )
>      > output.write( /*some dynamically generated content */ )
>      >
>      > output.addEntry("file_2.txt")
>      > output.write( /*some dynamically generated content */ )
>      > output.write( /*some dynamically generated content */ )
>      > output.write( /*some dynamically generated content */ )
>      >
>      > ...
>      >
>      > output.addEntry("file_N.txt")
>      > output.write( /*some dynamically generated content */ )
>      > output.write( /*some dynamically generated content */ )
>      > output.write( /*some dynamically generated content */ )
>      >
>      > output.close()
> 
> 
> 



