[mod_python] Input filter

Tue Dec 26 17:09:29 EST 2006

On 26/12/2006, at 11:40 PM, export at hope.cz wrote:

> Graham,
> Thank you for your reply.
> I studied the example at
> http://www.modpython.org/pipermail/mod_python/2006-April/020870.html
>
> Now I understand a little more.
>
> There an author uses
> streambuffer = filter.req.streambuffer
>
> So, I think that streambuffer is a method of request(req) object.
> Am I right?

No.

An important thing to know about filters is that the filter function can
be called more than once for the same request. That is, it processes
data in chunks.

Also, for each of those calls of the filter function, although the
'filter' object is a distinct object each time, the 'filter.req' object
is the same for all calls made for the same request. As a consequence,
it is possible to cache data within the 'filter.req' object so that it
is available across invocations of the filter function.

With that in mind, look at the code again and you will see that within
the 'try' block the filter function tries to access
'filter.req.streambuffer'. That will fail on the first call of the
filter function for the request as that attribute will not exist. As a
result, the 'except' block is run which initialises the attribute with
an instance of a StringIO object.

Having ensured that the attribute exists, on each call to the filter
function the chunks of data are added to the StringIO object instance
until no more data is found. At that time the data in the StringIO
object is turned back into one string and processed.
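
As a rough sketch of that pattern (not the code from the linked post):
the function below caches a buffer on 'filter.req' and only processes
the data once 'filter.read()' returns None. It uses BytesIO where the
original used StringIO, the upper() call is just a placeholder for the
real processing, and StubFilter and StubRequest are made-up stand-ins
so the flow can be tried outside Apache.

```python
from io import BytesIO


def buffering_filter(filter):
    # First call for this request: the attribute is missing, so create it.
    try:
        streambuffer = filter.req.streambuffer
    except AttributeError:
        streambuffer = filter.req.streambuffer = BytesIO()

    # Append whatever data is available on this call.
    chunk = filter.read()
    while chunk:
        streambuffer.write(chunk)
        chunk = filter.read()

    # None means end of stream. An empty string only means "nothing more
    # for now", in which case the filter is expected to be called again.
    if chunk is None:
        data = streambuffer.getvalue()
        filter.write(data.upper())  # placeholder for the real processing
        filter.close()


class StubFilter(object):
    """Hypothetical stand-in for the mod_python filter object."""

    def __init__(self, req, chunks):
        self.req = req
        self.chunks = list(chunks)
        self.written = []
        self.closed = False

    def read(self):
        return self.chunks.pop(0)

    def write(self, data):
        self.written.append(data)

    def close(self):
        self.closed = True


class StubRequest(object):
    """Hypothetical stand-in for the request object shared by all calls."""


# Two calls for one request: a new filter object each time, same req.
req = StubRequest()
first = StubFilter(req, [b"hello ", b""])
buffering_filter(first)
second = StubFilter(req, [b"world", None])
buffering_filter(second)
```

The first call only stores its chunk. Nothing is written downstream
until the second call sees the end of the stream.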

In the output filter examples all processing of the data is done in
memory. This isn't what you want, so you would instead replace the
StringIO object held in 'filter.req.streambuffer' with an open file
handle, write each chunk of data out to disk as it arrives, and close
the handle when done.
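
A minimal sketch of that substitution, under some assumptions: the
attribute names 'upload_handle' and 'upload_path' and the use of
tempfile.mkstemp() are my own choices for illustration, and StubFilter
and StubRequest are made-up stand-ins so it can be run outside Apache.
Only 'filter.req', 'filter.read()' and 'filter.close()' are the real
filter API here.

```python
import os
import tempfile


def spool_to_disk_filter(filter):
    req = filter.req
    # First call for this request: open the file and cache the handle.
    try:
        handle = req.upload_handle
    except AttributeError:
        fd, req.upload_path = tempfile.mkstemp(prefix="upload-")
        handle = req.upload_handle = os.fdopen(fd, "wb")

    # Each chunk goes straight to the file, so at most one chunk is
    # held in memory by this function at any time.
    chunk = filter.read()
    while chunk:
        handle.write(chunk)
        chunk = filter.read()

    # None means end of stream: finish the file and close the filter.
    if chunk is None:
        handle.close()
        filter.close()


class StubFilter(object):
    """Hypothetical stand-in for the mod_python filter object."""

    def __init__(self, req, chunks):
        self.req = req
        self.chunks = list(chunks)
        self.closed = False

    def read(self):
        return self.chunks.pop(0)

    def close(self):
        self.closed = True


class StubRequest(object):
    """Hypothetical stand-in for the request object shared by all calls."""


req = StubRequest()
spool_to_disk_filter(StubFilter(req, [b"part one, ", b""]))
last = StubFilter(req, [b"part two", None])
spool_to_disk_filter(last)

with open(req.upload_path, "rb") as saved:
    contents = saved.read()
os.remove(req.upload_path)
```

Note that this sketch never calls 'filter.write()', so nothing is passed
on to the handler. How the handler then learns where the file is has
been left out.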

Writing out to disk was what Tramline was doing, with it then only
passing through a header to the subsequent request handler containing
a value to identify the file on disk.

> Is there a description of all request (req) methods? I could not
> find it in the mod_python doc.

Those you need to know about are:

   http://www.modpython.org/live/current/doc-html/pyapi-mprequest.html

If you can't find it in there, it probably isn't actually an attribute
or method.

> Is there any advantage to use
> filter.req.streambuffer
> instead of
> filter.read()  ?

Read over the description above and check the code again. The
'filter.read()' method is still used, with data being accumulated in
'filter.req.streambuffer' as described.

> Yet, after studying the filter examples,
> I do not understand how to verify that input data (from a POST
> request) is written directly to disk and not loaded into RAM first
> and only written to disk after the whole file is uploaded.
>
> Can you please explain?

You really need to dig through the code examples and understand how
filters work. If you can follow where each chunk of data goes in the
code, you should be able to answer that yourself. Obviously the output
filter examples you are looking at aren't even opening files on disk to
write to, so nothing can be getting to disk in those.

BTW, did you investigate the other option I pointed out in the response
to your first email some time back, i.e., the file callbacks in the
FieldStorage class? Or are you trying to use mod_python.publisher and
that was not an option?

Examples of how to use FieldStorage so as to intercept uploaded files
and have them sent direct to disk can be found at:

   http://www.modpython.org/live/current/doc-html/pyapi-util-fstor-examples.html
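
To give a rough idea of the shape of it, here is a sketch along the
lines of those examples, not a copy of them. It assumes the
'file_callback' argument of util.FieldStorage is called with the
filename supplied by the client and must return a file-like object.
The UploadDir class, its open_upload() method and the directory path
are hypothetical names of my own.

```python
import os
import tempfile


class UploadDir(object):
    """Hypothetical helper that creates upload files in one directory."""

    def __init__(self, directory):
        self.directory = directory

    def open_upload(self, advisory_filename):
        # The client-supplied name is untrusted, so only its extension
        # is kept. The rest of the name is generated by tempfile.
        extension = os.path.splitext(os.path.basename(advisory_filename))[1]
        return tempfile.NamedTemporaryFile(
            suffix=extension, dir=self.directory, delete=False)


def handler(req):
    # Imported here because mod_python is only importable inside Apache.
    from mod_python import apache, util

    uploads = UploadDir("/var/tmp/uploads")  # hypothetical directory
    form = util.FieldStorage(req, file_callback=uploads.open_upload)
    return apache.OK


# Trying the callback on its own, outside Apache:
demo_dir = tempfile.mkdtemp()
upload = UploadDir(demo_dir).open_upload("../../report.pdf")
upload.write(b"%PDF")
upload.close()
```

With this approach FieldStorage does the parsing of the POST body and
you only decide where each uploaded file is created, so no input filter
is needed.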

BTW, are you the same person as 'Lad' posting on comp.lang.python
asking about this? I don't want to be answering in two different places
if you are the same person.

Graham

> Thank you
> La.
>
>
> >
> > I specifically said in my email:
> >
> >    Just because Tramline is intended more for cases where Apache is
> >    front ending a backend application doesn't mean it isn't a valid
> > example
> >    of input filters, you would just need to adapt the ideas to your
> > needs.
> >
> > So I already said that it isn't exactly what you require. What you
> > should do is read through the code and the mod_python documentation
> > and learn from it so you can work out how you can customise it to do
> > what you want. You do not need to be using mod_proxy to use input
> > filters, Tramline was using it for specific reasons of its own.
> >
> > The relevant part of the mod_python documentation is:
> >
> >   http://www.modpython.org/live/current/doc-html/pyapi-filter.html
> >
> > The example is for an output filter, but input filters work almost
> > the same.
> >
> > Also look back through past mailing list posts by using the search
> > box on the mod_python web site. A slightly more complicated example
> > of an output filter is:
> >
> >    http://www.modpython.org/pipermail/mod_python/2006-April/020870.html
> >
> > Overall though, the Tramline code is the most complicated example of
> > an input filter I have seen.
> >
> > Graham
> >
> > >>> Can anyone give me an example of an Input filter?
> > >>> I would like to check if a file being uploaded is large and if so,
> > >>> I would like to write the file
> > >>> directly on hard disk not to memory.
> > >>
> > >> Did you look at Tramline as I suggested the last time you asked
> > >> about file uploads?
> > >>
> > >> Tramline is an input filter and also seeks to solve the problem of
> > >> how to handle large file uploads.
> > >>
> > >> The URL for Tramline is:
> > >>
> > >>    http://www.infrae.com/download/tramline
> > >>
> > >> Just because Tramline is intended more for cases where Apache is
> > >> front ending a backend application doesn't mean it isn't a valid
> > >> example
> > >> of input filters, you would just need to adapt the ideas to your
> > >> needs.
> > >>
> > >> Graham
> > >
>
