[mod_python] Re: Chunked encoding

Sat Feb 18 19:09:03 EST 2006

On 19/02/2006, at 9:14 AM, Dan Eloff wrote:

> I'm curious about this "chunked encoding" what does that mean? I  
> output all my pages in my handler as a single string, so I have a  
> sort of interest in understanding what the write method does. If  
> I'm to take a wild guess at how the below works, you set the  
> content length so that mod_python realizes when it receives your  
> data string that no more is coming and it doesn't need to copy the  
> string into a buffer, it can just keep a reference to it? And  
> somehow the act of buffering the data as opposed to writing it  
> immediately causes mod_python not to employ this "chunked encoding"?
>
> It'd be great if someone explained this to me :)
>
> Thanks,
> -Dan
>
> On 2/15/06, Lars Eriksen <downgrade at gmx.org> wrote:
> Yes, there is :-) RTFM, I guess ...
>
> from mod_python.util import *
> from mod_python import apache
>
> def handle(req):
>     data = 'No chunking involved.' * 1024
>     req.set_content_length(len(data))
>     req.write(data, 0)
>     req.flush()
>     return apache.OK

As far as I can tell, there is no difference between that and:

def handler(req):
     data = 'No chunking involved.' * 1024
     req.set_content_length(len(data))
     req.write(data)
     return apache.OK

When no second argument is supplied to req.write(), it flushes
automatically, since the flush argument defaults to 1.

In this particular case, since there is only one call to req.write(), it
is also probably equivalent to:

def handler(req):
     data = 'No chunking involved.' * 1024
     req.set_content_length(len(data))
     req.write(data, 0)
     return apache.OK

That is, returning apache.OK has the same effect of flushing the buffered
data, if there is any.

FWIW, chunked encoding is only relevant to HTTP/1.1 clients. You, as the
provider of the handler, do not really have to care about it, as Apache
will worry about it and use it if the client is capable of handling
HTTP/1.1. That is, mod_python doesn't do anything about chunked encoding
either.
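
To make the term concrete, here is a minimal sketch of what a chunked body
looks like on the wire, assuming only the framing that HTTP/1.1 defines:
each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a
zero-sized chunk ends the body. The name encode_chunked is made up for
illustration; it is not part of mod_python or Apache, which do this for
you.

```python
def encode_chunked(pieces):
    """Frame an iterable of byte strings as an HTTP/1.1 chunked body.

    Hypothetical helper for illustration only; Apache does this itself.
    """
    out = []
    for piece in pieces:
        if not piece:
            # A zero-length chunk would signal end of body, so skip empties.
            continue
        out.append(b"%x\r\n" % len(piece))  # chunk size in hexadecimal
        out.append(piece)
        out.append(b"\r\n")
    out.append(b"0\r\n\r\n")  # last-chunk marker, no trailers
    return b"".join(out)


body = encode_chunked([b"Hello, ", b"world!"])
```

Because each chunk carries its own size, the server can start sending
before it knows the total length, which is why no Content-Length header is
needed in that case.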

As to the connection between req.set_content_length() and how much data
you write, there isn't really any. Calling req.set_content_length() only
has the effect of setting the "Content-Length" response header. You could
set the content length to a small value and write more data than you say
there will be, and mod_python/Apache will quite happily let you do it. A
remote client, on the other hand, may well discard any extra content if it
honours the content length header.
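
As a rough illustration of the client side, here is a toy sketch of a
client that honours the content length header. The function read_body and
the sample response are hypothetical, and the parsing assumes a complete,
well-formed response held in a single byte string, which a real client
cannot rely on.

```python
def read_body(raw_response):
    """Return the body a Content-Length-honouring client would keep.

    Hypothetical toy parser: assumes the whole response is in one byte
    string and that any Content-Length header is well formed.
    """
    head, _, rest = raw_response.partition(b"\r\n\r\n")
    length = None
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip().decode("ascii"))
    # Without the header, keep everything; with it, stop at the stated length.
    return rest if length is None else rest[:length]


raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"Hello, this part was never promised")
kept = read_body(raw)
```

Here the handler promised 5 bytes but wrote more, and this particular
client simply never looks past the fifth byte.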

For most people, how many times you call req.write() and whether you use
buffering isn't going to matter one bit. Some may want to minimise the
number of calls to req.write() or use buffering for perceived performance
reasons (right or wrong).

One valid reason for not having req.write() flush data automatically,
though, is if you are using an output filter that wants to see the whole
response in one go. An example is the "CONTENT_LENGTH" filter. If you had
configured your handler output to go through this filter, you could just
say:

def handler(req):
     data = 'No chunking involved.' * 1024
     req.write(data, 0)
     req.write(data, 0)
     ....
     return apache.OK

and the output filter would add the content length header for you
automatically. For this to work, you have to ensure that flush isn't
called explicitly or implicitly.
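
Why an early flush defeats this can be sketched with a deliberately
simplified model. The ToyResponse class below is hypothetical and is not
how Apache's filter is implemented; it only assumes that headers go out
with the first flush, so the length can be filled in only if the whole
body is still buffered at that point.

```python
class ToyResponse:
    """Hypothetical stand-in for a response pipeline; not the Apache API."""

    def __init__(self):
        self.headers = {}
        self.headers_sent = False
        self.pending = []  # data buffered but not yet sent
        self.sent = []     # data that has gone to the client

    def write(self, data, flush=True):
        self.pending.append(data)
        if flush:
            self.flush()

    def flush(self, end_of_response=False):
        if not self.headers_sent:
            # Headers go out with the first flush. Only if that flush is
            # also the end of the response is the full length known in time.
            if end_of_response:
                total = sum(len(piece) for piece in self.pending)
                self.headers["Content-Length"] = str(total)
            self.headers_sent = True
        self.sent.extend(self.pending)
        self.pending = []

    def finish(self):
        # Models the handler returning: whatever is buffered gets flushed.
        self.flush(end_of_response=True)


data = b"No chunking involved." * 1024

buffered = ToyResponse()
buffered.write(data, flush=False)
buffered.write(data, flush=False)
buffered.finish()   # length header gets set

flushed = ToyResponse()
flushed.write(data)  # implicit flush sends headers without a length
flushed.write(data)
flushed.finish()
```

In this model both responses deliver the same bytes; only the buffered one
ends up with a content length header.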

What are the underlying concerns that make you think you need to
understand this better?

Graham