[mod_python] apache API and mod_python <-> protocol module integration

Dustin Mitchell dustin at ywlcs.org
Wed Jul 16 13:25:15 EST 2003


On Wed, Jul 16, 2003 at 01:59:58PM -0400, Gregory (Grisha) Trubetskoy wrote:
> This is not a problem of mod_python or Apache, this is the way sockets
> work. If your protocol is such that you know ahead of time how much to
> read (e.g. HTTP), then you can request this much data in one read.
> Otherwise, you must read in smallest unit sizes of your protocol, which
> (I'm guessing) is 1 byte in case of Z39.50.

This is not strictly true, as long as you're OK with possibly getting a
bit of *extra* data beyond a particular PDU (a little bit of buffering
can take care of that).  Apache has to do this to read the HTTP header,
which is not of a fixed length.
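
A minimal sketch of that buffering idea, in Python 3. The class name
DelimitedReader, the "\r\n\r\n" delimiter standing in for the end of a
PDU, and the 4096-byte chunk size are all assumptions made for
illustration; none of this is an Apache or mod_python API. It reads
whatever recv() hands back, and keeps any bytes past the delimiter for
the next call:

```python
import socket


class DelimitedReader:
    """Hypothetical helper: reads delimiter-terminated messages from a socket."""

    def __init__(self, sock, delimiter=b"\r\n\r\n", chunk_size=4096):
        self.sock = sock
        self.delimiter = delimiter
        self.chunk_size = chunk_size
        self.buffer = b""  # bytes received but not yet handed to the caller

    def read_message(self):
        """Return the next message without its delimiter, or None at EOF."""
        while self.delimiter not in self.buffer:
            chunk = self.sock.recv(self.chunk_size)  # may return fewer bytes
            if not chunk:
                return None  # peer closed before a complete message arrived
            self.buffer += chunk
        # Split off one message; anything after the delimiter stays buffered.
        message, _, self.buffer = self.buffer.partition(self.delimiter)
        return message


def demo_delimited():
    # socketpair() gives two connected stream sockets, so no port is needed.
    writer, reader_sock = socket.socketpair()
    writer.sendall(b"first header\r\n\r\nsecond header\r\n\r\npartial")
    writer.close()
    reader = DelimitedReader(reader_sock)
    messages = [reader.read_message() for _ in range(3)]
    reader_sock.close()
    return messages
```

Here the third read_message() call returns None, because the trailing
"partial" bytes are never followed by a delimiter before the peer
closes.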

A blocking socket read blocks only when *no* data is available.  When
data is available, recv() returns *up to* the quantity specified in the
call: if some smaller amount has arrived, that data is returned
immediately rather than waiting for the rest.

To demonstrate, I hacked this together:

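from socket import *
s = socket(AF_INET, SOCK_STREAM)
s.bind(("", 9900))        # port 9900 on all interfaces
s.listen(5)
while 1:
  ss, sa = s.accept()     # ss: connected socket, sa: client address
  print "Connect from '%s'" % (sa,)
  while 1:
    data = ss.recv(100)   # returns *up to* 100 bytes
    if not data: break    # empty string: the client closed the connection
    print " Received: %s" % `data`
  ss.close()
  print "Connection closed from '%s'" % (sa,)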

I then connected with telnet and typed "Hello" and "World" on separate
lines.  Telnet, by default, buffers input until a newline, so each line
is sent as a unit, but you will note that neither of the two returns
from ss.recv() contained 100 bytes of data:

Connect from '('127.0.0.1', 56260)'
 Received: 'Hello\r\n'
 Received: 'World\r\n'
Connection closed from '('127.0.0.1', 56260)'
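
The same short read can be reproduced without telnet.  This is only a
sketch in Python 3, and it assumes socket.socketpair() as a stand-in
for a real client and server; the function name is made up for the
example.  One side sends five bytes, the other asks for up to 100:

```python
import socket


def demo_short_read():
    # Two already-connected stream sockets, standing in for client and server.
    sender, receiver = socket.socketpair()
    sender.sendall(b"Hello")
    # Ask for up to 100 bytes; recv() does not wait for the other 95.
    data = receiver.recv(100)
    sender.close()
    receiver.close()
    return data
```

The call returns as soon as the five bytes are available, without
blocking until 100 have been received.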

Further, a client-side loop such as:
  for i in 'hello world': s.send(i)
(where s is a connected client socket, not the listening socket above)
will produce output like this:

Connect from '('127.0.0.1', 56277)'
 Received: 'h'
 Received: 'ello world'
Connection closed from '('127.0.0.1', 56277)'

Either the client's or the server's socket implementation is buffering
the characters because they arrive so quickly, which cuts down on the
number of recv() calls that must be made.
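
Because the chunk boundaries seen by recv() need not match the send()
calls, a receiver that does know how many bytes to expect still has to
loop until it has them all.  A rough Python 3 sketch follows; the helper
name recv_exactly and the use of socket.socketpair() in place of a real
connection are assumptions for illustration:

```python
import socket


def recv_exactly(sock, n):
    """Hypothetical helper: keep calling recv() until n bytes have arrived."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)  # may return anywhere from 1 to remaining bytes
        if not chunk:
            raise EOFError("connection closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo_coalesced():
    client, server = socket.socketpair()
    # Send one byte per send() call, as in the loop above.
    for byte in b"hello world":
        client.send(bytes([byte]))
    client.close()
    data = recv_exactly(server, 11)
    server.close()
    return data
```

However the eleven bytes happen to be split across recv() calls, the
reassembled result is the same.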

I can't speak for the Apache sockets API, or the Python interface
thereto, but sockets themselves *do* support this interface (they'd be
pretty useless if they didn't!).  I highly recommend W. Richard
Stevens' TCP/IP Illustrated for an in-depth and very readable
discussion of this and many other crucial implementation details of
TCP/IP.

Dustin

-- 

  Dustin Mitchell
  dustin at ywlcs.org/djmitche at alumni.uchicago.edu
  http://people.cs.uchicago.edu/~dustin/

