[mod_python] CGI to mod_python -- what's the best way?

Victor Muslin victor at prodigy.net
Fri Apr 13 02:35:02 EST 2001


At 10:18 AM 4/12/2001 +1000, you wrote:
> Wouldn't it be better just to buffer the output and then just use one
> output to send the entire buffer? Isn't that what Grisha was suggesting in
> the first place?

Yes, this is what he is suggesting and exactly what I was trying to avoid 
in the first place. Also, while not being a CGI/Web guru, I can imagine a 
number of potential problems with having to buffer all of the output before 
sending it to the browser. First, the output may be large and it may not be 
practical to assemble it in memory. Second, if the dynamic output takes a 
long time to create, it may be useful to let the browser start rendering 
some of it before it is complete. Third, if the browser cancels the request, 
the only way to find out is to try to send the reply and get some sort of 
bad status. Imagine a script that does a set of time-consuming database 
queries to create the output. It would be useful to test whether the request 
was cancelled after each query by attempting to send something back (a space 
character perhaps) to see whether the socket is still open before doing the 
rest of the queries. Perhaps somebody could suggest how these scenarios 
could be handled with mod_python?
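
To make the third scenario concrete, here is a minimal sketch of the "send after each query, stop if the client is gone" idea. It is not mod_python code: `ClientGone`, `FakeClient` and `run_queries` are hypothetical names invented for illustration, and the assumption that a write to a closed connection raises an exception is exactly that, an assumption. In a real handler the `write` argument would presumably be something like the request object's write method, but how it reports a cancelled request would need to be checked. The sketch uses Python 3 syntax.

```python
class ClientGone(Exception):
    """Hypothetical error raised when the browser has dropped the connection."""


class FakeClient:
    """Stand-in for the browser: accepts `limit` writes, then hangs up."""

    def __init__(self, limit):
        self.limit = limit
        self.received = []

    def write(self, data):
        if len(self.received) >= self.limit:
            raise ClientGone()
        self.received.append(data)


def run_queries(queries, write):
    """Run each query in turn and send its output as soon as it is ready.

    `queries` is a list of zero-argument callables returning a string.
    Returns the number of queries that were actually executed.
    """
    done = 0
    for query in queries:
        chunk = query()      # the time-consuming work happens here
        done += 1
        try:
            write(chunk)     # doubles as the "is the socket still open?" test
        except ClientGone:
            break            # client went away: skip the remaining queries
    return done
```

Nothing is accumulated in memory beyond one chunk, and the remaining queries are skipped as soon as a write fails.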

> I don't understand the problem with redirecting stdout. My understanding of
> mod_python is that it just keeps the interpreter running, so it reduces
> start-up time. I understood that each CGI session is still, as always, a
> separate session that fires up its own instance of the code. Is this not
> true?

I am using the "publisher" capability of mod_python. This is how I imagine 
it works (not having had the time to go through the code). There is a 
function -- call it handler() -- that handles requests. Let's say there are 
two identical concurrent requests. Both are handled by the same instance of 
the Python interpreter, which calls handler() for each one. All variables 
instantiated inside the handler() function are local to the function and, 
therefore, each request has its own instance of these variables. Variables 
that are module-level (in the module where the handler() function lives) or 
class-level are global, i.e. there is one instance of them in the 
interpreter and, therefore, they are shared by the requests. If this 
weren't the case, you couldn't open a database connection once, for example, 
and keep it open instead of re-opening it for each request. sys.stdout 
happens to be a global variable and is, therefore, shared by multiple 
invocations of handler() and consequently by the requests. If code in one 
request reassigns it, the code in the other concurrent requests is affected.
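
One way around the shared sys.stdout, sketched below under some loud assumptions: concurrent requests run as separate threads inside one interpreter, and thread-local storage is available (`threading.local` did not exist in the Python of 2001, so this is Python 3 syntax for illustration only). Instead of reassigning sys.stdout per request, a single proxy object is installed once and routes each write to a buffer owned by the calling thread. `ThreadLocalStdout`, `install` and `capture` are hypothetical names, not part of mod_python.

```python
import io
import sys
import threading


class ThreadLocalStdout:
    """File-like object that routes writes to a buffer owned by the calling thread."""

    def __init__(self, fallback):
        self._local = threading.local()
        self._fallback = fallback        # used when no capture is active

    def _target(self):
        return getattr(self._local, "buf", self._fallback)

    def begin(self):
        """Start capturing output for the current thread."""
        self._local.buf = io.StringIO()

    def end(self):
        """Stop capturing and return what the current thread printed."""
        buf = self._local.buf
        del self._local.buf
        return buf.getvalue()

    def write(self, text):
        return self._target().write(text)

    def flush(self):
        self._target().flush()


def install():
    """Replace sys.stdout once, at start-up, and return the proxy."""
    proxy = ThreadLocalStdout(sys.stdout)
    sys.stdout = proxy
    return proxy


def capture(proxy, func):
    """Call func() and return everything it printed in this thread."""
    proxy.begin()
    try:
        func()
    finally:
        text = proxy.end()
    return text
```

With this in place, a handler could be as short as `return capture(proxy, foo)`, leaving the print-based legacy code untouched. sys.stdout is assigned only once, so there is nothing for a second request to clobber.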

> Secondly, the idea of just rewriting something that works doesn't seem to
> be a good idea to me. I would say rewrite if you're finding you are doing a
> lot of maintenance on existing code; however, if the code works well and
> has been tested, deployed, etc., it would be better to interfere with it as
> little as possible unless you already know that current requirements will
> make a rewrite inevitable at some stage in the future (obviously I'm not
> just talking about a few lines of Python here).

I think you are making my point here. I did not want to re-write anything.


Wilson

"Gregory (Grisha) Trubetskoy" wrote:

 > Victor -
 >
 > Rather than invent ways to deal with legacy CGI code, I would bite the
 > bullet and rewrite the code without the use of "print". There are too many
 > subtle gotchas with simulating CGI...
 >
 > Grisha
 >
 > On Sun, 8 Apr 2001, Victor Muslin wrote:
 >
 > >
 > > Sorry for a long message, but this requires a bit of explanation. I
 > > appreciate your patience in advance.
 > >
 > > I have a bunch of python legacy code that used to be part of a large
 > > CGI-based system. This code simply used print statements to output HTML as
 > > follows:
 > >
 > > def foo():
 > >     print 'html1'
 > >     print 'html2'
 > >
 > > Now I want to convert CGI to mod_python, but I would like to re-use the
 > > legacy code with as little re-writing as possible (obviously the legacy
 > > code is a lot lengthier and more complicated than the example above). I am
 > > using the publisher module, which requires my code to return a string
 > > containing all of the HTML. So I thought I would be clever and do
 > > something like this:
 > >
 > > import sys, cStringIO
 > > def handler(req):
 > >     # sys.stdout is shared by every request in this interpreter
 > >     out = sys.stdout = cStringIO.StringIO()
 > >     foo()
 > >     return out.getvalue()
 > >
 > > This works great as long as the second request does not arrive before the
 > > first one is done. Otherwise, the output gets screwed up. Since "out" is a
 > > local variable, each request has its own instance, but sys.stdout is a
 > > global. When the second request arrives, sys.stdout gets reassigned and the
 > > rest of the output produced by print statements in the foo() function goes
 > > to the new StringIO object. For example, if the second request arrives and
 > > gets executed between the two print statements of the first request, then
 > > the first request's output could be 'html1\n' and the second request's
 > > output could be 'html2\nhtml1\nhtml2\n'.
 > >
 > > Has anyone dealt with such a situation? Any clever suggestion would be
 > > appreciated as I hate to have to go into all the legacy code and change it
 > > to something like this:
 > >
 > > def foo():
 > >     out = 'html1\n'
 > >     out = out + 'html2\n'
 > >     return out
 > >
 > > def handler(req):
 > >     return foo()
 > >
 > > Thanks in advance.
 > >
__________________________________________________________________________________
Victor Muslin      The power of accurate observation is frequently called
                          cynicism by those who don't have it.
                                       - George Bernard Shaw