[mod_python] refreshing status-page / long-running process

Graham Dumpleton graham.dumpleton at gmail.com
Thu Jul 16 07:13:45 EDT 2009


2009/7/16  <shashy at web.de>:
> Hi all,
> I'm in the process of setting up a web server using mod_python. The user fills out a form, the data is passed to a Python script, and the result is presented to the user -- sounds pretty simple.
> In the first version, I used <form action="run.py" ...>. In the run.py script I get the parameters
> def index(req):
>  params = req.form
>
> and write something like
> req.content_type = "text/html"
>  req.write('''<html>
>  <head><title>Your request is being processed.</title>
>  ...''')
>
> After some preprocessing the data is finally passed to the actual pipeline:
>    pipe = apache.import_module(pipeline.py")
>    pipe.runme(some,params,here, req)
>
> This script writes status messages to req between the calculations it has to do. In the end the user gets a link to his result directory and the run.py script finishes with a return.
>
> That worked fine as a start, but it seems a bit awkward that the page keeps loading for the whole time the process needs to finish (~10 min). Mac's Safari by default times out after 1 min and stops loading, so I had to come up with something different.
>
>
> The next idea was to use a system call with a trailing ampersand, so the request does not wait for the process to finish:
>  os.system("python pipeline.py %s/ &" % jobdir)
> The downside is that I have to turn all the Python list and dict objects into text, or use the pickle module to pack them and load them again in the script. Also, I cannot send status messages from the long-running process.
>
>
> After that I tried to use os.fork(), but ended up with a lot of processes that were not finished properly, and high CPU usage.
>
> I read something about double-fork/detaching/daemonizing, but did not entirely get it and think there must be a simpler way to do it.
>
> It's quite common to have a page refresh every 10 seconds, checking for new results or status messages, while the time-consuming job is running. But I could not find a simple tutorial on how to do that with mod_python. Can someone help me out? I'd be very happy. Sorry if I wrote too much...
>
> I read some similar (but not quite identical) questions, but all the suggestions on them seem to have their own problems. Those were mostly old threads, though, and maybe something has happened in recent years so that there is now a safe and feasible way to do it.
>
>
> -- One last 'bonus' question: if the user decides (based on the intermediate results presented to him) that he does not want to wait for the analysis, he might leave the page or start over with a new analysis using different parameters. Is it possible to stop the running script then?
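[As a sketch of the refresh-every-10-seconds page asked about above: a small front-end handler can poll a backend over XML-RPC and answer with a page that reloads itself via a meta refresh. Everything here is illustrative, not from the thread -- the handler name index, the status method, the port, and the markup are assumptions, and xmlrpc.client is the Python 3 spelling of the era's xmlrpclib.]

```python
# Hypothetical sketch: a self-refreshing status page served by a
# mod_python publisher handler. The backend address and the "status"
# XML-RPC method are assumptions, not part of the original post.
import xmlrpc.client  # xmlrpclib under the Python 2 of the mod_python era

def status_page(job_id, status_text, refresh=10):
    """Build an HTML page that re-requests itself every `refresh` seconds."""
    return ('<html><head>'
            '<meta http-equiv="refresh" content="%d">'
            '<title>Job %s</title></head>'
            '<body><p>Status: %s</p>'
            '</body></html>') % (refresh, job_id, status_text)

def index(req, job_id):
    # req is the mod_python request object; ask the backend (assumed to be
    # listening on localhost:8001) how the job is doing and report it.
    backend = xmlrpc.client.ServerProxy("http://localhost:8001")
    req.content_type = "text/html"
    req.write(status_page(job_id, backend.status(job_id)))
```

The browser keeps re-fetching this page on its own, so no single request ever runs longer than one status lookup.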

Push the work to a backend server using an XML-RPC interface. The
request should just queue the work and return immediately. A separate
thread within the backend server would then process requests from the
queue. To track progress, the front end makes subsequent XML-RPC
requests for status. Subsequent requests could also cancel an
in-progress job, provided you give the worker thread a way of being
signalled that it should stop.
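[A minimal sketch of such a backend, under stated assumptions: the method names submit/status/cancel, the port, and the ten-step loop standing in for the real pipeline are all illustrative, and xmlrpc.server is the Python 3 spelling of the era's SimpleXMLRPCServer. Cancellation uses a threading.Event the worker checks between chunks of work.]

```python
# Hypothetical backend: an XML-RPC server that queues jobs, a worker
# thread that drains the queue, and methods for status and cancellation.
import queue
import threading
import time
import uuid
from xmlrpc.server import SimpleXMLRPCServer  # SimpleXMLRPCServer in Python 2

jobs = {}                  # job_id -> {"status": str, "cancel": Event}
work_queue = queue.Queue()

def submit(params):
    """Queue a job and return its id immediately, without doing the work."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "queued", "cancel": threading.Event()}
    work_queue.put((job_id, params))
    return job_id

def status(job_id):
    """Report the last status message recorded for the job."""
    job = jobs.get(job_id)
    return job["status"] if job else "unknown"

def cancel(job_id):
    """Signal the worker that the job should stop at its next checkpoint."""
    job = jobs.get(job_id)
    if job is None:
        return False
    job["cancel"].set()
    return True

def worker():
    """Process queued jobs one at a time, honouring cancellation."""
    while True:
        job_id, params = work_queue.get()
        job = jobs[job_id]
        job["status"] = "running"
        for step in range(10):          # stand-in for the real pipeline
            if job["cancel"].is_set():
                job["status"] = "cancelled"
                break
            time.sleep(0.1)             # ...one chunk of real work here...
            job["status"] = "running (step %d/10)" % (step + 1)
        else:
            job["status"] = "done"

threading.Thread(target=worker, daemon=True).start()

def serve():
    # Call serve() to expose the three methods; the mod_python front end
    # would then use xmlrpclib.ServerProxy("http://localhost:8001") to
    # queue work and poll for status from each short-lived request.
    server = SimpleXMLRPCServer(("localhost", 8001), allow_none=True)
    for fn in (submit, status, cancel):
        server.register_function(fn)
    server.serve_forever()
```

Because the worker only checks the Event between chunks, cancellation is cooperative: the pipeline has to poll the flag at reasonable intervals, which is the usual trade-off for stopping a thread cleanly.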

Graham
