[mod_python] Background threads in mod_python

Mike Looijmans nlv11281 at natlab.research.philips.com
Tue Aug 1 01:33:08 EDT 2006


> There is a lot of need for this in an Ajax world. For example, Jetty 
> (http://www.mortbay.com/MB/log/gregw/?permalink=ScalingConnections.html), 
> Apache 2.2, and IIS (for a few years now) can do asynchronous processing. There 
> are two ways to do asynchronous processing. The first is to lock onto 
> the request and hold it and move it into a "secondary" processing area. 
> The second is to start a task and then ask if any data has been 
> generated. The running task will store in a cache that is then picked up 
> by another request at another time.

And you have to ask yourself whether a WEB application is really what you want. Ajax is a framework 
for building non-web applications in browsers.

>> When Apache is using multiple processes, it will terminate child 
>> processes for various reasons (for example, when 
>> max_requests_per_child has been reached). That will also terminate any 
>> thread you created in that process.
>>
> Yeah I was looking at this and it is pain. On Windows Apache does not do 
> this. They use only two processes and use threads within those processes.

Even more painful, there is no telling which process your next request will go to. So even if a 
thread you started survives, reaching it again from the client is practically impossible, because 
your later requests may end up in other processes.
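
To illustrate, here is a toy simulation in plain Python, not mod_python code. The names WorkerProcess and make_dispatcher are made up for this sketch, and the round-robin dispatch is an assumption chosen only to make the result predictable; Apache makes no such promise about which child gets a request.

```python
import itertools


class WorkerProcess:
    """Stand-in for one Apache child process with its own private memory."""

    def __init__(self, pid):
        self.pid = pid
        self.tasks = {}  # background tasks known only to this process

    def handle(self, action, task_id):
        if action == "start":
            self.tasks[task_id] = "running"
            return "started in process %d" % self.pid
        # Any other action is treated as a status query.
        if task_id in self.tasks:
            return "found in process %d" % self.pid
        return "unknown in process %d" % self.pid


def make_dispatcher(workers):
    """Hand each request to the next worker in turn (assumed policy)."""
    cycle = itertools.cycle(workers)

    def dispatch(action, task_id):
        return next(cycle).handle(action, task_id)

    return dispatch


workers = [WorkerProcess(pid) for pid in (1, 2, 3)]
dispatch = make_dispatcher(workers)

first = dispatch("start", "feed-42")    # lands in process 1
second = dispatch("status", "feed-42")  # lands in process 2, which never saw the task
```

The second request reaches a process whose task table is empty, so the task started by the first request looks as if it never existed.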

> I would not say "job". I would say long running tasks generating data. 
> For example I like to read real-time feeds, and want to generate the 
> data. But the client is in control of the task using parameters that are 
> sent to server.

So, why not just let the client wait for the data? Neither server nor client has any objection to 
keeping a connection open for an hour or longer. Some proxies will appreciate it if you send some 
data every now and then, since they may drop a connection that stays idle for too long.
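
A minimal sketch of that idea, assuming the data producer puts items on a queue.Queue and uses None to mark the end. The function name stream_with_keepalive, the filler bytes and the timeout value are all made up for this example. In a mod_python handler, the write argument could be a small function that calls req.write(chunk) and then req.flush().

```python
import queue


def stream_with_keepalive(source, write, timeout=30.0, keepalive=b" \n"):
    """Forward items from `source` to `write`, sending filler while idle.

    `source` is a queue.Queue fed by the data producer; None marks the end.
    Returns the number of real data items written.
    """
    sent = 0
    while True:
        try:
            item = source.get(timeout=timeout)
        except queue.Empty:
            # Nothing arrived in time: send a few bytes so the
            # connection does not look dead to anything in between.
            write(keepalive)
            continue
        if item is None:
            return sent
        write(item)
        sent += 1
```

The request stays inside one handler call for its whole lifetime, so it never has to find its way back to a particular process.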

>> A web service used within a web server is not redundant, it's just a 
>> way to delegate tasks to other machines or processes. This is 
>> typically done for security reasons.
>>
> I don't really buy this argument. Let's say I do what you recommend and 
> that is, call an XML-RPC service. Why do I need Apache in the first 
> place? While I might get security I don't get any added value. If I am 
> using web services then Apache is pretty well useless anyways because 
> most web services don't use HTTP security. Most web service 
> infrastructures use the WS-* specs, or some home-baked tokens that are 
> added to the XML package.

There's more to security than authentication.

In a simple order database, for example, one might want to restrict access so that customers can only 
view their own orders. At the DBMS level, this restriction is hard to enforce (many database systems 
cannot put access restrictions on individual rows), but with an in-between layer it is easy to implement.
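
A rough sketch of such a layer, using the standard sqlite3 module with an in-memory database. The orders table, its columns and the function name orders_for_customer are invented for this example. The customer_id is assumed to come from whatever authentication the layer has already done, never from the client's own query.

```python
import sqlite3


def orders_for_customer(conn, customer_id):
    """Return only the rows that belong to the given customer."""
    cur = conn.execute(
        "SELECT id, item FROM orders WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    )
    return cur.fetchall()


# Hypothetical sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
    [(1, "widget"), (2, "gadget"), (1, "sprocket")],
)
```

Callers of this layer never write SQL themselves, so the only rows they can see are the ones the function chooses to return.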

Further, if access to the DB is restricted so that only the "business layer" web service can reach 
it, an attacker cannot retrieve sensitive information directly from the database, even if the web 
server is compromised.


