[mod_python] Protecting Web apps from too many simultaneous clicks/Hacking

Fri May 14 13:39:48 EDT 2004

On Fri, 2004-05-14 at 04:16, SAiello at Jentoo.com wrote:
> I was curious for ideas on how to protect a mod_python web application from 
> someone submitting/requesting data very quickly repeatedly. An example, I am 

If you mean, 'how do I protect against someone maliciously trying to
overload my server' then the throttling/bandwidth limiting suggestions
already given are useful tools.  If you mean, 'how do I improve my
server's performance to handle high load' then read on.

> building an IMAP webmail application. Currently, if I click the view 'next 
> set of messages in email box' quickly over and over again, that seems to 
> spawn a bunch of apaches trying to service all those requests. One problem is 
> that I really don't want one user being able to make my app take up a lot of 
> CPU load by doing this. Another is that I am storing the current message 
> position in a session variable, by spawning a bunch of simultaneous requests 
> I seem to be able to keep clicking 'next' above the total number of messages.

Use caching.  If you've only just asked the IMAP server for the contents
of message #404, there's no good reason to ask it again.  You could
cache the messages, the message indexes, or even the entire output of a
given request.
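
As a rough illustration of the caching idea, here is a minimal sketch of
a time-limited cache.  The names MessageCache and fetch are made up for
this example; fetch stands in for whatever function really talks to the
IMAP server.  The code is written for a current Python 3 rather than the
Python 2.3 of this thread.

```python
import time


class MessageCache:
    """Tiny time-limited cache keyed by (mailbox, message_id).

    Hypothetical helper for illustration; not part of mod_python.
    """

    def __init__(self, fetch, ttl=30.0, clock=time.time):
        self._fetch = fetch    # callable that really asks the IMAP server
        self._ttl = ttl        # seconds an entry stays fresh
        self._clock = clock    # injectable so the cache is easy to test
        self._entries = {}     # key -> (expiry time, cached value)

    def get(self, mailbox, message_id):
        key = (mailbox, message_id)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]      # still fresh: no trip to the IMAP server
        value = self._fetch(mailbox, message_id)
        self._entries[key] = (now + self._ttl, value)
        return value
```

Bear in mind that a plain dictionary like this lives in one process.
Under a prefork Apache each child would have its own copy, so repeated
clicks only hit the cache when they land on the same child.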

Also, you probably shouldn't be storing the 'current message position'
in a session.  This implies that the user is only viewing one page at
a time, which in a lot of cases isn't true.  They might open a message
in a new window or tab, for instance, and have two messages open at
once.  Which one is 'current' in that situation?

A possibly better way would be to have the "Next" link, as generated in
response to a request to display a particular message, also include
information about which message should be considered the next message. 
For example, I would probably implement this as a method to display a
message by ID, and for each generated display, include a "Next" button
which generates a request to display message #(ID + 1).

If the functionality of "Next" means "Next Unread" or some such, I'd
probably generate a request to display the next unread message after
message #ID, so once again, the knowledge about the 'current' message is
tied to a particular display.
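
A minimal sketch of the "Next" link idea, assuming messages are numbered
1..total and that unread message IDs are available as a collection.  The
URL /webmail/view, the id parameter and both function names are invented
for the example.

```python
from urllib.parse import urlencode


def next_link(message_id, total, base_url="/webmail/view"):
    """URL for the message after message_id, or None at the last one."""
    if message_id >= total:
        return None            # nothing after the last message
    return base_url + "?" + urlencode({"id": message_id + 1})


def next_unread_link(message_id, unread_ids, base_url="/webmail/view"):
    """URL for the first unread message after message_id, or None."""
    later = [i for i in unread_ids if i > message_id]
    if not later:
        return None
    return base_url + "?" + urlencode({"id": min(later)})
```

Because the target ID is baked into each generated page, two windows
showing two different messages each get their own correct "Next", and
clicking the same link ten times just requests the same message ten
times.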

Another, more serious, problem is that you appear to have a race
condition.  One request reads the 'current' message ID, compares it to
the maximum, then increments the session value.  A second request does
the same.  If the second request preempts the first and does its own
"compare" between the first request's "compare" and "increment" steps,
both pass the check and both increment, pushing the value past the
maximum.

This is serious because, as I understand it, mod_python.Session does
session locking automatically.  In other words, there should already be
at most one request per session running at any one time.
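
To make the interleaving concrete, here it is played out by hand in one
process.  The dictionary is only a stand-in for the shared session, and
the numbers are made up.

```python
session = {"position": 9}   # stand-in for the shared session data
TOTAL = 10                  # number of messages in the mailbox

# Both requests run their "compare" step before either "increment".
a_ok = session["position"] < TOTAL   # request A: compare
b_ok = session["position"] < TOTAL   # request B: compare (preempts A)

if b_ok:
    session["position"] += 1         # request B: increment
if a_ok:
    session["position"] += 1         # request A: increment, too late

# session["position"] is now 11, one past the last message
```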

> A quick idea of mine to limit one simultaneous request per session, was at the 
> start of the request, create a session variable that would store the total 
> number of requests for that session. Then I could check the number of 
> requests, and if the variable is greater than 1, sleep until it is lower than 
> 1.

... so yes, the general idea is sound.  Your implementation is a little
flawed, however.  

> 1	sess=Session.Session(req, None, cookieSecret)
> 2	if not sess.has_key('REQUESTS'):
> 3		sess['REQUESTS']=1
> 4		sess.save()
> 5	else:
> 6		sess['REQUESTS']+=1
> 7		sess.save()
> 8		while sess['REQUESTS']>1:
> 9			sleep(1)
> 10	sess['REQUESTS']-=1
> 11	sess.save()

I've added line numbers to help the discussion.  So, you create a
session object at line 1.  This is when the locking should already have
occurred.  In lines 2-4, you introduce a race condition: if a second
process preempts your request after line 2, but before line 4, that
process will also get False from sess.has_key('REQUESTS').  This means
two separate processes will reach line 3 thinking they have exclusive
access to the session.  A similar race condition exists between lines 6
and 7.

More problematic, lines 8 and 9 loop until sess['REQUESTS'] <= 1. 
Unfortunately, you didn't refresh the session in that loop, so I would
expect any request entering that loop will never leave it.  You may need
a "sess.load()" in the loop.
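
A sketch of what a polling loop with a refresh might look like.  Here
load_count is a hypothetical callable that re-reads the shared counter
each time it is called (in your code, something like reloading the
session and returning sess['REQUESTS']).  I've also added a timeout,
which your version doesn't have, so a stuck counter can't hang the
request forever.

```python
import time


def wait_until_free(load_count, poll=0.01, timeout=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll load_count() until it reports at most 1, or give up.

    Returns True if the count dropped in time, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if load_count() <= 1:    # fresh read on every pass
            return True
        if clock() >= deadline:
            return False
        sleep(poll)
```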

Finally, at line 10 you decrement your own, local value, and save that
to the shared session.  This would immediately overwrite any other value
there.  Granted, if you had achieved a lock by this point, what's in
there would be what you expected to be in there.

For what it's worth, a quick test with Apache/2.0.47 (Debian GNU/Linux)
mod_python/3.1.3 Python/2.3.3 shows that "s = Session.Session(req, None,
'foobar')" does in fact do session locking.
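
For illustration only, here is the effect per-session locking has on the
compare-and-increment problem.  This is not how mod_python implements
it; a threading.Lock per session ID stands in for the real lock, and
with_session_lock is a name invented for the sketch.

```python
import threading
from collections import defaultdict

_session_locks = defaultdict(threading.Lock)  # one lock per session ID
_registry_lock = threading.Lock()             # guards the dict itself


def with_session_lock(session_id, handler):
    """Run handler() while holding the lock for session_id."""
    with _registry_lock:
        lock = _session_locks[session_id]
    with lock:
        # Only one handler per session gets here at a time, so a
        # compare-then-increment inside handler cannot be interleaved.
        return handler()
```

With the whole handler serialised like this, the REQUESTS counter adds
nothing: the check and the update of the position always happen as one
step.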

In the context of the original request, I'd start by reducing the
response time of each request before I started finding ways to deny
excessive requests. :)

-- 
bje