[mod_python] For what session locking is? Do i need it while using MySQL?

Jim Gallacher jpg at jgassociates.ca
Thu Aug 10 09:01:06 EDT 2006


Norman Tindall wrote:
> Hello,
>    Hmm. I can't quite grasp the session model in mod_python.
> Is it something like a pool of sessions or a shared object? Can potentially any thread
> access any session, and are sessions locked while a thread handles a
> request?

Basically it's a shared object. Depending on which session class you are
using, the actual object may be a dictionary in memory (MemorySession),
a row in a dbm file (DbmSession) or a file on the file system
(FileSession). A request for a particular session id (which may be in a
thread or a separate process, depending on your apache mpm), must
acquire a lock for that session before it can proceed. This ensures that
only one request can read or write data for that session at a time.
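
To make "a lock for that session" concrete, here is a minimal sketch. It
is not mod_python code: SessionLocks and lock_for are hypothetical names,
and threading.Lock only works between threads of one process, whereas the
real session lock also has to work between separate Apache processes. It
only shows the idea that requests for the same session id contend for one
lock, while requests for other session ids are not held up.

```python
import threading
from collections import defaultdict


class SessionLocks:
    """Hypothetical stand-in for per-session-id locking (not mod_python code)."""

    def __init__(self):
        self._guard = threading.Lock()             # protects the registry itself
        self._locks = defaultdict(threading.Lock)  # one lock per session id

    def lock_for(self, sid):
        # Same sid -> same lock object; different sids -> independent locks.
        with self._guard:
            return self._locks[sid]


locks = SessionLocks()

first = locks.lock_for("sid-1")
first.acquire()                      # the first request holds session sid-1

# A second request for the same session cannot get the lock right now ...
same_session_got_lock = locks.lock_for("sid-1").acquire(blocking=False)

# ... but a request for a different session is not affected.
other = locks.lock_for("sid-2")
other_session_got_lock = other.acquire(blocking=False)

other.release()
first.release()
```

In a real request the second caller would block instead of giving up; the
non-blocking acquire is used here only so the sketch can run in one thread.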

>    I am writing a simple MySQL session module,
> In MySQL I can set the SESS_ID column as UNIQUE and do something like this
> 
> try:
>          c.execute("insert into sessions ...")
> except MySQLdb.IntegrityError:
>        #  here i catch duplicate entries
> 
> Speed is a factor so.. what would be faster.. locking
> with _apache._global_lock(self._req.server, self._sid) OR
> MySQLdb.IntegrityError ??

My guess is acquiring the lock will be faster, but it doesn't really
matter as using MySQLdb.IntegrityError does not do what you want. It
only makes sure that you don't have duplicate session ids in your table.
It does *not* protect your session data from simultaneous access by
different requests.
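
A small sketch of that difference, using the standard library's sqlite3
as a stand-in for MySQL (an assumption made only so the example runs
without a database server; the table layout and the load/save helpers are
made up for illustration). The UNIQUE constraint rejects the duplicate
insert, but it raises nothing when two requests read, modify and write
the same row.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (sess_id TEXT UNIQUE, hits INTEGER)")
db.execute("INSERT INTO sessions VALUES ('abc', 0)")

# The UNIQUE constraint does its one job: a duplicate id is rejected.
try:
    db.execute("INSERT INTO sessions VALUES ('abc', 0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True


def load(sid):
    row = db.execute(
        "SELECT hits FROM sessions WHERE sess_id = ?", (sid,)).fetchone()
    return row[0]


def save(sid, hits):
    db.execute("UPDATE sessions SET hits = ? WHERE sess_id = ?", (hits, sid))


# Two requests for the same session, interleaved by hand.
hits_a = load("abc") + 1    # request A reads 0 and computes 1
hits_b = load("abc") + 1    # request B also reads 0 and computes 1
save("abc", hits_b)         # B saves 1
save("abc", hits_a)         # A saves 1, silently overwriting B's update

final_hits = load("abc")    # two requests were served, one was counted
```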

> documentation has only one row about locking
> "When locking is on, only one session object with a particular session id can be instantiated at a time."

Well, that's pretty much all there is to it. :)

> Could anyone give me a simple entity diagram showing the relations
> between Session, Apache threads and the random generators used in
> Session.py (they are also pooled for the numbers of MPM?)
> and also more detailed info on what happens when a session lock is acquired

Sorry, no pictures, but I can give you words.

Here is a use case that may help clarify what is going on. Each morning
I visit my favourite news site which can be a little slow at times. I
like to scan the headlines and when I find something interesting I open
that link in a separate browser tab. While that story is loading I
continue perusing the headlines, opening stories as I go. As a result I
might have 5 or 6 pages loading simultaneously.

Now let's suppose the news site wants to keep track of the number of
stories I've read in a simple hit counter, which is stored in a session
for my visit. The code snippet might look like this:

from mod_python import apache, Session

def handler(req):
    # lock=1 is the default; lock=0 turns session locking off
    session = Session.Session(req, lock=1)
    try:
        session['hits'] += 1
    except KeyError:
        session['hits'] = 1

    # ... generate the page, do some processing, whatever ...

    session.save()
    return apache.OK

First, consider the possible behaviour if we don't lock the session
(lock=0). The first page (A) is complicated and it takes a while to
reach the session.save() code, while the second page (B) is quite simple
and so returns quickly.

1. Request 'A' arrives and 'hits' is incremented to 1.
   'A' starts to render the page, but it's complicated and takes
   some time.
2. context switch
3. Request 'B' arrives and 'hits' is incremented to 1.
   'B' renders the page and runs session.save().
   The request is complete.
   The 'hits' stored in the session data store is 1.
4. context switch
5. 'A' finishes rendering the page, and runs session.save().
   The 'hits' stored in the session data store is still 1.

Oops. I've visited 2 pages, but the hit counter says I've only visited
1. This is called a race condition (whichever request saves first wins)
and is one of the things mutexes are meant to fix.

Now consider the above with lock=1 (the default for Session).

1. Request 'A' arrives and tries to acquire the session lock.
   It succeeds, so processing continues.
   'hits' is incremented to 1.
   'A' starts to render the page, but it's complicated and takes
   some time.
2. context switch
3. Request 'B' arrives and tries to acquire the session lock.
   It fails to get the lock, so it can't proceed (i.e., it blocks).
4. 'A' finishes rendering the page, and runs session.save().
   The request is complete and the session machinery automatically
   unlocks the session.
   The 'hits' stored in the session data store is 1.
5. context switch
6. 'B' now acquires the session lock.
   It renders the page and runs session.save().
   The request is complete and the session is unlocked.
   The 'hits' stored in the session data store is 2.

The saved number of hits is now 2, which is correct. Notice that I
didn't mention anything about mutexes. That's really an implementation
detail hidden from view in Session.py. We choose to use mutexes for
session locking, but there are other possible mechanisms. The point is
to avoid the race condition described above, which is handled for you.
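
The two walkthroughs can be replayed with plain Python threads. This is
only a simulation under stated assumptions: a dict stands in for the
session data store, a threading.Lock stands in for the session lock, and
run_two_requests, request_a and request_b are made-up names. Events force
the same ordering as in the numbered steps, so the outcome does not
depend on thread scheduling.

```python
import threading


def run_two_requests(use_lock):
    store = {}                        # stands in for the session data store
    session_lock = threading.Lock()   # stands in for the session lock
    a_loaded = threading.Event()      # set once A has loaded the session
    b_done = threading.Event()        # set once B has finished

    def load_and_count():
        session = dict(store)                          # private copy of the data
        session["hits"] = session.get("hits", 0) + 1   # the hit counter
        return session

    def request_a():
        if use_lock:
            session_lock.acquire()
        session = load_and_count()
        a_loaded.set()
        # The slow page. Without a lock, B runs to completion in this window.
        # With a lock, B is blocked, so A just waits briefly and carries on.
        b_done.wait(timeout=0.2 if use_lock else None)
        store.update(session)                          # session.save()
        if use_lock:
            session_lock.release()

    def request_b():
        a_loaded.wait()                                # B arrives after A started
        if use_lock:
            session_lock.acquire()                     # blocks until A is done
        session = load_and_count()
        store.update(session)                          # session.save()
        if use_lock:
            session_lock.release()
        b_done.set()

    threads = [threading.Thread(target=request_a),
               threading.Thread(target=request_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store["hits"]


hits_without_lock = run_two_requests(use_lock=False)
hits_with_lock = run_two_requests(use_lock=True)
```

Without the lock, B loads the session before A has saved, so both save a
count of 1. With the lock, B cannot load until A has saved and released.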

If you want to create your own session subclass, all you need to do is
override the following methods (none of which touch the locking mechanism).

  __init__()
  do_load()
  do_save()
  do_delete()
  do_cleanup()

do_cleanup will just register a callback function which does the actual
cleanup of expired sessions. If you are not careful you can introduce a
race condition there which could delete a valid session. It's not
difficult, but you do need to be careful. See the discussion in the
developers' archive that Graham mentioned.
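
As a rough sketch of what such a subclass looks like, here is one that
runs without mod_python or MySQL. Everything outside the do_* method
names is an assumption: StandInBaseSession is a made-up replacement for
the real base class, sqlite3 stands in for MySQL, and the table layout is
invented. Also, do_cleanup here runs the cleanup directly, where the real
one would register cleanup_expired as a callback. The cleanup deletes by
an expiry condition evaluated inside the DELETE itself, which is one way
to avoid removing a session that was saved again after a list of expired
ids had been collected.

```python
import pickle
import sqlite3
import time


class StandInBaseSession(dict):
    """Hypothetical stand-in for the real base class; it only calls the hooks.

    Locking would live at this level, which is why the subclass never sees it.
    """

    def __init__(self, sid, timeout=1800):
        dict.__init__(self)
        self._sid = sid
        self._timeout = timeout
        stored = self.do_load()
        if stored is not None:
            self.update(stored)

    def save(self):
        self.do_save(dict(self))

    def delete(self):
        self.do_delete()
        self.clear()


def cleanup_expired(db, timeout, now):
    # Expiry is checked by the DELETE itself, not against a list of ids
    # gathered earlier, so a freshly saved session is not removed.
    db.execute("DELETE FROM sessions WHERE accessed + ? < ?", (timeout, now))


class SqlSession(StandInBaseSession):
    def __init__(self, db, sid, timeout=1800):
        self._db = db
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS sessions "
            "(sess_id TEXT PRIMARY KEY, accessed REAL, data BLOB)")
        StandInBaseSession.__init__(self, sid, timeout)

    def do_load(self):
        row = self._db.execute(
            "SELECT data FROM sessions WHERE sess_id = ?",
            (self._sid,)).fetchone()
        return pickle.loads(row[0]) if row else None

    def do_save(self, data):
        self._db.execute(
            "INSERT OR REPLACE INTO sessions VALUES (?, ?, ?)",
            (self._sid, time.time(), pickle.dumps(data)))

    def do_delete(self):
        self._db.execute(
            "DELETE FROM sessions WHERE sess_id = ?", (self._sid,))

    def do_cleanup(self, now=None):
        # Real code would register cleanup_expired as a callback instead.
        cleanup_expired(self._db, self._timeout,
                        time.time() if now is None else now)


db = sqlite3.connect(":memory:")
session = SqlSession(db, "abc")
session["hits"] = 1
session.save()
reloaded = SqlSession(db, "abc")    # a later request loads the saved data
```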

> Sorry i am a newbie in multi-thread and this mutex crap :)
> Would be nice if anyone give me a link or a name of a good books in
> this theme.

Friend, thy name is Google. :)

Jim

