[mod_python] Custom handler thread safety

Sat Feb 5 14:03:45 EST 2005

Hello Graham,

All of the problems you describe and the scenarios you provided were 
right on target as to the kind of issues I am running into.  Here is the 
response to your question:

 > Anyway, confirm exactly what you meant by having thread locking going
 > into the handler and describe how you are managing the database
 > connections. Do you create a certain number of database connections at
 > startup, or on demand as required, but only up to a certain maximum?
 > Is your connection caching mechanism thread protected in any way?

The way I have set up the handler is as follows:

    def handler(req):
        _lock.acquire()
        try:
            # instantiate and call my driver class that processes the request
            pass
        finally:
            _lock.release()

You were exactly right about what I was experiencing in that each thread 
was waiting on another to release the lock before processing so the 
processing of the threads with 20 concurrent users was serialized.  
Taking that into account, I removed the locking from the handler and ran 
the 20 concurrent users test again and then I started running into 
problems with the MySQL server dropping connections due to there being 
too many simultaneous connections being made from my application.
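
To illustrate the serialization, here is a minimal sketch using plain
`threading` with no mod_python involved. The names `handle_request`,
`active`, `max_active` and `run` are hypothetical and are not taken from my
framework. It shows that with one global lock around the handler, no more
than one request is ever in progress at a time:

```python
import threading
import time

_lock = threading.Lock()          # the single lock wrapped around the handler
_state_lock = threading.Lock()    # protects the two counters below
active = 0                        # requests currently inside the handler
max_active = 0                    # highest concurrency observed so far

def handle_request():
    global active, max_active
    with _lock:                   # same effect as acquire()/try/finally/release()
        with _state_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.01)          # stand-in for the real request processing
        with _state_lock:
            active -= 1

def run(n_threads=20):
    threads = [threading.Thread(target=handle_request) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_active
```

Because every thread must hold `_lock` for the whole request, `run(20)`
never observes more than one active request, which matches the waiting I
saw in the benchmark.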

To answer your second question of how I am managing the database 
connections: it is very simple.  My framework was originally written 
by a friend of mine about 3 years ago to work only in CGI.  In the Data 
Access Layer, he has a class that manages connections to different 
sources: databases, servers, and payment processors.  For database 
connections, there is a dictionary that holds all the connections.  
Because it was originally written for CGI, he saved a connection to a 
database in the dictionary using the key "dbname:thread id".  This 
worked fine because the environment wasn't multithreaded, so there was 
always only one thread id for all requests.  All the db connections 
were created on demand and then cached, and all subsequent requests for 
the same db connection were served from the cache.  It works as follows:

    class DALResources:

        def getDB(self, db_name):
            # thread_id is the id of the thread making the call
            resource_name = "%s:%s" % (db_name, thread_id)
            db_conn = self.__resource_dict.get(resource_name, None)

            if db_conn is None:
                # create a db connection
                # cache it in the __resource_dict
                pass

            return db_conn

So for each db, there is only one connection cached and that is the one 
that is shared.  The thread_id caused a problem in the multithreaded 
environment because a large number of threads can be processing 
requests, each getting its own cached connection, and as a result too 
many connections were made.  I tried to work around this problem by 
removing the thread_id from the resource_name; however, I still ran 
into MySQL connection problems.
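
One way to cap the number of connections is the Queue-based pool Graham
suggests in the message quoted below. This is only a minimal sketch,
assuming Python 3, where the module is named `queue` (in Python 2 it is
`Queue`). `ConnectionPool` and the `connect` callable are hypothetical
names; `connect` stands in for whatever creates a real MySQL connection.

```python
import queue

class ConnectionPool:
    """Fixed-size pool: at most `size` connections ever exist."""

    def __init__(self, connect, size):
        self._queue = queue.Queue()
        for _ in range(size):
            self._queue.put(connect())   # create all connections up front

    def acquire(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        return self._queue.get(timeout=timeout)

    def release(self, conn):
        # Hand the connection back so a waiting thread can use it.
        self._queue.put(conn)
```

A handler would call `acquire()` at the start of a request and `release()`
in a `finally` block, so the number of MySQL connections per process never
exceeds `size`. As Graham notes, a real pool would need to be more robust
than this, for example by replacing connections the server has dropped.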

My caching mechanism is not thread protected at all.  The DALResources 
class is created as a singleton and imported in my handler so that it 
is available to all child threads in a process.  One thing I did 
yesterday was to try to make my DB Connection object thread safe by 
locking the connection where it queries the db.  That actually resulted 
in no more connection problems to the db.
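
The locking around the query is roughly of the following shape. This is a
simplified sketch and not my actual class: `LockedConnection` is a
hypothetical name, and the wrapped object is assumed to expose a
`query(sql)` method.

```python
import threading

class LockedConnection:
    """Wraps a connection so only one thread can run a query at a time."""

    def __init__(self, conn):
        self._conn = conn                 # the underlying, non-thread-safe connection
        self._lock = threading.Lock()     # one lock per wrapped connection

    def query(self, sql):
        # Hold the lock for the whole round trip to the database.
        with self._lock:
            return self._conn.query(sql)
```

The trade-off is that all queries on a shared connection are serialized,
so threads still wait on each other, only for the duration of a query
rather than for the whole request.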

However, I still run into various problems, and I am very sure that is 
because there is no locking mechanism for any of the objects I am 
sharing by putting them in shared memory.  What would you suggest I do 
in terms of locking objects?  Should I make the objects I am storing in 
the cache thread safe, or should I create one caching object, store all 
my shared objects in that, and add locking to the get and set methods?

Thanks again for all your help and I hope I explained all the items you 
asked for.

Hozi

Graham Dumpleton wrote:

> I'll try and get back to your original problem. I probably digressed
> and certainly said some things already that most likely aren't relevant
> at all. I should have perhaps read the email properly. My excuse is that
> I was having a busy day that day. :-)
>
> On 05/02/2005, at 6:56 AM, Huzaifa Tapal wrote:
>
>>> Is the Python database connection object internally thread safe?
>>
>>
>> -- no, it's not.
>
>
> I should have phrased this question a bit better. Does the database
> interface code support having multiple connection objects active at
> the same time, where each may be held by a distinct thread with that
> thread doing whatever it wants with its own connection object?
>
> Thus, a single database connection object doesn't necessarily have
> to be thread safe in itself, as long as there can be multiple connection
> objects each being used at the same time by different threads.
>
> Anyway, in your original email you said:
>
>   Just to be safe, I implemented thread locking into the handler
>   before any request is processed.
>
> That to me says what you did was to make sure only one request at a
> time could actually do anything. Thus all requests were serialised.
> If this was the case, you wouldn't have had any thread problems because
> there wouldn't have actually been multiple requests active; a later
> request would have sat there until the previous one finished.
>
> If this is true, using a multithreaded MPM would also have just made
> things worse. You would have been better off using Apache in "prefork"
> mode as then each request in each process could have at least run in
> parallel.
>
> Next you said:
>
>   We are gaining huge performance increases by caching our template
>   objects and db connection objects.
>
> Which is logical, as you have avoided the startup cost with creating
> a database connection for each request, as well as the cost of loading
> a template on every request. This would be true whether or not you are
> using threads.
>
> Your next comment was:
>
>   The problem I am running into is that if I run through the application,
>   each request takes on average 300 ms to process.  However, when we
>   benchmark with 20 concurrent users, the average goes up to around
>   2200 ms.  I am very sure that this is due to a thread locking
>   shared objects in memory which results in another thread waiting for
>   the lock to be released.
>
> If each thread was trying to acquire the same lock before going into a
> handler and only releasing it when exiting the handler, thus serialising
> requests, what you are seeing would be expected. In short you were
> simply overloading your server's ability to respond quickly enough. Add
> even more concurrent users and the average would likely keep growing.
>
> Finally you said:
>
>   If I take the thread locking mechanism out then we run into problems
>   with there being too many connections being made to the MySQL db if
>   the cached connection is being used and then the db starts dropping
>   connections.
>
> If there was indeed a lock around any handler call and you took it out,
> you would at least still need to thread protect your database connection
> pool/cache.
>
> You might need to explain how you manage your database connections.
>
> I haven't done database connection pooling in mod_python when using
> threads yet but there are others here who have and may suggest the
> best ways of doing it.
>
> To me the simplest way would be to create a set of database connection
> objects at startup and place these in a Queue.Queue object. As each
> request comes in, it can get an available database connection off the
> queue, use it then put it back. In practice, it probably needs to be a
> bit more robust than that.
>
> Anyway, confirm exactly what you meant by having thread locking going
> into the handler and describe how you are managing the database
> connections. Do you create a certain number of database connections at
> startup, or on demand as required, but only up to a certain maximum?
> Is your connection caching mechanism thread protected in any way?
>
> Sorry again for getting off the track. :-)
>
> Graham
>
>