[mod_python] Custom handler thread safety

Sat Feb 5 18:04:00 EST 2005

On 06/02/2005, at 6:03 AM, Huzaifa Tapal wrote:

>  So for each db, there is only one connection cached and that is the 
> one that is shared.  The thread_id caused a problem in the 
> multithreaded environment because there could be a huge number of 
> threads that are processing requests and as a result too many 
> connections were made.  I tried to go around this problem by removing 
> the thread_id from the resource_name, however, I still started running 
> into MySQL connection problems.

If you remove the thread ID, then it looks like you have one connection
object per database. Thus if multiple threads want to use the same
database, you have contention again.

>  My caching mechanism is not thread protected at all.  Pretty much the 
> DALResources class is created as a singleton and imported in my 
> handler so that is available for all child threads in a process.  One 
> thing I did yesterday was to try to make my DB Connection object 
> thread safe by locking the connection where it queries the db.  That 
> actually, resulted in no more connection problems to the db.

Having a lock on the database connection object would avoid the
contention above, but causes all requests against a specific database
to be serialised. That is presuming the lock is on each instance of the
database connection and not one lock across all database connection
objects.
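
As a rough sketch of what a per-instance lock might look like (the
LockedConnection and FakeConnection names and the query() method are
made up for illustration, they are not taken from your code or from any
MySQL driver):

```python
import threading


class LockedConnection:
    """Hypothetical wrapper holding one lock per wrapped connection."""

    def __init__(self, raw_connection):
        self._raw = raw_connection
        # Created per instance, so two databases never share a lock.
        self._lock = threading.Lock()

    def query(self, sql):
        # Only one thread at a time may use this particular connection.
        with self._lock:
            return self._raw.query(sql)


class FakeConnection:
    """Stand-in for a real driver connection, for illustration only."""

    def __init__(self, name):
        self.name = name

    def query(self, sql):
        return (self.name, sql)


orders = LockedConnection(FakeConnection("orders"))
users = LockedConnection(FakeConnection("users"))
```

Queries against "orders" are serialised with each other, but never wait
on a query against "users". If the lock were a class attribute instead,
every database would queue behind the same lock.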

Serialisation on just the database access is at least not as bad as it
being on a request as a whole, but still not ideal.

>  However, I still run into various problems, and I am very sure that 
> is because there is no locking mechanism available for all the objects 
> I am sharing by putting it in shared memory.   What would you suggest 
> I do in terms of locking objects?  Should I make the objects I am 
> storing in the cache thread safe or should I create one Caching object 
> and store all my shared objects in that and add locking to the get and 
> set methods?

This is where it gets harder for me to help. Knowing what is best
generally entails knowing quite a bit about the design of the
application. There are also various tradeoffs between making locking
too fine grained and it being at too high a level. Have it too fine
grained and performance can drop, plus you might not be adequately
locking around combined updates across multiple data structures. Have
it too high and you risk losing the benefits of what threading gives
you in the first place as things become serialised again.

It sounds, though, as if this is the area you need to concentrate on at
the moment. The database connection pooling could be improved, but that
will not mean anything if the rest of the application stuffs up when it
is accessing shared data.

Probably the first thing to look at is what distinct high level
functions your code provides. Having worked that out, identify what
shared resources each needs to access/update. Then look at a scheme
whereby on entry to each high level function, it locks just those
resources it is going to use.
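
To make that a bit more concrete, here is a minimal sketch. The two
resources (a session table and a template cache) and both function
names are invented for the example, since I don't know what your
application actually shares:

```python
import threading

# Hypothetical shared resources, each guarded by its own lock.
session_lock = threading.Lock()
sessions = {}

template_lock = threading.Lock()
templates = {}


def record_login(user):
    # Touches only the session table, so it takes only that lock.
    with session_lock:
        sessions[user] = sessions.get(user, 0) + 1
        return sessions[user]


def load_template(name):
    # Touches only the template cache, so it never blocks record_login().
    with template_lock:
        if name not in templates:
            templates[name] = "<compiled %s>" % name
        return templates[name]
```

Each function takes its lock on entry and holds it for the whole
read-modify-write, so the update is atomic with respect to other
threads, while requests that only load templates are not held up by
requests that only record logins.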

This is a bit better than arbitrarily locking everything before
entering any request, as the latter means that you lock stuff you
aren't necessarily using and thus could be locking out some other
request that only needs to access the data you aren't going to use.

This is where the balancing act comes in, in terms of making things
too fine grained or not. Try to avoid having locks on every little
resource. If there is a group of resources which are always used
together, have just one lock that covers the lot. Also watch out for
deadlock situations where two different functions access two resources
in the opposite order to each other.
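
One common way to avoid that kind of deadlock is to agree on a single
global order for the locks and always acquire them in that order. A
sketch of the idea, with hypothetical names throughout (hold(),
LOCK_ORDER and the cache/stats resources are not from your code):

```python
import threading
from contextlib import contextmanager

cache_lock = threading.Lock()
stats_lock = threading.Lock()

# Hypothetical global order: whoever needs several locks takes them
# in this order, whatever order the caller happens to name them in.
LOCK_ORDER = [cache_lock, stats_lock]


@contextmanager
def hold(*locks):
    ordered = sorted(locks, key=LOCK_ORDER.index)
    for lock in ordered:
        lock.acquire()
    try:
        yield
    finally:
        # Release in the reverse of the acquisition order.
        for lock in reversed(ordered):
            lock.release()


cache = {}
stats = {"hits": 0}


def lookup(key):
    # Names the locks "backwards", but hold() still takes cache_lock first.
    with hold(stats_lock, cache_lock):
        stats["hits"] += 1
        return cache.get(key)


def store(key, value):
    with hold(cache_lock, stats_lock):
        cache[key] = value
        stats["hits"] = 0
```

Because both functions end up taking cache_lock before stats_lock, one
thread can never hold the first while waiting on a thread that holds
the second. If the two structures really are always used together, a
single lock covering both would be simpler still.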

It may take a while to get just right if you aren't too familiar with
thread programming and problems associated with locking, but play with
it and see how you go.

Graham