[mod_python] Very odd behavior using mysqldb

Eric Brunson brunson at brunson.com
Fri Sep 15 14:41:36 EDT 2006

I've been working on a project using mod_python to access a mysql 
database using the MySQLdb API.  I know mod_python is still considered 
beta under Apache 2, so I thought I'd bring an issue to the attention of 
the list.

I was seeing some very bizarre results when generating dynamic content 
from the database, which I first attributed to PSP caching, but then 
investigated further when I switched to using the publisher handler.  It 
may be the way I've set up my modules and database connections, but I 
haven't convinced myself that I really understand the problem and until 
I do, I'm not going to be happy about it.  I have found a workaround 
that I can live with, but I don't really like it that much because it 
requires constantly connecting to and disconnection from the database.

The problem was simply getting old, wrong results interleaved with new, 
correct results after updating the database.  Simply put, I'm displaying 
data from a table.  My index.py imports a module for generating HTML 
which, in turn, includes a module for doing database access.  The odd 
thing about how I set up the database access, and I think it's pertinent 
to the problem, was that I automatically connect to the database when I 
import the database module.  Like this:

----- database.py -----
import MySQLdb
from mod_python import apache

def Connect():
    return MySQLdb.connect( 'mysql://me:[email protected]/mydatabase' )

dbc = Connect()

def someExampleAccessFunction():
    global dbc
    curs = dbc.cursor()
    curs.execute( "select count(0) from sometable" )
    result = curs.fetchall()[0][0]
    apache.log_error( "result was %s" % result )
    return result

----- end database.py -----

Let's disregard whether that's good programming practice or not, the 
fact is, I could query that value several times by doing a page reload, 
then delete a record from the database and commit the change, then 
subsequent page loads would interleave old results with new results, 
seemingly cyclically.

The key to my confusion is that the error logs for apache would show a 
message for every page reload, and the result from the query would agree 
with whatever data was displayed on the page, correct or incorrect.  
Meanwhile a direct query to the database from the command line would 
always return the correct result from the database.

My best guess so far is that multiple threads in apache get multiple 
copies of mod_python, each with it's own copy of the imported database 
module in memory and each module maintaining a different database 
connection that is somehow caching results.  I've never seen this 
behavior in any application until I tried this under mod_python.  I was 
able to force the correct results to always be returned by getting rid 
of the module variable, dbc, and moving the mysqldb.connect() call to 
inside each function call, opening the database connection, executing my 
sql, then closing the database connection.  This is less than optimal, 
since I may make several database queries during each page load, so 
later today I'll try making the database connection in the index.py, 
then passing the connection handle to each of the calls that need it.  
Also a little less than optimal, but tolerable.

Does anyone have any insight into this behavior?  It's more of an 
academic question at this point, but I wonder if it may lead to an 
improvement if we were able to identify the problem and implement a 
transparent fix.  Also, feel free to let me know if I'm completely 
offbase with my theory of what's going on under the hood.


More information about the Mod_python mailing list