Martijn Moeling
martijn at xs4us.nu
Wed Jul 19 09:53:29 EDT 2006
Ok, some answers came in, but nobody seems to understand the first part. I guess you guys understand that I am trying to optimize for performance. I configured Apache in such a manner that every request ends up in the same handler, so I have one Python module and one handler for a number of different websites/domains. I have more, but I'll limit it to two: zoekveilig.nl and mkbok.nl. What I do (not real code, just "functional description" code):

    def handler(req):
        if req.host.find("mkbok.nl") > -1:
            # the host part could be XXX.mkbok.nl
            # the host part could be xxx.yyy.mkbok.nl
            import mkbOK as mkbOK
            x = mkbOK.init()           # create the class instance
            page = x.start(req)
        if req.host == "www.zoekveilig.nl":
            import zoekveilig as zoekveilig
            x = zoekveilig.init()      # create the class instance
            page = x.start(req)
        .....
        req.write(page)
        return apache.OK

In the real world it is a bit more complicated, since I do not configure the sites in code, but the above does indicate the workings. Next, all content other than the default page (index.html-alike) for that website (a bit confusing on mkbOK, since the host part is dynamic) is, as I said before, derived from the host part for further processing.

If mod_python is running in PythonInterpPerDirectory mode, one interpreter is created, since all my content for mkbOK resides in / -- in total over 14,000 different pages. And since we have over 10,000 page views per day and aim for 100,000+ per day at the end of the year, I am preparing for a second server (which my system can handle). If mod_python is running in PythonInterpPerDirective mode, I can end up with god knows how many interpreters. Both of the above seem wrong for best performance, but I'm puzzled how to optimize. Basically that is the question...

The register_cleanup is clear now. Since my system creates the database connection in the class (the init call creates the class instance), I might have to alter that though.

The system goes from normal CPU utilization to 100% within a few microseconds, and it happens now and then: sometimes within a few hours after a reboot, sometimes it runs for weeks without trouble. The number of pages does not seem to matter, and it mostly happens when it is NOT busy. I tried multiple cron jobs to investigate, but even cron slows down so much that a "service httpd restart" and/or a "service mysqld restart" take hours to complete. I did something like this as a shell script (runs every 5 minutes):

    #!/bin/sh
    # find the mysqld pid, run top once in batch mode, and check
    # whether mysqld is using more than 50% CPU
    MYSQLPID=`pidof mysqld`
    CPU=`top -b -n 1 -p $MYSQLPID | awk -v pid=$MYSQLPID '$1 == pid { print int($9) }'`
    [ -z "$CPU" ] && CPU=0
    if [ $CPU -gt 50 ]; then
        date >> /var/log/watchdog.log
        service httpd stop
        date >> /var/log/watchdog.log
        service mysqld restart
        date >> /var/log/watchdog.log
        service httpd start
        date >> /var/log/watchdog.log
    fi

In fact (but keep in mind I have had no interactive access) I think mysql stops responding at all, even to signals. I even tried "nice" in the hope that mysql could not take 100%, but that was not the case, and it slowed down the page-building process (not surprised, haha). Even installing a second CPU did not help.

The even more stupid thing is that this behavior does not happen on a PIII 1 GHz with an exact copy of the HDD (dd if=/dev/hda of=/dev/hdb). Since the CPU in our production machine is 64-bit I suspected that, and built apache, mod_python and python all from scratch... no luck. Different mysql versions did not matter either.
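(On the interpreter question above: if the goal is a single shared interpreter no matter which directory or host a request maps to, mod_python's PythonInterpreter directive forces all matching requests into one named interpreter. A minimal, untested sketch -- the path, handler module and interpreter name are made up:)

    <Directory "/var/www">
        SetHandler mod_python
        PythonHandler myhandler
        # force one named sub-interpreter instead of one per
        # directory (PythonInterpPerDirectory) or one per
        # directive (PythonInterpPerDirective)
        PythonInterpreter shared_interpreter
    </Directory>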
The oddest thing is that after an update of my Python code on the server (a new release of my system) it takes 1 or 2 days before it happens, then it takes say 4 or 5 days, and next it runs OK for weeks. And YES, we even replaced the entire machine with a different one (our server is a Dell 1U rack server, I don't know the exact type, sorry).

Any ideas? I am an experienced systems manager and programmer, but I just hope I am not the only one with this oddity.

Martijn

-----Original Message-----
From: Jim Gallacher [mailto:jpg at jgassociates.ca]
Sent: Wednesday, July 19, 2006 14:53
To: Martijn Moeling
Cc: mod_python at modpython.org
Subject: Re: [mod_python] modpython, mysqldb best practice

Martijn Moeling wrote:
> Hi all,
>
> I started the development of a specific kind of CMS over 2 years ago,
> and due to sparse documentation I am still puzzled about a few things.
>
> Basically it consists of one .py and a MySQL database with all the data
> and templates; ALL pages are generated on the fly.
>
> First of all I am confused about PythonInterpPerDirectory and
> PythonInterpPerDirective in the way I use mod_python.
>
> My apache has no configured virtual hosts, since my CMS can handle this
> on its own by looking at req.host.
>
> On one site which is running on my system (http://www.mkbok.nl) we use
> different subdomains, so basically the page to be built is derived from
> req.host (e.g. xxx.yyy.mkbok.nl), where xxx and yyy are variable.
>
> This is done by DNS records like * A ip1.ip2.ip3.ip4

You don't actually state what the problem is here. ;)

> My next problem seems to be mysql and MySQLdb.
>
> Since I do not know which website is requested (multiple are running on
> that server) I open a database connection, do my stuff and close the
> connection.
>
> Again the database selection comes from req.host, and here the
> domain name is used for database selection.
>
> The system runs extremely well, but once in a while the webserver
> becomes so busy that it does not respond to page requests anymore.
>
> We suspect that mysql is the problem here, since the only thing we can
> see is mysql consuming more and more swap space, and at some point it
> runs out of resources and starts looping. At that point the system
> (Linux) keeps running, but with 100% CPU utilization, and we are unable
> to log in and investigate.
>
> So logging in to the UPS remotely and powering down the system by
> virtually unplugging the cable is the only (and BAD) solution.

Ouch. Maybe you can run a cron job every 5 minutes to check the load and
try to catch the problem before you hit 100%? I'm not suggesting this is
a permanent solution, just do it until you can track down the cause.

Is there a chance that mysql is hitting its connection limit? (Although
I'm not sure if that would cause the behaviour you describe.)

> So what is best practice when you have to connect to mysql with MySQLdb
> in a mod_python environment, keeping in mind that the database
> connection has to be built every time a visitor requests a page? Think
> in terms of a "globally" available db or db.cursor connection.

I don't think the performance penalty for creating a connection to mysql
is too great - at least compared to some other databases. You might want
to google for more information.

> Since global variables are troublesome in the .py containing the
> handler, I use a class from which an instance is created every time a
> client connects, and the DB connection is global to that class. Is that
> wrong?

This looks OK.
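(For what it's worth, a minimal sketch of the pattern described above -- one class instance per request, with the connection held on the instance. The class name matches the init call in the handler earlier in the thread; the credentials and the pages table are made up, and in the real system the database name is derived from req.host:)

    import MySQLdb

    class init:
        def __init__(self):
            # made-up credentials; really selected from req.host
            self.db = MySQLdb.connect(db='mkbok', user='blah', passwd='blah')

        def start(self, req):
            cursor = self.db.cursor()
            # made-up schema: look the page up by the request's host part
            cursor.execute("SELECT body FROM pages WHERE host = %s", (req.host,))
            row = cursor.fetchone()
            cursor.close()
            if row:
                return row[0]
            return "page not found"

        def close(self):
            self.db.close()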
> What happens if mod_python finds an error before my MySQLdb connection
> is closed? (Not that this happens a lot, but it does happen, sorry.)

It depends on how you handle the exception. This is why you should close
the connection in a registered cleanup function, which will always run.

> Also, I do not understand the req.register_cleanup() method. What is
> cleaned up and what not?

Whatever function you register is run during the cleanup phase (unless
mod_python segfaults - but then you've got other problems). The cleanup
phase occurs after the response has been sent, and anything registered is
guaranteed to run, regardless of what happens in prior phases. Typical
usage looks like this:

    def handler(req):
        conn = MySQLdb.connect(db='blah', user='blah', passwd='blah')
        req.register_cleanup(db_cleanup, conn)
        ... do your request handling ...
        return apache.OK

    def db_cleanup(connection):
        connection.close()

Jim
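(Tying the two together: the same cleanup guarantee can cover a connection opened inside the class instance, as in the handler earlier in the thread. A sketch -- the close() method on the class is assumed:)

    from mod_python import apache
    import mkbOK

    def handler(req):
        x = mkbOK.init()               # opens the DB connection in the class
        req.register_cleanup(x.close)  # runs even if start() raises
        req.write(x.start(req))
        return apache.OK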