Graham Dumpleton
graham.dumpleton at gmail.com
Wed Aug 6 19:23:08 EDT 2008
2008/8/7 David Cardozo <david.cardozo at apioutsourcing.com>:
> I apologize for resurrecting this email thread, but I thought I would post
> the "what worked for me" solution in case someone else ever comes across it.
>
> For a quick history: when Mike Robokoff worked on this issue back in October
> (we both worked together at the same company), we were able to work around
> it by making changes to our python scripts to minimize the time spent
> holding a lock to the Session.dbm file. This seemed to work fine for a
> while. However, as our user load level increased, this problem started
> manifesting itself again. I dedicated myself to this issue during this past
> week and finally was able to narrow it down to the fix that solves it...
> Hopefully once and for all.
>
> Here is my analysis of this issue; please correct me if you think something
> doesn't sound right:
>
> The root cause of this problem was the use of a bad mutex implementation by
> mod_python in Solaris.

No. The mod_python package just uses what APR has selected as the
default. If there is a problem with fcntl on Solaris then you need to
be asking the APR folks why they are defaulting to using that over
something that works properly. So, not strictly a mod_python problem,
and the fact that you also had to set SSLMutex and AcceptMutex
reinforces that.

Note that --enable-nonportable-atomics wouldn't make a difference, as
that likely only applies to in-process replacement for certain uses of
a local thread mutex. A global mutex is a lock that works across
processes, so no machine-code-level thing could in itself work as far
as I can see.

As far as mod_python goes, the only change I can see you might benefit
from is:

  http://issues.apache.org/jira/browse/MODPYTHON-202

That is, mod_python providing a directive akin to AcceptMutex to
override what type of mutex it uses. That way you wouldn't have to
change the code, but could configure it instead.

BTW, you don't have your Apache installation, including the directory
where the locks are, being served off an NFS partition, do you? File
based locking used to at least have some issues with NFS. This was
some time back though in Solaris, and I presume they would have
addressed it by now.

Graham

> Mod_python relies on operating system mutexes to lock file system resources;
> these mutexes are provided by the APR library (Apache Portable Runtime) by
> invocation of the following method from mod_python.c:
>
> apr_global_mutex_create(&mutex[n], fname, APR_LOCK_DEFAULT, p);
>
> The above call defaults to use the operating system default mutex
> implementation (fcntl) which, in my opinion, does not work properly in an
> Apache worker MPM model under any type of load.
>
> By modifying the above call to:
>
> apr_global_mutex_create(&mutex[n], fname, APR_LOCK_POSIXSEM, p);
>
> which forces the use of the POSIX semaphores implementation of mutexes by
> mod_python, the problem was minimized to just a few instances of the
> "mutex" error under a moderate load: 400 concurrent threads.
>
> The final change to make the problem go away completely was to modify the
> AcceptMutex and SSLMutex properties in the httpd.conf file to "posixsem".
> This, together with the mod_python change, made the "mutex" error go away
> even for 6400 concurrent threads.
>
> I still don't know why the changes to AcceptMutex and SSLMutex are required
> since our Apache installs were compiled with the
> "--enable-nonportable-atomics" which, according to the Apache documentation
> ( http://httpd.apache.org/docs/2.2/misc/perf-tuning.html#compiletime):
>
> "Solaris on SPARC
> By default, APR uses mutex-based atomics on Solaris/SPARC. If you configure
> with --enable-nonportable-atomics, however, APR generates code that uses a
> SPARC v8plus opcode for fast hardware compare-and-swap.
> If you configure Apache with this option, the atomic operations will be
> more efficient (allowing for lower CPU utilization and higher concurrency),
> but the resulting executable will run only on UltraSPARC chips."
>
> Maybe, even though we compile with that option, something in the OS, Apache
> or APR, still doesn't allow us to take advantage of atomic operations. This
> is pure speculation.
>
> This is the email thread that pointed me in the right direction:
> http://www.modpython.org/pipermail/mod_python/2006-November/022538.html
> I'm not using the ITK MPM, just straight worker MPM, but that change did
> the trick for me.
>
> Regards,
>
> David.
>
> -------------- Last message in email thread ---------------------------
>
> [mod_python] ValueError: Failed to acquire global mutex lock
> Graham Dumpleton graham.dumpleton at gmail.com
> Wed Oct 24 23:04:05 EDT 2007
>
> Going by errors posted by someone else in another thread, it seems that if
> the semaphores were exhausted, the error would occur at mod_python startup,
> not when locking the mutex.
>
> I'll have to go back and read all your emails again. You said you were
> using 'worker' MPM now, didn't you? You're not using some strange MPM
> like perchild or ITK-MPM, are you? I know that these cause problems for
> these mutexes in mod_python because of how different processes wanting
> to lock the mutex run as different users.
>
> Graham
>
> On 24/10/2007, Michael Robokoff <mike.robokoff at apioutsourcing.com> wrote:
>> # ipcs
>> IPC status from <running system> as of Wed Oct 24 06:25:17 CDT 2007
>> T  ID  KEY  MODE         OWNER  GROUP
>> Message Queues:
>> Shared Memory:
>> Semaphores:
>> s  2   0    --ra-------  api    other
>>
>> Not sure how to read that.
>>
>> If the problem is the semaphores, shouldn't I be able to use the following
>> set semsys:seminfo_semmni=2048
>> set semsys:seminfo_semmns=2048
>> set semsys:seminfo_semmnu=1024
>> set semsys:seminfo_semmsl=300
>> set semsys:seminfo_semopm=128
>> set semsys:seminfo_semume=64
>>
>> to make more semaphores available?
>>
>> I tried that but it didn't change anything, do you think that was not
>> enough?
>>
>>
>> --Mike
>>
>>
>> -----Original Message-----
>> From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
>> Sent: Tuesday, October 23, 2007 7:08 PM
>> To: Michael Robokoff
>> Cc: mod_python
>> Subject: Re: [mod_python] ValueError: Failed to acquire global mutex lock
>>
>> If you run 'ipcs' what is the output? Something must be using all the
>> semaphores, can't be anything else.
>>
>> Graham
>>
>> On 24/10/2007, Michael Robokoff <mike.robokoff at apioutsourcing.com> wrote:
>> > Ok the directive did take effect as the log entry below shows:
>> >
>> > [Tue Oct 23 07:20:32 2007] [notice] mod_python: Creating 4 session mutexes based on 6 max processes and 25 max threads.
>> > [Tue Oct 23 07:20:32 2007] [notice] mod_python: using mutex_directory /tmp
>> >
>> >
>> > Still see this however:
>> >
>> > ValueError: Failed to acquire global mutex lock
>> >
>> > I will try recompiling with the option you mentioned and see what
>> > happens.
>> >
>> >
>> > --Mike
>> >
>> >
>> > -----Original Message-----
>> > From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
>> > Sent: Tuesday, October 23, 2007 6:40 AM
>> > To: Michael Robokoff
>> > Cc: mod_python
>> > Subject: Re: [mod_python] ValueError: Failed to acquire global mutex lock
>> >
>> > On 23/10/2007, Michael Robokoff <mike.robokoff at apioutsourcing.com> wrote:
>> > > [Thu Oct 18 07:37:38 2007] [notice] mod_python: Creating 8 session mutexes based on 6 max processes and 25 max threads.
>> >
>> > BTW, what did you end up setting mod_python.mutex_locks to?
>> > If you used 4 like I said, then it can't have been in the correct part
>> > of the Apache configuration, outside of all VirtualHost, as the error
>> > log still shows '8 session mutexes'.
>> >
>> > If you can't get this to work, you might rebuild mod_python and
>> > specify the --with-max-locks=4 option to configure to force a lower
>> > value to be compiled in.
>> >
>> > Graham
>
> _______________________________________________
> Mod_python mailing list
> Mod_python at modpython.org
> http://mailman.modpython.org/mailman/listinfo/mod_python
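
The workaround David describes early in the thread, minimizing the time
spent holding the lock on the session file, can be sketched roughly as
below. This is only an illustration in Python 3 on a POSIX system, not
the scripts that were actually changed: the helper names file_lock and
update_session, the JSON file standing in for Session.dbm, and the
separate lock file are all assumptions made for the example. It also
uses a plain fcntl lock rather than the APR global mutex that
mod_python itself relies on.

```python
import fcntl
import json
import os
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path):
    """Hold an exclusive fcntl lock on lock_path for the duration of the block.

    Hypothetical helper. Note that fcntl locks are owned by the process, so
    this excludes other processes but not other threads in the same process.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            yield
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def update_session(store_path, lock_path, key, compute):
    """Store compute() under key, holding the lock only for the file update."""
    # Expensive work happens before the lock is taken.
    value = compute()

    # A separate lock file is used because fcntl locks are dropped as soon
    # as the process closes any descriptor for the locked file.
    with file_lock(lock_path):
        # Critical section: just the read-modify-write of the store.
        try:
            with open(store_path) as f:
                data = json.load(f)
        except FileNotFoundError:
            data = {}
        data[key] = value
        with open(store_path, "w") as f:
            json.dump(data, f)
    return value
```

Calling update_session("sessions.json", "sessions.lock", "user42", build_value)
with some hypothetical build_value function would run build_value()
first and take the lock only around the short file update. As the
thread shows, this reduced the errors for a while but did not remove
them under higher load.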