[mod_python] ValueError: Failed to acquire global mutex lock

Wed Aug 6 23:52:30 EDT 2008

2008/8/7 David Cardozo <david.cardozo at apioutsourcing.com>:
> I concur; this is not strictly a mod_python problem, but rather caused by
> the default implementation chosen by APR.
> I remember coming across the MODPYTHON-202 bug report, but couldn't find it
> again to put it as a reference in my previous posting. Thanks for looking it
> up. Is there any chance of this being released soon?

No. That particular issue would be very low priority in the grander
scheme of things.

> Is there a new version
> of mod_python planned for sometime in the near future?

No. I have been trying to commit a few changes to the repository for more
important things, but have very little time to even do that. Of the
people who have commit access, I am the only one who even answers
stuff on the mailing list these days, and I myself haven't even used
mod_python for a couple of years. :-)

Graham

> No, we don't have Apache directories mounted via NFS.
>
> David.
>
> On 8/6/08 6:23 PM, "Graham Dumpleton" <graham.dumpleton at gmail.com> wrote:
>
> 2008/8/7 David Cardozo <david.cardozo at apioutsourcing.com>:
>> I apologize for resurrecting this email thread, but I thought I would post
>> the "what worked for me" solution in case someone else ever comes across
>> it.
>>
>> For a quick history: when Mike Robokoff worked on this issue back in
>> October (we worked together at the same company), we were able to work
>> around it by changing our Python scripts to minimize the time spent
>> holding a lock on the Session.dbm file. This seemed to work fine for a
>> while. However, as our user load increased, the problem started
>> manifesting itself again. I dedicated myself to this issue during the
>> past week and was finally able to narrow it down to the fix that solved
>> it... Hopefully once and for all.
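
The earlier workaround mentioned above, keeping the time spent holding
the session lock as short as possible, can be sketched roughly as follows.
This is only an illustration of the pattern: session_lock, store and
expensive_render are hypothetical stand-ins (a threading.Lock and a dict),
not mod_python's Session API and not the scripts used at the site.

```python
import threading

# Hypothetical stand-ins: a lock guarding shared state, in place of the
# lock taken around the session file.
session_lock = threading.Lock()
store = {"visits": 0}


def expensive_render(visits):
    # Placeholder for slow work that does not need the lock.
    return "visit number %d" % visits


def handle_request():
    # Hold the lock only while reading and updating the shared state.
    with session_lock:
        store["visits"] += 1
        visits = store["visits"]
    # The slow part runs after the lock has been released.
    return expensive_render(visits)
```

The point of the design is that anything slow (rendering, database calls)
happens outside the locked region, so other requests wait for less time.
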
>>
>> Here is my analysis of this issue; please correct me if you think
>> something doesn't sound right:
>>
>> The root cause of this problem was the use of a bad mutex implementation
>> by mod_python in Solaris.
>
> No. The mod_python package just uses what APR has selected as the
> default. If there is a problem with fcntl on Solaris then you need to
> be asking the APR folks why they are defaulting to using that over
> something that works properly. So, not strictly a mod_python problem
> and the fact that you also had to set SSLMutex and AcceptMutex
> reinforces that.
>
> Note that --enable-nonportable-atomics wouldn't make a difference, as
> that likely only applies to an in-process replacement for certain uses of
> a local thread mutex. A global mutex is a lock that works across
> processes, so no machine-code-level mechanism could in itself work as
> far as I can see.
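
To illustrate what "a lock that works across processes" means in practice,
here is a small sketch using Python's fcntl.lockf on a temporary file. It
assumes a Unix-like system with os.fork, and it is not how APR or
mod_python implement their global mutex; other_process_can_lock and demo
are names made up for this example.

```python
import fcntl
import os
import tempfile


def other_process_can_lock(path):
    """Fork a child and report whether it could lock `path` without waiting."""
    pid = os.fork()
    if pid == 0:
        # Child: try a non-blocking exclusive lock, report via exit status.
        code = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            code = 0  # lock acquired
        except OSError:
            pass  # lock is held by another process
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


def demo():
    """Return (child_locked_while_parent_holds, child_locked_after_release)."""
    fd, path = tempfile.mkstemp()
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)   # parent takes the lock
        held = other_process_can_lock(path)
        fcntl.lockf(fd, fcntl.LOCK_UN)   # parent releases it
        released = other_process_can_lock(path)
        return held, released
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling demo() should show the child being refused while the parent holds
the lock and succeeding once it is released. The exclusion is enforced by
the kernel between processes, which a compare-and-swap inside one process
cannot provide on its own.
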
>
> As far as mod_python goes, only change I can see you might benefit from is:
>
>   http://issues.apache.org/jira/browse/MODPYTHON-202
>
> That is, mod_python providing a directive akin to AcceptMutex to
> override what type of mutex it uses. That way you wouldn't have to
> change the code, but could configure it instead.
>
> BTW, you don't have your Apache installation, including the directory
> where the locks are, being served off an NFS partition, do you? File based
> locking used to at least have some issues with NFS. This was some time
> back though in Solaris and I presume they would have addressed it by
> now.
>
> Graham
>
>> Mod_python relies on operating system mutexes to lock file system
>> resources; these mutexes are provided by the APR library (Apache Portable
>> Runtime) through the following call in mod_python.c:
>>
>>  apr_global_mutex_create(&mutex[n], fname, APR_LOCK_DEFAULT, p);
>>
>> The above call defaults to using the operating system's default mutex
>> implementation (fcntl) which, in my opinion, does not work properly in an
>> Apache worker MPM model under any type of load.
>>
>> By modifying the above call to:
>>
>>  apr_global_mutex_create(&mutex[n], fname, APR_LOCK_POSIXSEM, p);
>>
>> which forces the use of the POSIX semaphores implementation of mutexes by
>> mod_python, the problem was reduced to just a few instances of the
>> "mutex" error under a moderate load: 400 concurrent threads.
>>
>> The final change to make the problem go away completely was to set the
>> AcceptMutex and SSLMutex directives in the httpd.conf file to "posixsem".
>> This, together with the mod_python change, made the "mutex" error go away
>> even for 6400 concurrent threads.
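
For reference, the httpd.conf change described above would look roughly
like this with Apache 2.2. This is a sketch, assuming both directives are
placed in the main server configuration and that mod_ssl is loaded; the
rest of the poster's configuration is not shown in the thread.

```apache
# httpd.conf, main server configuration (Apache 2.2 syntax)
AcceptMutex posixsem
# Only meaningful when mod_ssl is loaded
SSLMutex posixsem
```
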
>>
>> I still don't know why the changes to AcceptMutex and SSLMutex are
>> required, since our Apache installs were compiled with the
>> "--enable-nonportable-atomics" option. According to the Apache
>> documentation
>> ( http://httpd.apache.org/docs/2.2/misc/perf-tuning.html#compiletime ):
>>
>> "Solaris on SPARC
>> By default, APR uses mutex-based atomics on Solaris/SPARC. If you
>> configure with --enable-nonportable-atomics, however, APR generates code
>> that uses a SPARC v8plus opcode for fast hardware compare-and-swap. If you
>> configure Apache with this option, the atomic operations will be more
>> efficient (allowing for lower CPU utilization and higher concurrency), but
>> the resulting executable will run only on UltraSPARC chips."
>>
>> Maybe, even though we compile with that option, something in the OS,
>> Apache or APR still doesn't allow us to take advantage of atomic
>> operations. This is pure speculation.
>>
>> This is the email thread that pointed me in the right direction:
>> http://www.modpython.org/pipermail/mod_python/2006-November/022538.html
>> I'm not using the ITK MPM, just straight worker MPM, but that change did
>> the trick for me.
>>
>> Regards,
>>
>> David.
>>
>> -------------- Last message in email thread ---------------------------
>>
>> [mod_python] ValueError: Failed to acquire global mutex lock
>> Graham Dumpleton graham.dumpleton at gmail.com
>> Wed Oct 24 23:04:05 EDT 2007
>>
>> Going by errors posted by someone else in another thread, it seems that if
>> semaphores were exhausted, the error would occur at mod_python startup, not
>> when locking the mutex.
>>
>> I'll have to go back and read all your emails again. You said you were
>> using the 'worker' MPM now, didn't you? You're not using some strange MPM
>> like perchild or ITK-MPM, are you? I know that these cause problems for
>> these mutexes in mod_python because of how different processes wanting
>> to lock the mutex run as different users.
>>
>> Graham
>>
>> On 24/10/2007, Michael Robokoff <mike.robokoff at apioutsourcing.com>
>> wrote:
>>> # ipcs
>>> IPC status from <running system> as of Wed Oct 24 06:25:17 CDT 2007
>>> T         ID      KEY        MODE        OWNER    GROUP
>>> Message Queues:
>>> Shared Memory:
>>> Semaphores:
>>> s          2   0          --ra-------      api    other
>>>
>>> Not sure how to read that.
>>>
>>> If the problem is the semaphores, shouldn't I be able to use the
>>> following:
>>> set semsys:seminfo_semmni=2048
>>> set semsys:seminfo_semmns=2048
>>> set semsys:seminfo_semmnu=1024
>>> set semsys:seminfo_semmsl=300
>>> set semsys:seminfo_semopm=128
>>> set semsys:seminfo_semume=64
>>>
>>> to make more semaphores available?
>>>
>>> I tried that but it didn't change anything. Do you think that was not
>>> enough?
>>>
>>>
>>> --Mike
>>>
>>>
>>> -----Original Message-----
>>> From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
>>> Sent: Tuesday, October 23, 2007 7:08 PM
>>> To: Michael Robokoff
>>> Cc: mod_python
>>> Subject: Re: [mod_python] ValueError: Failed to acquire global mutex lock
>>>
>>> If you run 'ipcs' what is the output? Something must be using all the
>>> semaphores, can't be anything else.
>>>
>>> Graham
>>>
>>> On 24/10/2007, Michael Robokoff <mike.robokoff at apioutsourcing.com>
>>> wrote:
>>> > Ok the directive did take effect as the log entry below shows:
>>> >
>>> > [Tue Oct 23 07:20:32 2007] [notice] mod_python: Creating 4 session
>>> > mutexes based on 6 max processes and 25 max threads.
>>> > [Tue Oct 23 07:20:32 2007] [notice] mod_python: using mutex_directory
>>> > /tmp
>>> >
>>> >
>>> > Still see this however:
>>> >
>>> > ValueError: Failed to acquire global mutex lock
>>> >
>>> > I will try recompiling with the option you mentioned and see what
>>> > happens.
>>> >
>>> >
>>> > --Mike
>>> >
>>> >
>>> > -----Original Message-----
>>> > From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
>>> > Sent: Tuesday, October 23, 2007 6:40 AM
>>> > To: Michael Robokoff
>>> > Cc: mod_python
>>> > Subject: Re: [mod_python] ValueError: Failed to acquire global mutex
>>> > lock
>>> >
>>> > On 23/10/2007, Michael Robokoff <mike.robokoff at apioutsourcing.com>
>>> > wrote:
>>> > > [Thu Oct 18 07:37:38 2007] [notice] mod_python: Creating 8 session
>>> > > mutexes based on 6 max processes and 25 max threads.
>>> >
>>> > BTW, what did you end up setting mod_python.mutex_locks to? If you used
>>> > 4 like I said, then it can't have been in the correct part of the Apache
>>> > configuration, outside of all VirtualHost sections, as the error log
>>> > still shows '8 session mutexes'.
>>> >
>>> > If you can't get this to work, you might rebuild mod_python and
>>> > specify the --with-max-locks=4 option to configure to force a lower
>>> > value to be compiled in.
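
As a sketch of the configuration route suggested above, assuming the
mod_python 3.3 PythonOption names that appear in this thread and in the
startup log (mod_python.mutex_locks, mod_python.mutex_directory), the
settings would go in the main server configuration along these lines.
The values 4 and /tmp are taken from the messages quoted here.

```apache
# Main server configuration, outside every <VirtualHost> block
PythonOption mod_python.mutex_locks 4
PythonOption mod_python.mutex_directory "/tmp"
```

If the options are picked up, the startup notice in the error log should
report the new mutex count rather than the compiled-in default.
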
>>> >
>>> > Graham
>>> >
>>> >
>>> >
>>>
>>>
>>>
>> _______________________________________________
>> Mod_python mailing list
>> Mod_python at modpython.org
>> http://mailman.modpython.org/mailman/listinfo/mod_python
>>
>>
>
>
>
>
>
> David Cardozo
> Software Systems Architect
>
> 2975 Lone Oak Drive  Suite 100  Eagan, MN  55121
> Direct: 651-675-2604 Fax: 651-675-2699
> Accounts Payable Transformation Research Report
> <http://www.apifao.com/company/AberdeenEquinoxReportMarch2008.pdf>
> www.apifao.com <http://www.apifao.com>
>