[mod_python] ValueError: Failed to acquire global mutex lock

David Cardozo david.cardozo at apioutsourcing.com
Thu Aug 7 09:05:32 EDT 2008


On 8/6/08 10:52 PM, "Graham Dumpleton" <graham.dumpleton at gmail.com> wrote:

> 2008/8/7 David Cardozo <david.cardozo at apioutsourcing.com>:
>> I concur; this is not strictly a mod_python problem, but rather caused by
>> the default implementation chosen by APR.
>> I remember coming across the MODPYTHON-202 bug report, but couldn't find it
>> again to put it as a reference in my previous posting. Thanks for looking it
>> up. Is there any chance of this being released soon?
> 
> No. That particular issue would be very low priority in the grander
> scheme of things.
> 
>> Is there a new version
>> of mod_python planned for sometime in the near future?
> 
> No. I have been trying to commit a few changes to the repository for more
> important things, but have very little time to even do that. Of the
> people who have commit access, I am the only one who even answers
> stuff on the mailing list these days, and I myself haven't even used
> mod_python for a couple of years. :-)

Then I feel obligated to ask what other scripting language you have used
lately with Apache... Is there a second best to Python? :-)

David.

> 
> Graham
> 
>> No, we don't have Apache directories mounted via NFS.
>> 
>> David.
>> 
>> On 8/6/08 6:23 PM, "Graham Dumpleton" <graham.dumpleton at gmail.com> wrote:
>> 
>> 2008/8/7 David Cardozo <david.cardozo at apioutsourcing.com>:
>>> I apologize for resurrecting this email thread, but I thought I would post
>>> the "what worked for me" solution in case someone else ever comes across
>>> it.
>>> 
>>> For a quick history: when Mike Robokoff worked on this issue back in October
>>> (we worked together at the same company), we were able to work around
>>> it by changing our Python scripts to minimize the time spent
>>> holding a lock on the Session.dbm file. This seemed to work fine for a
>>> while. However, as our user load increased, the problem started
>>> manifesting itself again. I dedicated myself to this issue during the past
>>> week and was finally able to narrow it down to the fix that solved it...
>>> Hopefully once and for all.
>>> 
>>> Here is my analysis of this issue; please correct me if you think something
>>> doesn't sound right:
>>> 
>>> The root cause of this problem was the use of a bad mutex implementation by
>>> mod_python on Solaris.
>> 
>> No. The mod_python package just uses what APR has selected as the
>> default. If there is a problem with fcntl on Solaris then you need to
>> be asking the APR folks why they are defaulting to using that over
>> something that works properly. So, not strictly a mod_python problem
>> and the fact that you also had to set SSLMutex and AcceptMutex
>> reinforces that.
>> 
>> Note that --enable-nonportable-atomics wouldn't make a difference as
>> that likely only applies to in process replacement for certain uses of
>> local thread mutex. A global mutex is a lock that works across
>> processes and so no machine code level thing could in itself work as
>> far as I can see.
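To make the "lock that works across processes" point concrete, here is a minimal sketch of a cross-process mutex built on a POSIX semaphore, independent of APR. It is not how APR or mod_python implement APR_LOCK_POSIXSEM. The names run_workers, lock_sem and struct shared, and the counter workload, are made up for illustration, and it assumes a platform such as Linux where sem_init() supports process-shared semaphores. Several forked processes increment a counter in shared memory while holding the semaphore:

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <errno.h>
#include <semaphore.h>    /* sem_init, sem_wait, sem_post, sem_destroy */
#include <sys/mman.h>     /* mmap, munmap */
#include <sys/types.h>
#include <sys/wait.h>     /* wait */
#include <unistd.h>       /* fork, _exit */

/* Hypothetical layout: the lock and the data it protects live together in
 * memory that is shared between the parent and all forked children. */
struct shared {
    sem_t lock;
    long counter;
};

/* Acquire the semaphore, retrying if a signal interrupts the wait. */
static void lock_sem(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR) {
        /* retry */
    }
}

/* Fork nprocs children; each adds 1 to the shared counter iters times while
 * holding the semaphore. Returns the final counter, or -1 on failure. */
long run_workers(int nprocs, int iters)
{
    struct shared *sh = mmap(NULL, sizeof *sh, PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (sh == MAP_FAILED)
        return -1;

    /* pshared = 1: usable across processes; value = 1: acts as a mutex. */
    if (sem_init(&sh->lock, 1, 1) == -1) {
        munmap(sh, sizeof *sh);
        return -1;
    }
    sh->counter = 0;

    int started = 0;
    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid == 0) {                 /* child */
            for (int j = 0; j < iters; j++) {
                lock_sem(&sh->lock);
                sh->counter += 1;       /* critical section */
                sem_post(&sh->lock);
            }
            _exit(0);
        }
        if (pid > 0)
            started++;
    }

    while (wait(NULL) > 0) {
        /* reap every child before reading the result */
    }

    long result = sh->counter;
    sem_destroy(&sh->lock);
    munmap(sh, sizeof *sh);
    return (started == nprocs) ? result : -1;
}
```

If the semaphore really does exclude across processes, run_workers(4, 1000) should come back as 4000. On older glibc versions you may need to link with -pthread.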
>> 
>> As far as mod_python goes, only change I can see you might benefit from is:
>> 
>>   http://issues.apache.org/jira/browse/MODPYTHON-202
>> 
>> That is, mod_python providing a directive akin to AcceptMutex to
>> override what type of mutex it uses. That way you wouldn't have to
>> change the code, but could configure it instead.
>> 
>> BTW, you don't have your Apache installation, including the directory
>> where the locks are, being served off an NFS partition, do you? File-based
>> locking used to have at least some issues with NFS. This was some time
>> back on Solaris, though, and I presume they would have addressed it by
>> now.
>> 
>> Graham
>> 
>>> mod_python relies on operating system mutexes to lock file system resources;
>>> these mutexes are provided by the APR library (Apache Portable Runtime)
>>> through the following call in mod_python.c:
>>> 
>>>  apr_global_mutex_create(&mutex[n], fname, APR_LOCK_DEFAULT, p);
>>> 
>>> The above call defaults to using the operating system's default mutex
>>> implementation (fcntl) which, in my opinion, does not work properly in an
>>> Apache worker MPM model under any type of load.
>>> 
>>> By modifying the above call to:
>>> 
>>>  apr_global_mutex_create(&mutex[n], fname, APR_LOCK_POSIXSEM, p);
>>> 
>>> which forces mod_python to use the POSIX semaphore implementation of
>>> mutexes, the problem was reduced to just a few instances of the
>>> "mutex" error under a moderate load: 400 concurrent threads.
>>> 
>>> The final change to make the problem go away completely was to set the
>>> AcceptMutex and SSLMutex directives in the httpd.conf file to "posixsem".
>>> This, together with the mod_python change, made the "mutex" error go away
>>> even for 6400 concurrent threads.
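For reference, a sketch of what that httpd.conf change might look like on Apache 2.2. Placing both directives at server level, outside any VirtualHost, is an assumption about the setup, and "posixsem" is only accepted where the platform and build support POSIX semaphores:

```apache
# httpd.conf (Apache 2.2), main server configuration
AcceptMutex posixsem
SSLMutex posixsem
```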
>>> 
>>> I still don't know why the changes to AcceptMutex and SSLMutex are required,
>>> since our Apache installs were compiled with
>>> "--enable-nonportable-atomics". According to the Apache documentation
>>> ( http://httpd.apache.org/docs/2.2/misc/perf-tuning.html#compiletime):
>>> 
>>> "Solaris on SPARC
>>> By default, APR uses mutex-based atomics on Solaris/SPARC. If you configure
>>> with --enable-nonportable-atomics, however, APR generates code that uses a
>>> SPARC v8plus opcode for fast hardware compare-and-swap. If you configure
>>> Apache with this option, the atomic operations will be more efficient
>>> (allowing for lower CPU utilization and higher concurrency), but the
>>> resulting executable will run only on UltraSPARC chips."
>>> 
>>> Maybe, even though we compile with that option, something in the OS, Apache,
>>> or APR still doesn't allow us to take advantage of atomic operations. This
>>> is pure speculation.
>>> 
>>> This is the email thread that pointed me in the right direction:
>>> http://www.modpython.org/pipermail/mod_python/2006-November/022538.html
>>> I'm not using the ITK MPM, just the straight worker MPM, but that change did
>>> the trick for me.
>>> 
>>> Regards,
>>> 
>>> David.
>>> 
>>> -------------- Last message in email thread ---------------------------
>>> 
>>> [mod_python] ValueError: Failed to acquire global mutex lock
>>> Graham Dumpleton graham.dumpleton at gmail.com
>>> Wed Oct 24 23:04:05 EDT 2007
>>> 
>>> Going by errors posted by someone else in another thread, it seems that if
>>> semaphores were exhausted, the error would occur at mod_python startup, not
>>> when locking the mutex.
>>> 
>>> I'll have to go back and read all your emails again. You said you were
>>> using the 'worker' MPM now, didn't you? You're not using some strange MPM
>>> like perchild or ITK-MPM, are you? I know that these cause problems for
>>> these mutexes in mod_python because of how different processes wanting
>>> to lock the mutex run as different users.
>>> 
>>> Graham
>>> 
>>> On 24/10/2007, Michael Robokoff <mike.robokoff at apioutsourcing.com> wrote:
>>>> # ipcs
>>>> IPC status from <running system> as of Wed Oct 24 06:25:17 CDT 2007
>>>> T         ID      KEY        MODE        OWNER    GROUP
>>>> Message Queues:
>>>> Shared Memory:
>>>> Semaphores:
>>>> s          2   0          --ra-------      api    other
>>>> 
>>>> Not sure how to read that.
>>>> 
>>>> If the problem is the semaphores, shouldn't I be able to use the following
>>>> set semsys:seminfo_semmni=2048
>>>> set semsys:seminfo_semmns=2048
>>>> set semsys:seminfo_semmnu=1024
>>>> set semsys:seminfo_semmsl=300
>>>> set semsys:seminfo_semopm=128
>>>> set semsys:seminfo_semume=64
>>>> 
>>>> to make more semaphores available?
>>>> 
>>>> I tried that, but it didn't change anything. Do you think that was not
>>>> enough?
>>>> 
>>>> 
>>>> --Mike
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
>>>> Sent: Tuesday, October 23, 2007 7:08 PM
>>>> To: Michael Robokoff
>>>> Cc: mod_python
>>>> Subject: Re: [mod_python] ValueError: Failed to acquire global mutex lock
>>>> 
>>>> If you run 'ipcs' what is the output? Something must be using all the
>>>> semaphores, can't be anything else.
>>>> 
>>>> Graham
>>>> 
>>>> On 24/10/2007, Michael Robokoff <mike.robokoff at apioutsourcing.com> wrote:
>>>>> Ok the directive did take effect as the log entry below shows:
>>>>> 
>>>>> [Tue Oct 23 07:20:32 2007] [notice] mod_python: Creating 4 session mutexes based on 6 max processes and 25 max threads.
>>>>> [Tue Oct 23 07:20:32 2007] [notice] mod_python: using mutex_directory /tmp
>>>>> 
>>>>> 
>>>>> Still see this however:
>>>>> 
>>>>> ValueError: Failed to acquire global mutex lock
>>>>> 
>>>>> I will try recompiling with the option you mentioned and see what
>>>>> happens.
>>>>> 
>>>>> 
>>>>> --Mike
>>>>> 
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
>>>>> Sent: Tuesday, October 23, 2007 6:40 AM
>>>>> To: Michael Robokoff
>>>>> Cc: mod_python
>>>>> Subject: Re: [mod_python] ValueError: Failed to acquire global mutex lock
>>>>> 
>>>>> On 23/10/2007, Michael Robokoff <mike.robokoff at apioutsourcing.com> wrote:
>>>>>> [Thu Oct 18 07:37:38 2007] [notice] mod_python: Creating 8 session mutexes based on 6 max processes and 25 max threads.
>>>>> 
>>>>> BTW, what did you end up setting mod_python.mutex_locks to? If you used
>>>>> 4 like I said, then it can't have been in the correct part of the Apache
>>>>> configuration, outside of all VirtualHost sections, as the error log
>>>>> still shows '8 session mutexes'.
>>>>> 
>>>>> If you can't get this to work, you might rebuild mod_python and
>>>>> specify the --with-max-locks=4 option to configure to force a lower value
>>>>> to be compiled in.
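As a sketch of the configuration being discussed, assuming the mod_python 3.3 PythonOption names and the values mentioned in this thread (4 locks, /tmp as the mutex directory), the settings would sit in the main server configuration rather than inside a VirtualHost:

```apache
# Main server configuration, outside every <VirtualHost> section
PythonOption mod_python.mutex_directory "/tmp"
PythonOption mod_python.mutex_locks 4
```

After a restart, the "Creating N session mutexes" notice in the error log shows whether the value was picked up.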
>>>>> 
>>>>> Graham
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>> _______________________________________________
>>> Mod_python mailing list
>>> Mod_python at modpython.org
>>> http://mailman.modpython.org/mailman/listinfo/mod_python
>>> 
>>> 
>> 
>> 
>> 
>> 
>> 
>> David Cardozo
>> Software Systems Architect
>> 
>> 2975 Lone Oak Drive  Suite 100  Eagan, MN  55121
>> Direct: 651-675-2604 Fax: 651-675-2699
>> Accounts Payable Transformation Research Report
>> <http://www.apifao.com/company/AberdeenEquinoxReportMarch2008.pdf>
>> www.apifao.com <http://www.apifao.com>
>> 
> 
> 






