Huzaifa Tapal
huzaifa at hostway.com
Fri Aug 12 17:47:12 EDT 2005
No, no content was being served from nfs. However, our sys dev team here did some investigation, and here is what they found:

In mod_python.c, the init_mutexes() function (return type apr_status_t), which I believe gets called for the memory session management, uses Apache's apr_global_mutex_create() locking mechanism to lock a file on the filesystem when Apache runs in a multi-process and multi-threaded configuration, so that the session can be shared between different child processes. The lock taken there is backed by flock(), the handling of which in the Debian 2.6.12.4 kernel has a bug in extreme cases. In our testing, under a very heavy load of over 800 concurrent connections generated with a Jython-based tool called Grinder, a race condition occurs, at which point the kernel panics.

The solution to this problem, as we see it, is either to change init_mutexes() to use fcntl locks instead of flock, or to run apache with only one process at all times.

In the meantime, we are still going to moderately load test our cluster throughout the weekend to see if we can replicate the panic under a moderate load over a long duration.

Hozi

Jim Gallacher wrote:
> Huzaifa Tapal wrote:
>
>> Hello All,
>>
>> I was wondering if any of you have run into any kernel panics running
>> apache2 w/ mod_python 3.13 (patches applied) on kernel 2.6.12.4?
>> Under heavy load testing of approximately 520 concurrent users the
>> server crashed with the message:
>>
>> Kernel panic - Attempting to free lock with active waiting queue
>>
>> We previously had a stock debian kernel version 2.6.08 and we had
>> noticed that under a load of 36 concurrent users, sporadically, the
>> server would crash immediately as the load test was started. In the
>> current case, after upgrading the kernel, the server did not crash
>> until it reached the peak of 550 concurrent users. I am currently
>> running the load test again after rebooting the server and the server
>> is handling requests even at 559 concurrent users so the panic is
>> sporadic.
>>
>> Anybody have any ideas? Anybody seen anything like this before?
>
>
> Only what google tells me. Any chance you are serving content from nfs?
>
> Jim
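
As a rough illustration of the fcntl-style locking proposed above as an alternative to flock(), here is a minimal Python sketch. It is not mod_python's code and does not touch APR. The names with_fcntl_lock and demo, and the lock file name, are hypothetical and chosen only for this example. It assumes a POSIX system where Python's fcntl module is available.

```python
import fcntl
import os
import tempfile


def with_fcntl_lock(path, critical_section):
    """Run critical_section() while holding an exclusive fcntl-style lock.

    Hypothetical helper for illustration only; not part of mod_python.
    """
    # A write (exclusive) record lock needs a descriptor opened for writing.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # lockf() is implemented on top of fcntl() record locking,
        # which is a different kernel code path from flock().
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            return critical_section()
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def demo():
    # Hypothetical lock file; cooperating processes must agree on the path.
    lock_path = os.path.join(tempfile.gettempdir(), "mp_session_demo.lock")
    return with_fcntl_lock(lock_path, lambda: "updated session")
```

Every process that shares the session data would call with_fcntl_lock() with the same path. Swapping fcntl.lockf for fcntl.flock in the helper would give the flock-based behaviour that the investigation above points to.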