[mod_python] Re: child process XXXX still did not exit, sending a SIGTERM

Martin Blais blais at furius.ca
Mon Jan 2 16:28:55 EST 2006


Thanks Graham.

Looks like a new directive that is only available in Apache 2.2...  I will upgrade now.

More info:  I'm starting to hack into mod_python to find out what's
happening there, and python_finalize() is not called for the offending
processes.
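
To keep track of which children end up needing a SIGKILL across runs, a
small script along these lines might help. It is only a sketch: it assumes
an unwrapped error log in the same format as the excerpt quoted further
down, and the names SIGNAL_LINE, summarize_signals and killed_pids are made
up for illustration (they are not part of Apache or mod_python).

```python
import re
from collections import defaultdict

# Assumed log format (one entry per line, as Apache writes it), e.g.:
#   [Mon Jan 02 13:04:12 2006] [warn] child process 21806 still did not exit, sending a SIGTERM
SIGNAL_LINE = re.compile(
    r"child process (\d+) still did not exit, sending a (SIG[A-Z]+)"
)


def summarize_signals(lines):
    """Return {pid: {signal_name: count}} for the 'still did not exit' entries."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        match = SIGNAL_LINE.search(line)
        if match:
            pid = int(match.group(1))
            signal_name = match.group(2)
            counts[pid][signal_name] += 1
    # Convert the nested defaultdicts into plain dicts for easier inspection.
    return {pid: dict(signals) for pid, signals in counts.items()}


def killed_pids(summary):
    """Return the sorted PIDs that were eventually sent a SIGKILL."""
    return sorted(pid for pid, signals in summary.items() if "SIGKILL" in signals)
```

Feeding it the lines of the error log (for example from
open("/var/log/apache2/error_log")) gives a per-PID count, which makes it
easier to see whether the same children are the ones that hang each time.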





On 1/2/06, Graham Dumpleton <grahamd at dscpl.com.au> wrote:
> Instead of using "apachectl restart", try using "apachectl graceful" to see
> if that makes any difference.
>
> You may also want to adjust GracefulShutdownTimeout for this case to
> only allow a certain time for child processes to shut down.
>
>   http://httpd.apache.org/docs/2.2/mod/mpm_common.html#gracefulshutdowntimeout
>
> By adjusting this, you may be able to see whether, given enough time, the
> child processes actually shut down or whether they hang. If they hang,
> you may be able to attach gdb to the child process and work out where it
> is hanging.
>
> It may just be that, with Python loaded, a child takes longer to shut
> down than Apache expects, and so Apache abruptly kills it. I haven't been
> able to find any directive that sets how long "restart" waits before it
> kills the child processes.
>
> Graham
>
> Martin Blais wrote ..
> > On 1/2/06, Martin Blais <blais at furius.ca> wrote:
> > > Before someone asks, I did not get seg faults in my log, so that's not
> > > the problem.
> > >
> > > On closer inspection though, I do have a few SSL error messages like
> > > these:
> > >
> > > [Mon Jan 02 15:07:30 2006] [info] (70014)End of file found: SSL
> > > handshake interrupted by system [Hint: Stop button pressed in
> > > browser?!]
> > > [Mon Jan 02 15:07:30 2006] [info] Connection to child 0 closed with
> > > abortive shutdown(server furius.dyndns.biz:443, client 127.0.0.1)
> > > [Mon Jan 02 15:07:31 2006] [info] Connection to child 9 established
> > > (server furius.dyndns.biz:443, client 127.0.0.1)
> > > [Mon Jan 02 15:07:31 2006] [info] Seeding PRNG with 136 bytes of entropy
> > >
> > > ... the number of which does not match the number of zombie children
> > > (does not mean that it's not the problem though).
> > >
> > > Any information/tips welcome.
> >
> > It's not the SSL errors: I just reproduced the zombie child error
> > without getting any SSL errors. It's my automated test code that
> > somehow creates the SSL problems; fiddling a lot with the browser does
> > not cause SSL errors, but does result in children not wanting to die.
> >
> >
> > >
> > >
> > > On 1/2/06, Martin Blais <blais at furius.ca> wrote:
> > > > Hi
> > > >
> > > > I'm getting messages like this in my apache2 log when I stop it after
> > > > running requests to a mod_python-based web app:
> > > >
> > > > ==> /var/log/apache2/error_log <==
> > > > [Mon Jan 02 13:04:12 2006] [warn] child process 21806 still did not
> > > > exit, sending a SIGTERM
> > > > [Mon Jan 02 13:04:12 2006] [warn] child process 21768 still did not
> > > > exit, sending a SIGTERM
> > > > [Mon Jan 02 13:04:12 2006] [warn] child process 21792 still did not
> > > > exit, sending a SIGTERM
> > > > [Mon Jan 02 13:04:14 2006] [warn] child process 21806 still did not
> > > > exit, sending a SIGTERM
> > > > [Mon Jan 02 13:04:14 2006] [warn] child process 21768 still did not
> > > > exit, sending a SIGTERM
> > > > [Mon Jan 02 13:04:14 2006] [warn] child process 21792 still did not
> > > > exit, sending a SIGTERM
> > > > [Mon Jan 02 13:04:16 2006] [warn] child process 21806 still did not
> > > > exit, sending a SIGTERM
> > > > [Mon Jan 02 13:04:16 2006] [warn] child process 21768 still did not
> > > > exit, sending a SIGTERM
> > > > [Mon Jan 02 13:04:16 2006] [warn] child process 21792 still did not
> > > > exit, sending a SIGTERM
> > > > [Mon Jan 02 13:04:18 2006] [error] child process 21806 still did not
> > > > exit, sending a SIGKILL
> > > > [Mon Jan 02 13:04:18 2006] [error] child process 21768 still did not
> > > > exit, sending a SIGKILL
> > > > [Mon Jan 02 13:04:18 2006] [error] child process 21792 still did not
> > > > exit, sending a SIGKILL
> > > > [Mon Jan 02 13:04:19 2006] [info] removed PID file
> > > > /var/run/apache2.pid (pid=21754)
> > > > [Mon Jan 02 13:04:19 2006] [notice] caught SIGTERM, shutting down
> > > >
> > > > I've been tracking the problem, and on exit, the cleanup function (in
> > > > Python) that I registered with the mod_python server does get called,
> > > > and exits.
> > > >
> > > > I saw somewhere that recompiling expat, and then all the rest (in my
> > > > case, python-2.4.2, apache2, mod_python-3.1.4, psycopg2) might fix
> > > > the issue (it would be due to a library incompatibility).  I did
> > > > this, and the problem still occurs.  I also tried using
> > > > mod_python-3.2.5b and the problem persists.  This is all running on
> > > > Linux (Gentoo).
> > > >
> > > > Any clue on how to debug this issue further?  I'm currently trying
> > > > to run gdb on a non-daemon apache2, or trying to attach to a running
> > > > child.  The problem is a bit tricky to track down, since it does not
> > > > always happen, but it does always happen after I run a lot of
> > > > requests, e.g. my automated test suite for my web app.
> > > >
> > > > Any hints would be appreciated.
> > > >
> > > > cheers,
> > > >
> > >
> >
> > _______________________________________________
> > Mod_python mailing list
> > Mod_python at modpython.org
> > http://mailman.modpython.org/mailman/listinfo/mod_python
>


