[mod_python] remnant 'orphan' apache subprocesses

Graham Dumpleton graham.dumpleton at gmail.com
Tue Jan 22 20:15:15 EST 2008


On 23/01/2008, Alec Matusis <matusis at yahoo.com> wrote:
> > What do you get if you use a program like lsof or ofiles to work out
> > what open resources the zombie process may still be holding on to?
>
> There are 6 zombie sub processes now; executing lsof -p pid takes ages, and
> it brings the load average up from 8.0 to 23+ on this machine- so I am
> afraid to wait long enough to get a result.
> #ps -ef | grep httpd
> root     16197     1  0 Jan21 ?        00:00:15
> nobody   23095 16197  0 08:44 ?        00:00:06
> nobody   29548     1  0 13:14 ?        00:00:00
> nobody    3812     1  0 13:57 ?        00:00:00
> nobody    4161     1  0 13:59 ?        00:00:00
> nobody   20110     1  0 15:43 ?        00:00:00
> nobody   25399     1  0 16:17 ?        00:00:00
> nobody   28722     1  0 16:38 ?        00:00:00
> nobody   28971 16197  5 16:40 ?        00:00:20
> nobody   29189 16197  7 16:42 ?        00:00:21
> nobody   29327 16197  7 16:42 ?        00:00:18
> nobody   29453 16197  6 16:43 ?        00:00:13
> nobody   29496 16197 10 16:43 ?        00:00:20
> nobody   29539 16197  9 16:43 ?        00:00:19
> nobody   29639 16197 11 16:44 ?        00:00:14
> nobody   29713 16197 11 16:45 ?        00:00:12
> nobody   29804 16197  5 16:45 ?        00:00:05
> nobody   29857 16197 10 16:45 ?        00:00:09
> nobody   29902 16197 10 16:45 ?        00:00:08
> nobody   29945 16197 11 16:46 ?        00:00:07
> nobody   29998 16197 11 16:46 ?        00:00:06
> nobody   30058 16197 16 16:47 ?        00:00:01
>
> note that those zombie sub processes seem to have had 00:00:00 run time,
> unlike normal sub processes.
> 3 entries in apache error logs this time:
>
> [Tue Jan 22 10:44:45 2008] [notice] child pid 8798 exit signal Segmentation
> fault (11)
> ...
> Fatal Python error: Inconsistent interned string state.

This is corruption of memory used by Python.

What version of mod_python are you using?

> [Tue Jan 22 14:03:16 2008] [notice] child pid 4008 exit signal Aborted (6)
>
> is there a faster way to see what it is holding on to?
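
A possibly cheaper alternative to lsof is to read the per-process descriptor directory directly. This is only a sketch: it assumes a Linux system with /proc mounted, which the thread does not confirm. The function name open_files and the proc_root parameter are made up for illustration. It has to run as root or as the owner of the process, and os.readlink only reads the link text, so it should not touch the target file itself.

```python
import os


def open_files(pid, proc_root="/proc"):
    """Map each open file descriptor of `pid` to the path or object it points at.

    Assumption: Linux-style procfs, where <proc_root>/<pid>/fd holds one
    symlink per open descriptor (files, pipes, sockets).
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets
```

Calling open_files(29548) for one of the stuck processes would return a dictionary keyed by descriptor number; entries that look like pipe:[...] or paths under an NFS mount are the ones of interest here.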

I know you said the applications weren't spawning sub processes, but
if your system has 'ptree', try that. In other words, see if those
daemon processes have children which still exist. This was in part
what I was hoping to see, i.e., pipes to child processes. The other
thing I was looking for was file accesses stuck on NFS mounted
filesystems or something similar. An alternative to ptree is just to
look at the parent/child relationships in the ps output.
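
The parent/child check can be sketched as a small script over `ps -ef` output. This is a minimal illustration, not something from the original exchange: the function name find_orphans and its parameters are hypothetical, and it assumes the usual `ps -ef` column order (UID PID PPID C STIME TTY TIME CMD). The sample lines are taken from the first report in this thread.

```python
def find_orphans(ps_output, master_pid, command="httpd"):
    """Return PIDs of `command` processes that have been reparented to PID 1.

    `master_pid` is the Apache parent process, which legitimately has
    PID 1 as its parent and is therefore excluded.
    """
    orphans = []
    for line in ps_output.splitlines():
        # Split into at most 8 fields so the command line stays in one piece.
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # header row or a line that is not a process entry
        pid, ppid, cmd = int(fields[1]), int(fields[2]), fields[7]
        if command in cmd and ppid == 1 and pid != master_pid:
            orphans.append(pid)
    return orphans


SAMPLE = """\
root     16197     1  0 02:00 ?        00:00:09 /usr/local/encap/httpd/bin/httpd -f /p2/web/conf/web10.conf -k start
nobody   17750     1  0 17:53 ?        00:00:00 /usr/local/encap/httpd/bin/httpd -f /p2/web/conf/web10.conf -k start
nobody    5112 16197  4 20:02 ?        00:00:16 /usr/local/encap/httpd/bin/httpd -f /p2/web/conf/web10.conf -k start
"""

print(find_orphans(SAMPLE, master_pid=16197))  # prints [17750]
```

The same function can be used to look for children of an orphan by checking for lines whose PPID equals the orphan's PID.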

Graham

> > -----Original Message-----
> > From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> > Sent: Monday, January 21, 2008 10:27 PM
> > To: Alec Matusis
> > Cc: mod_python at modpython.org
> > Subject: Re: [mod_python] remnant 'orphan' apache subprocesses
> >
> > What do you get if you use a program like lsof or ofiles to work out
> > what open resources the zombie process may still be holding on to?
> >
> > Are you absolutely sure that that zombie process is from the current
> > Apache instance and not perhaps an earlier instance of Apache?
> >
> > Graham
> >
> > On 22/01/2008, Alec Matusis <matusis at yahoo.com> wrote:
> > > > Are CGI scripts used anywhere at all on your Apache web site?
> > >
> > > No, only mod_python and serving static files.
> > >
> > > > -----Original Message-----
> > > > From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> > > > Sent: Monday, January 21, 2008 10:07 PM
> > > > To: Alec Matusis
> > > > Cc: mod_python at modpython.org
> > > > Subject: Re: [mod_python] remnant 'orphan' apache subprocesses
> > > >
> > > > Are CGI scripts used anywhere at all on your Apache web site?
> > > >
> > > > On 22/01/2008, Alec Matusis <matusis at yahoo.com> wrote:
> > > > > > What version of Apache are you using?
> > > > >
> > > > > 2.2.6
> > > > >
> > > > > > What Python web application are you running on top of mod_python, a
> > > > > > self built one or one that uses one of the larger web frameworks?
> > > > >
> > > > > Only self-built stuff, nothing complicated.
> > > > >
> > > > > > Does your application create sub processes in any way to perform
> > > > > > additional work?
> > > > >
> > > > > No sub processes and no threads, except that we use MySQLdb module (which
> > > > > might create threads?).
> > > > >
> > > > > I noticed a warning in the error log:
> > > > > /live/scripts/_pro.py:100: Warning: Rows matched: 1  Changed: 1  Warnings: 1
> > > > > (this is a mysql warning), but I would not think this is relevant...
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> > > > > > Sent: Monday, January 21, 2008 9:22 PM
> > > > > > To: Alec Matusis
> > > > > > Cc: mod_python at modpython.org
> > > > > > Subject: Re: [mod_python] remnant 'orphan' apache subprocesses
> > > > > >
> > > > > > What version of Apache are you using?
> > > > > >
> > > > > > What Python web application are you running on top of mod_python, a
> > > > > > self built one or one that uses one of the larger web frameworks?
> > > > > >
> > > > > > Does your application create sub processes in any way to perform
> > > > > > additional work?
> > > > > >
> > > > > > Graham
> > > > > >
> > > > > > On 22/01/2008, Alec Matusis <matusis at yahoo.com> wrote:
> > > > > > > I have been investigating a memory leak that occurs on an apache server
> > > > > > > since we switched to worker MPM.
> > > > > > > I found that the source of it are apache subprocesses that lose track of
> > > > > > > their parent and never exit:
> > > > > > >
> > > > > > > root at web10 ~> ps -ef | grep httpd
> > > > > > > root     16197     1  0 02:00 ?        00:00:09 /usr/local/encap/httpd/bin/httpd -f /p2/web/conf/web10.conf -k start
> > > > > > > nobody   17750     1  0 17:53 ?        00:00:00 /usr/local/encap/httpd/bin/httpd -f /p2/web/conf/web10.conf -k start
> > > > > > > nobody    5112 16197  4 20:02 ?        00:00:16 /usr/local/encap/httpd/bin/httpd -f /p2/web/conf/web10.conf -k start
> > > > > > > nobody    5159 16197  4 20:02 ?        00:00:15 /usr/local/encap/httpd/bin/httpd -f /p2/web/conf/web10.conf -k start
> > > > > > > nobody    5300 16197  4 20:03 ?        00:00:14 /usr/local/encap/httpd/bin/httpd -f /p2/web/conf/web10.conf -k start
> > > > > > >
> > > > > > >
> > > > > > > in this output, apache child pid 17750 has pid 1 as a parent, and it is one
> > > > > > > of those 'zombie children'.
> > > > > > > Pids 5112, 5159, 5300 were normal (parent is pid 16197), and they exited
> > > > > > > after MaxRequestsPerChild was reached.
> > > > > > >
> > > > > > > Does anybody have any advice on this? I cannot correlate this to anything,
> > > > > > > there's nothing interesting in the server error log.
> > > > > > > These 'zombies' appear at a rate of 2-3 per day; this apache serves about
> > > > > > > 350 requests per second.
> > > > > > >
> > > > > > > This Apache configuration is
> > > > > > >
> > > > > > > ServerLimit 40
> > > > > > > ThreadLimit 70
> > > > > > >
> > > > > > > StartServers 10
> > > > > > > MaxClients 1600
> > > > > > > MinSpareThreads 75
> > > > > > > MaxSpareThreads 200
> > > > > > > ThreadsPerChild 40
> > > > > > > MaxRequestsPerChild 10000
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > Mod_python mailing list
> > > > > > > Mod_python at modpython.org
> > > > > > > http://mailman.modpython.org/mailman/listinfo/mod_python
> > > > > > >
> > > > >
> > > > >
> > >
> > >
>
>

