|
Alec Matusis
matusis at yahoo.com
Thu Jan 24 00:59:02 EST 2008
> Which almost suggests it was actually a fork of an Apache child
> process as that is possibly only way the other threads could have been
> killed off.
I think this is right. I caught this evidence:
# ps -ef | grep http
root 16197 1 0 Jan21 ?
nobody 787 16197 9 18:06 ?
nobody 1096 16197 9 18:09 ?
nobody 1240 16197 8 18:12 ?
nobody 1389 16197 8 18:13 ?
nobody 1525 16197 9 18:14 ?
nobody 1732 1525 0 18:16 ?
....
Note pid 1732 with 0 CPU total being a child of a sub process (pid 1525)
20 minutes later, the parent sub process pid 1525 exited, and pid 1732
became an orphan:
# ps -ef | grep http
root 16197 1 0 Jan21 ?
nobody 1732 1 0 18:16 ?
nobody 9895 16197 6 20:07 ?
.....
> I can't find though any evidence of fork() being used direct in PIL
> code at least, but then the popen2 module uses fork() in Popen3 and
> Popen4 classes.
I found that we send email like this:
try:
handle = os.popen2('/usr/sbin/sendmail.postfix -f mailzero -t')
handle[0].write(msg)
handle[0].write("\n.\n")
finally :
if handle:
handle[0].close()
handle[1].close()
Could this be the culprit?
PS. I finally obtained lsof for an orphan:
# lsof -p 552
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
httpd 552 nobody cwd DIR 3,1 560 2 /
httpd 552 nobody rtd DIR 3,1 560 2 /
httpd 552 nobody txt REG 3,3 1459487 120154
/web10/encap/httpd-2.2.6/bin/httpd
httpd 552 nobody mem REG 3,1 108173 10583
/lib64/ld-2.3.4.so
httpd 552 nobody mem REG 3,1 208464 13742
/usr/lib/locale/en_US.utf8/LC_CTYPE
httpd 552 nobody mem REG 3,1 21546 67725
/usr/lib64/gconv/gconv-modules.cache
httpd 552 nobody mem REG 3,1 422191 149865
/lib64/tls/libm.so.6
httpd 552 nobody mem REG 3,3 359952 120180
/web10/encap/httpd-2.2.6/lib/libaprutil-1.so.0.2.11
httpd 552 nobody mem REG 3,3 399356 86545
/web10/encap/expat-1.95.8/lib/libexpat.so.0.5.0
httpd 552 nobody mem REG 3,3 644280 120177
/web10/encap/httpd-2.2.6/lib/libapr-1.so.0.2.11
httpd 552 nobody mem REG 3,1 44393 149867
/lib64/tls/librt.so.1
httpd 552 nobody mem REG 3,1 49785 10590
/lib64/libcrypt.so.1
httpd 552 nobody mem REG 3,1 104333 149866
/lib64/tls/libpthread.so.0
httpd 552 nobody mem REG 3,1 15367 149856
/lib64/libdl.so.2
httpd 552 nobody mem REG 3,1 1446783 149864
/lib64/tls/libc.so.6
httpd 552 nobody mem REG 3,3 32293 120510
/web10/encap/httpd-2.2.6/modules/mod_authz_host.so
httpd 552 nobody mem REG 3,3 49742 120526
/web10/encap/httpd-2.2.6/modules/mod_mime.so
httpd 552 nobody mem REG 3,3 80479 120514
/web10/encap/httpd-2.2.6/modules/mod_log_config.so
httpd 552 nobody mem REG 3,3 41745 120525
/web10/encap/httpd-2.2.6/modules/mod_alias.so
httpd 552 nobody mem REG 3,3 29404 120540
/web10/encap/httpd-2.2.6/modules/mod_env.so
httpd 552 nobody mem REG 3,3 38369 120492
/web10/encap/httpd-2.2.6/modules/mod_setenvif.so
httpd 552 nobody mem REG 3,3 104277 120499
/web10/encap/httpd-2.2.6/modules/mod_negotiation.so
httpd 552 nobody mem REG 3,3 146518 120535
/web10/encap/httpd-2.2.6/modules/mod_rewrite.so
httpd 552 nobody mem REG 3,3 52083 120520
/web10/encap/httpd-2.2.6/modules/mod_status.so
httpd 552 nobody mem REG 3,3 52373 120503
/web10/encap/httpd-2.2.6/modules/mod_info.so
httpd 552 nobody mem REG 3,3 4984833 120515
/web10/encap/httpd-2.2.6/modules/mod_python.so
httpd 552 nobody mem REG 3,1 14784 10606
/lib64/libutil.so.1
httpd 552 nobody mem REG 3,3 38039 120494
/web10/encap/httpd-2.2.6/modules/mod_expires.so
httpd 552 nobody mem REG 3,1 217016 69589
/var/run/nscd/passwd
httpd 552 nobody mem REG 3,1 217016 69590
/var/run/nscd/group
httpd 552 nobody DEL REG 3,1 69591
/var/run/nscd/dbMxLNyI
httpd 552 nobody DEL REG 0,7 18446744073632515035
/dev/zero
httpd 552 nobody mem REG 3,1 46664 16459
/lib64/libgcc_s.so.1
httpd 552 nobody mem REG 3,3 49339 2814
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/time.so
httpd 552 nobody mem REG 3,3 70274 2838
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/strop.so
httpd 552 nobody mem REG 3,3 101469 2839
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/itertools.so
httpd 552 nobody mem REG 3,3 54133 2791
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/cStringIO.so
httpd 552 nobody mem REG 3,3 138480 2837
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/_socket.so
httpd 552 nobody mem REG 3,3 74923 2819
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/_ssl.so
httpd 552 nobody mem REG 3,1 245682 141938
/usr/lib64/libssl.so.0.9.7
httpd 552 nobody mem REG 3,1 1441278 59209
/usr/lib64/libcrypto.so.0.9.7
httpd 552 nobody mem REG 3,3 46577 2811
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/math.so
httpd 552 nobody mem REG 3,3 56810 2833
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/binascii.so
httpd 552 nobody mem REG 3,3 36773 2818
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/_random.so
httpd 552 nobody mem REG 3,3 37259 2821
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/fcntl.so
httpd 552 nobody mem REG 3,3 62744 2803
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/collections.so
httpd 552 nobody mem REG 3,3 21960 2807
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/syslog.so
httpd 552 nobody mem REG 3,3 82224 2802
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/struct.so
httpd 552 nobody mem REG 3,3 264023 2815
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/datetime.so
httpd 552 nobody mem REG 3,3 34866 2823
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/md5.so
httpd 552 nobody mem REG 3,3 34443 2827
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/sha.so
httpd 552 nobody mem REG 3,3 28020 2832
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/_bisect.so
httpd 552 nobody mem REG 3,3 21455 2795
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/_weakref.so
httpd 552 nobody mem REG 3,3 154790 135012
/web10/encap/MySQL-python-1.2.0/lib/python2.4/site-packages/_mysql.so
httpd 552 nobody mem REG 3,3 477383 133941
/web10/encap/mysql-4.1.15/lib/mysql/libmysqlclient_r.so.14.0.0
httpd 552 nobody mem REG 3,3 95574 133804
/web10/encap/mysql-4.1.15/lib/mysql/libz.so.0.0.0
httpd 552 nobody mem REG 3,1 101209 10594
/lib64/libnsl.so.1
httpd 552 nobody mem REG 3,3 119746 2831
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/array.so
httpd 552 nobody mem REG 3,1 53431 149860
/lib64/libnss_files.so.2
httpd 552 nobody mem REG 3,3 28411 9381
/web10/encap/Cheetah-0.9.15-nogeninfo/lib/python2.4/site-packages/Cheetah/_n
amemapper.so
httpd 552 nobody mem REG 3,3 76289 9344
/web10/encap/pycrypto-2.0.1/lib/python2.4/site-packages/Crypto/Cipher/DES3.s
o
httpd 552 nobody mem REG 3,3 940471 9223
/web10/encap/Imaging-1.1.5/lib/python2.4/site-packages/PIL/_imaging.so
httpd 552 nobody mem REG 3,1 137928 14016
/usr/lib64/libjpeg.so.62.0.0
httpd 552 nobody mem REG 3,1 77960 14723
/lib64/libz.so.1.2.2
httpd 552 nobody mem REG 3,3 64166 2843
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/operator.so
httpd 552 nobody mem REG 3,3 44969 9267
/web10/encap/Imaging-1.1.5/lib/python2.4/site-packages/PIL/_imagingft.so
httpd 552 nobody mem REG 3,1 588288 130060
/usr/lib64/libfreetype.so.6.3.7
httpd 552 nobody mem REG 3,3 56595 2813
/web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/zlib.so
httpd 552 nobody 0r FIFO 0,5 4292655466 pipe
httpd 552 nobody 1w FIFO 0,5 4292655467 pipe
httpd 552 nobody 2w REG 3,3 21074695 86518
/web10/web/logs/web10_error_log
> -----Original Message-----
> From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> Sent: Tuesday, January 22, 2008 8:36 PM
> To: Alec Matusis
> Cc: mod_python at modpython.org
> Subject: Re: [mod_python] remnant 'orphan' apache subprocesses
>
> On 23/01/2008, Alec Matusis <matusis at yahoo.com> wrote:
> > > Hmmm, there should have been 39 other stack traces, one for each
> > > worker thread based on your Apache configuration.
> >
> > Looks like it has only one thread:
>
> Which almost suggests it was actually a fork of an Apache child
> process as that is possibly only way the other threads could have been
> killed off. This would sort of sit with idea that some code had tried
> to further fork and daemonise the process. This is the only thing that
> seems to fit at the moment.
>
> I can't find though any evidence of fork() being used direct in PIL
> code at least, but then the popen2 module uses fork() in Popen3 and
> Popen4 classes. Again, PIL, although it uses os.popen() doesn't use
> popen2 module. Also, that doesn't involved setsid(). The only place in
> standard Python modules that appears to use setsid() is pty.fork() and
> PIL doesn't use that.
>
> All I can suggest is you might in your loaded application dump out
> sys.modules and see if anything has imported pty module. Otherwise
> troll through all the Python code you know is used and see if anything
> for whatever reason uses fork(), setsid(), popen2 module, or
> pty.fork(). More nasty would be a C extension module doing forks and
> trying to do Python stuff in forked process but not doing necessary
> Python cleanup after the fork. Would have expected to see a C
> extension module stack frame in traceback if that was the case though.
>
> Anyway, running out of ideas.
>
> Graham
>
> > Orphaned sub process:
> >
> > # ps -mTp 3812
> > PID SPID TTY TIME CMD
> > 3812 - ? 00:00:00 httpd
> > - 3812 - 00:00:00 -
> >
> > Normal sub-process:
> >
> > # ps -mTp 20205
> > PID SPID TTY TIME CMD
> > 20205 - ? 00:00:10 httpd
> > - 20205 - 00:00:00 -
> > - 20207 - 00:00:00 -
> > - 20208 - 00:00:00 -
> > - 20209 - 00:00:00 -
> > - 20210 - 00:00:00 -
> > - 20211 - 00:00:00 -
> > - 20212 - 00:00:00 -
> > - 20213 - 00:00:00 -
> > - 20214 - 00:00:00 -
> > - 20215 - 00:00:00 -
> > - 20216 - 00:00:00 -
> > - 20217 - 00:00:00 -
> > - 20218 - 00:00:00 -
> > - 20219 - 00:00:00 -
> > - 20220 - 00:00:00 -
> > - 20221 - 00:00:00 -
> > - 20222 - 00:00:00 -
> > - 20223 - 00:00:00 -
> > - 20224 - 00:00:00 -
> > - 20225 - 00:00:00 -
> > - 20226 - 00:00:00 -
> > - 20227 - 00:00:00 -
> > - 20228 - 00:00:00 -
> > - 20229 - 00:00:00 -
> > - 20230 - 00:00:00 -
> > - 20231 - 00:00:00 -
> > - 20232 - 00:00:00 -
> > - 20233 - 00:00:00 -
> > - 20234 - 00:00:00 -
> > - 20235 - 00:00:00 -
> > - 20236 - 00:00:00 -
> > - 20237 - 00:00:00 -
> > - 20238 - 00:00:00 -
> > - 20239 - 00:00:00 -
> > - 20240 - 00:00:00 -
> > - 20241 - 00:00:00 -
> > - 20242 - 00:00:00 -
> > - 20243 - 00:00:00 -
> > - 20244 - 00:00:00 -
> > - 20245 - 00:00:00 -
> > - 20246 - 00:00:00 -
> > - 20247 - 00:00:00 -
> >
> >
> > > -----Original Message-----
> > > From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> > > Sent: Tuesday, January 22, 2008 7:00 PM
> > > To: Alec Matusis
> > > Cc: mod_python at modpython.org
> > > Subject: Re: [mod_python] remnant 'orphan' apache subprocesses
> > >
> > > On 23/01/2008, Alec Matusis <matusis at yahoo.com> wrote:
> > > > > (gdb) thread apply all bt
> > > >
> > > > Thread 1 (Thread 1111562592 (LWP 3812)):
> > > > #0 0x00002aaaab2c94df in sem_wait () from
> /lib64/tls/libpthread.so.0
> > > > #1 0x00002aaaac204dcd in PyThread_acquire_lock (lock=0x5b7820,
> > > waitflag=1)
> > >
> > > Hmmm, there should have been 39 other stack traces, one for each
> > > worker thread based on your Apache configuration.
> > >
> > > Don't necessarily post them all, but you should look and see what
> is
> > > in frame #0 for all of them and see if there is anything odd such
> as
> > > one blocking in a call out to some external C library. If no such
> > > thing and they are all stuck on Python mutex locks, then must be
> > > internal deadlock.
> > >
> > > This still doesn't explain though how the process has been
> associated
> > > with PPID of 1, which would still only seem possible if something
> had
> > > tried to daemonise process.
> > >
> > > Graham
> > >
> > > Graham
> >
> >
|