Alec Matusis
matusis at yahoo.com
Thu Jan 24 00:59:02 EST 2008
> Which almost suggests it was actually a fork of an Apache child > process as that is possibly only way the other threads could have been > killed off. I think this is right. I caught this evidence: # ps -ef | grep http root 16197 1 0 Jan21 ? nobody 787 16197 9 18:06 ? nobody 1096 16197 9 18:09 ? nobody 1240 16197 8 18:12 ? nobody 1389 16197 8 18:13 ? nobody 1525 16197 9 18:14 ? nobody 1732 1525 0 18:16 ? .... Note pid 1732 with 0 CPU total being a child of a sub process (pid 1525) 20 minutes later, the parent sub process pid 1525 exited, and pid 1732 became an orphan: # ps -ef | grep http root 16197 1 0 Jan21 ? nobody 1732 1 0 18:16 ? nobody 9895 16197 6 20:07 ? ..... > I can't find though any evidence of fork() being used direct in PIL > code at least, but then the popen2 module uses fork() in Popen3 and > Popen4 classes. I found that we send email like this: try: handle = os.popen2('/usr/sbin/sendmail.postfix -f mailzero -t') handle[0].write(msg) handle[0].write("\n.\n") finally : if handle: handle[0].close() handle[1].close() Could this be the culprit? PS. I finally obtained lsof for an orphan: # lsof -p 552 COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME httpd 552 nobody cwd DIR 3,1 560 2 / httpd 552 nobody rtd DIR 3,1 560 2 / httpd 552 nobody txt REG 3,3 1459487 120154 /web10/encap/httpd-2.2.6/bin/httpd httpd 552 nobody mem REG 3,1 108173 10583 /lib64/ld-2.3.4.so httpd 552 nobody mem REG 3,1 208464 13742 /usr/lib/locale/en_US.utf8/LC_CTYPE httpd 552 nobody mem REG 3,1 21546 67725 /usr/lib64/gconv/gconv-modules.cache httpd 552 nobody mem REG 3,1 422191 149865 /lib64/tls/libm.so.6 httpd 552 nobody mem REG 3,3 359952 120180 /web10/encap/httpd-2.2.6/lib/libaprutil-1.so.0.2.11 httpd 552 nobody mem REG 3,3 399356 86545 /web10/encap/expat-1.95.8/lib/libexpat.so.0.5.0 httpd 552 nobody mem REG 3,3 644280 120177 /web10/encap/httpd-2.2.6/lib/libapr-1.so.0.2.11 httpd 552 nobody mem REG 3,1 44393 149867 /lib64/tls/librt.so.1 httpd 552 nobody mem REG 3,1 49785 10590 /lib64/libcrypt.so.1 httpd 552 nobody mem REG 3,1 104333 149866 /lib64/tls/libpthread.so.0 httpd 552 nobody mem REG 3,1 15367 149856 /lib64/libdl.so.2 httpd 552 nobody mem REG 3,1 1446783 149864 /lib64/tls/libc.so.6 httpd 552 nobody mem REG 3,3 32293 120510 /web10/encap/httpd-2.2.6/modules/mod_authz_host.so httpd 552 nobody mem REG 3,3 49742 120526 /web10/encap/httpd-2.2.6/modules/mod_mime.so httpd 552 nobody mem REG 3,3 80479 120514 /web10/encap/httpd-2.2.6/modules/mod_log_config.so httpd 552 nobody mem REG 3,3 41745 120525 /web10/encap/httpd-2.2.6/modules/mod_alias.so httpd 552 nobody mem REG 3,3 29404 120540 /web10/encap/httpd-2.2.6/modules/mod_env.so httpd 552 nobody mem REG 3,3 38369 120492 /web10/encap/httpd-2.2.6/modules/mod_setenvif.so httpd 552 nobody mem REG 3,3 104277 120499 /web10/encap/httpd-2.2.6/modules/mod_negotiation.so httpd 552 nobody mem REG 3,3 146518 120535 /web10/encap/httpd-2.2.6/modules/mod_rewrite.so httpd 552 nobody mem REG 3,3 52083 120520 /web10/encap/httpd-2.2.6/modules/mod_status.so httpd 552 nobody mem REG 3,3 52373 120503 /web10/encap/httpd-2.2.6/modules/mod_info.so httpd 552 nobody mem REG 3,3 4984833 120515 /web10/encap/httpd-2.2.6/modules/mod_python.so httpd 552 nobody mem REG 3,1 14784 10606 /lib64/libutil.so.1 httpd 552 nobody mem REG 3,3 38039 120494 /web10/encap/httpd-2.2.6/modules/mod_expires.so httpd 552 nobody mem REG 3,1 217016 69589 /var/run/nscd/passwd httpd 552 nobody mem REG 3,1 217016 69590 /var/run/nscd/group httpd 552 nobody DEL REG 3,1 69591 /var/run/nscd/dbMxLNyI httpd 552 nobody DEL REG 0,7 18446744073632515035 /dev/zero httpd 552 nobody mem REG 3,1 46664 16459 /lib64/libgcc_s.so.1 httpd 552 nobody mem REG 3,3 49339 2814 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/time.so httpd 552 nobody mem REG 3,3 70274 2838 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/strop.so httpd 552 nobody mem REG 3,3 101469 2839 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/itertools.so httpd 552 nobody mem REG 3,3 54133 2791 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/cStringIO.so httpd 552 nobody mem REG 3,3 138480 2837 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/_socket.so httpd 552 nobody mem REG 3,3 74923 2819 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/_ssl.so httpd 552 nobody mem REG 3,1 245682 141938 /usr/lib64/libssl.so.0.9.7 httpd 552 nobody mem REG 3,1 1441278 59209 /usr/lib64/libcrypto.so.0.9.7 httpd 552 nobody mem REG 3,3 46577 2811 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/math.so httpd 552 nobody mem REG 3,3 56810 2833 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/binascii.so httpd 552 nobody mem REG 3,3 36773 2818 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/_random.so httpd 552 nobody mem REG 3,3 37259 2821 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/fcntl.so httpd 552 nobody mem REG 3,3 62744 2803 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/collections.so httpd 552 nobody mem REG 3,3 21960 2807 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/syslog.so httpd 552 nobody mem REG 3,3 82224 2802 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/struct.so httpd 552 nobody mem REG 3,3 264023 2815 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/datetime.so httpd 552 nobody mem REG 3,3 34866 2823 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/md5.so httpd 552 nobody mem REG 3,3 34443 2827 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/sha.so httpd 552 nobody mem REG 3,3 28020 2832 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/_bisect.so httpd 552 nobody mem REG 3,3 21455 2795 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/_weakref.so httpd 552 nobody mem REG 3,3 154790 135012 /web10/encap/MySQL-python-1.2.0/lib/python2.4/site-packages/_mysql.so httpd 552 nobody mem REG 3,3 477383 133941 /web10/encap/mysql-4.1.15/lib/mysql/libmysqlclient_r.so.14.0.0 httpd 552 nobody mem REG 3,3 95574 133804 /web10/encap/mysql-4.1.15/lib/mysql/libz.so.0.0.0 httpd 552 nobody mem REG 3,1 101209 10594 /lib64/libnsl.so.1 httpd 552 nobody mem REG 3,3 119746 2831 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/array.so httpd 552 nobody mem REG 3,1 53431 149860 /lib64/libnss_files.so.2 httpd 552 nobody mem REG 3,3 28411 9381 /web10/encap/Cheetah-0.9.15-nogeninfo/lib/python2.4/site-packages/Cheetah/_n amemapper.so httpd 552 nobody mem REG 3,3 76289 9344 /web10/encap/pycrypto-2.0.1/lib/python2.4/site-packages/Crypto/Cipher/DES3.s o httpd 552 nobody mem REG 3,3 940471 9223 /web10/encap/Imaging-1.1.5/lib/python2.4/site-packages/PIL/_imaging.so httpd 552 nobody mem REG 3,1 137928 14016 /usr/lib64/libjpeg.so.62.0.0 httpd 552 nobody mem REG 3,1 77960 14723 /lib64/libz.so.1.2.2 httpd 552 nobody mem REG 3,3 64166 2843 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/operator.so httpd 552 nobody mem REG 3,3 44969 9267 /web10/encap/Imaging-1.1.5/lib/python2.4/site-packages/PIL/_imagingft.so httpd 552 nobody mem REG 3,1 588288 130060 /usr/lib64/libfreetype.so.6.3.7 httpd 552 nobody mem REG 3,3 56595 2813 /web10/encap/Python-2.4.1/lib/python2.4/lib-dynload/zlib.so httpd 552 nobody 0r FIFO 0,5 4292655466 pipe httpd 552 nobody 1w FIFO 0,5 4292655467 pipe httpd 552 nobody 2w REG 3,3 21074695 86518 /web10/web/logs/web10_error_log > -----Original Message----- > From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com] > Sent: Tuesday, January 22, 2008 8:36 PM > To: Alec Matusis > Cc: mod_python at modpython.org > Subject: Re: [mod_python] remnant 'orphan' apache subprocesses > > On 23/01/2008, Alec Matusis <matusis at yahoo.com> wrote: > > > Hmmm, there should have been 39 other stack traces, one for each > > > worker thread based on your Apache configuration. > > > > Looks like it has only one thread: > > Which almost suggests it was actually a fork of an Apache child > process as that is possibly only way the other threads could have been > killed off. This would sort of sit with idea that some code had tried > to further fork and daemonise the process. This is the only thing that > seems to fit at the moment. > > I can't find though any evidence of fork() being used direct in PIL > code at least, but then the popen2 module uses fork() in Popen3 and > Popen4 classes. Again, PIL, although it uses os.popen() doesn't use > popen2 module. Also, that doesn't involved setsid(). The only place in > standard Python modules that appears to use setsid() is pty.fork() and > PIL doesn't use that. > > All I can suggest is you might in your loaded application dump out > sys.modules and see if anything has imported pty module. Otherwise > troll through all the Python code you know is used and see if anything > for whatever reason uses fork(), setsid(), popen2 module, or > pty.fork(). More nasty would be a C extension module doing forks and > trying to do Python stuff in forked process but not doing necessary > Python cleanup after the fork. Would have expected to see a C > extension module stack frame in traceback if that was the case though. > > Anyway, running out of ideas. > > Graham > > > Orphaned sub process: > > > > # ps -mTp 3812 > > PID SPID TTY TIME CMD > > 3812 - ? 00:00:00 httpd > > - 3812 - 00:00:00 - > > > > Normal sub-process: > > > > # ps -mTp 20205 > > PID SPID TTY TIME CMD > > 20205 - ? 00:00:10 httpd > > - 20205 - 00:00:00 - > > - 20207 - 00:00:00 - > > - 20208 - 00:00:00 - > > - 20209 - 00:00:00 - > > - 20210 - 00:00:00 - > > - 20211 - 00:00:00 - > > - 20212 - 00:00:00 - > > - 20213 - 00:00:00 - > > - 20214 - 00:00:00 - > > - 20215 - 00:00:00 - > > - 20216 - 00:00:00 - > > - 20217 - 00:00:00 - > > - 20218 - 00:00:00 - > > - 20219 - 00:00:00 - > > - 20220 - 00:00:00 - > > - 20221 - 00:00:00 - > > - 20222 - 00:00:00 - > > - 20223 - 00:00:00 - > > - 20224 - 00:00:00 - > > - 20225 - 00:00:00 - > > - 20226 - 00:00:00 - > > - 20227 - 00:00:00 - > > - 20228 - 00:00:00 - > > - 20229 - 00:00:00 - > > - 20230 - 00:00:00 - > > - 20231 - 00:00:00 - > > - 20232 - 00:00:00 - > > - 20233 - 00:00:00 - > > - 20234 - 00:00:00 - > > - 20235 - 00:00:00 - > > - 20236 - 00:00:00 - > > - 20237 - 00:00:00 - > > - 20238 - 00:00:00 - > > - 20239 - 00:00:00 - > > - 20240 - 00:00:00 - > > - 20241 - 00:00:00 - > > - 20242 - 00:00:00 - > > - 20243 - 00:00:00 - > > - 20244 - 00:00:00 - > > - 20245 - 00:00:00 - > > - 20246 - 00:00:00 - > > - 20247 - 00:00:00 - > > > > > > > -----Original Message----- > > > From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com] > > > Sent: Tuesday, January 22, 2008 7:00 PM > > > To: Alec Matusis > > > Cc: mod_python at modpython.org > > > Subject: Re: [mod_python] remnant 'orphan' apache subprocesses > > > > > > On 23/01/2008, Alec Matusis <matusis at yahoo.com> wrote: > > > > > (gdb) thread apply all bt > > > > > > > > Thread 1 (Thread 1111562592 (LWP 3812)): > > > > #0 0x00002aaaab2c94df in sem_wait () from > /lib64/tls/libpthread.so.0 > > > > #1 0x00002aaaac204dcd in PyThread_acquire_lock (lock=0x5b7820, > > > waitflag=1) > > > > > > Hmmm, there should have been 39 other stack traces, one for each > > > worker thread based on your Apache configuration. > > > > > > Don't necessarily post them all, but you should look and see what > is > > > in frame #0 for all of them and see if there is anything odd such > as > > > one blocking in a call out to some external C library. If no such > > > thing and they are all stuck on Python mutex locks, then must be > > > internal deadlock. > > > > > > This still doesn't explain though how the process has been > associated > > > with PPID of 1, which would still only seem possible if something > had > > > tried to daemonise process. > > > > > > Graham > > > > > > Graham > > > >
|