Alec Matusis
matusis at yahoo.com
Wed Nov 7 16:23:28 EST 2007
I got it to work, thanks!

The performance gain is tremendous: the 4GB machine now has about 3GB of available RAM (1.5GB free and 1.5GB in cache/buffers). With prefork, I managed to hit the swap (with ServerLimit 430).
Load average actually dropped from 2.5 with prefork to 1.2 with worker. DB machine load average did not change.

I am glad we are using mod_python, because I believe mod_php, for example, cannot be used with worker.

Our problem was that we used sys.modules to try to reuse imported modules, which was a bad idea with worker MPM.

> -----Original Message-----
> From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> Sent: Tuesday, November 06, 2007 9:49 PM
> To: Alec Matusis
> Cc: mod_python at modpython.org
> Subject: Re: [mod_python] mod_python or apache scalability?
>
> On 07/11/2007, Alec Matusis <matusis at yahoo.com> wrote:
> > After I wrote this, I realized that the problem was in
> >
> > from sys import modules
> > def _import( module_path ):
> >     runme = modules[ module_path ]
> >
> > sys.modules was trying to use already loaded modules.
> > I got rid of it, and forced it to actually load modules every time, which
> > got rid of errors.
> > I wonder if forcing to load modules comes with a CPU penalty compared to
> > prefork however.
>
> Can't tell. You still haven't showed how the modules are actually
> being imported.
>
> Why aren't you just using apache.import_module() from mod_python? See:
>
> http://www.modpython.org/live/current/doc-html/pyapi-apmeth.html
>
> Graham
>
> > > -----Original Message-----
> > > From: mod_python-bounces at modpython.org [mailto:mod_python-bounces at modpython.org] On Behalf Of Alec Matusis
> > > Sent: Tuesday, November 06, 2007 6:43 PM
> > > To: 'Graham Dumpleton'
> > > Cc: mod_python at modpython.org
> > > Subject: RE: [mod_python] mod_python or apache scalability?
> > >
> > > > So, what is 'modules' and '_import' and how are they used?
> > >
> > > 'modules' is 'from sys import modules', so it's sys.modules...
> > >
> > > '_import' is used in the publisher like so:
> > >
> > > publisher.py:
> > >
> > > from common._import import _import
> > >
> > > def handler(req):
> > >     ...
> > >     dyn_path = req.uri[1:] #to get rid of '/' prefix
> > >     module_path = string.replace(dyn_path,'/','.')
> > >     runme = _import( module_path )
> > >     ....
> > >
> > >
> > > common/_import.py:
> > >
> > > from sys import modules
> > > def _import( module_path ):
> > >     runme = modules[ module_path ]
> > >     #get the last part
> > >     class_name = module_path.split('.')[-1]
> > >     try:
> > >         runme = getattr(runme, class_name )  <--- INTERMITTENT ERROR here
> > >     except AttributeError:
> > >         raise Exception,'Module does not contain class '+class_name+', runme='+repr(runme)+' module_path='+repr(module_path)
> > >
> > > > How do you
> > > > ensure that two concurrently executing request handlers trying to
> > > > import the same module don't interfere with each other?
> > >
> > > We do not insure that- the code was written for prefork MPM.
> > > What is the best way to insure that?
> > >
> > >
> > > > -----Original Message-----
> > > > From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> > > > Sent: Tuesday, November 06, 2007 3:53 AM
> > > > To: Alec Matusis
> > > > Cc: Martijn Moeling; mod_python at modpython.org
> > > > Subject: Re: [mod_python] mod_python or apache scalability?
> > > >
> > > > On 06/11/2007, Alec Matusis <matusis at yahoo.com> wrote:
> > > > > I found that this that
> > > > >
> > > > > IOError: image file is truncated (14432 bytes not processed)
> > > > >
> > > > > has to do with using input filter
> > > > >
> > > > > PythonInputFilter flashfilter FLASHFILTER
> > > > > SetInputFilter FLASHFILTER
> > > > >
> > > > > It somehow behaves differently in apache 2.2.6/mod_python 3.3.1 versus
> > > > > 2.0.54/ 3.1.4.
> > > > > For now, I removed this input filter (we can get away without
> > > > > it), and it seems to work.
> > > > >
> > > > > However, we have a bigger problem: we now have intermittent import errors
> > > > > (one error per about 50-80 requests).
> > > > >
> > > > > This code executed by publisher:
> > > > >
> > > > > import modules
> > > > > def _import( module_path ):
> > > > >     runme = modules[ module_path ]
> > > > >     #get the last part
> > > > >     class_name = module_path.split('.')[-1]
> > > > >     try:
> > > > >         runme = getattr(runme, class_name )  <--- INTERMITTENT ERROR here
> > > > >     except AttributeError:
> > > > >         raise Exception,'Module does not contain class '+class_name+', runme='+repr(runme)+' module_path='+repr(module_path)
> > > > >
> > > > > produces this error:
> > > > >
> > > > > Exception: Module does not contain class getprofile, runme=<module
> > > > > 'getprofile' from '/path/scripts/getprofile.py'> module_path='getprofile'
> > > > >
> > > > > This used to always work with prefork. Is there anything not thread safe
> > > > > about importing modules?
> > > >
> > > > The underlying Python and mod_python import mechanisms themselves are
> > > > never usually the problem. It is thread unsafe practices regarding
> > > > initialisation in the code around the imports.
> > > >
> > > > So, if you have built your own form of module importer or higher level
> > > > layer for one and you haven't properly thread protected it, it is most
> > > > likely the cause.
> > > >
> > > > So, what is 'modules' and '_import' and how are they used? How do you
> > > > ensure that two concurrently executing request handlers trying to
> > > > import the same module don't interfere with each other?
> > > >
> > > > Graham
> > > >
> > > > > > -----Original Message-----
> > > > > > From: mod_python-bounces at modpython.org [mailto:mod_python-bounces at modpython.org] On Behalf Of Graham Dumpleton
> > > > > > Sent: Monday, November 05, 2007 1:29 PM
> > > > > > To: Martijn Moeling
> > > > > > Cc: mod_python at modpython.org; Alec Matusis
> > > > > > Subject: Re: [mod_python] mod_python or apache scalability?
> > > > > >
> > > > > > The LimitRequestBody directive support in mod_python is broken and
> > > > > > most likely would have resulted in a 500 error occurring during
> > > > > > mod_python.publisher processing of the form even before the user code
> > > > > > was run.
> > > > > >
> > > > > > My guess at the reason is that they have enabled in the new Apache
> > > > > > server compression on request content. This may not work with
> > > > > > mod_python because how it reads request content is broken, with it
> > > > > > only reading up until original Content-Length in certain cases. If
> > > > > > there is a mutating input filter such as content decompression, you
> > > > > > will get truncated input data.
> > > > > >
> > > > > > https://issues.apache.org/jira/browse/MODPYTHON-240
> > > > > > https://issues.apache.org/jira/browse/MODPYTHON-212
> > > > > >
> > > > > > So, OP should ensure that directives to accept compressed request
> > > > > > content are disabled.
> > > > > >
> > > > > > Graham
> > > > > >
> > > > > > On 05/11/2007, Martijn Moeling <martijn at xs4us.nu> wrote:
> > > > > > >
> > > > > > > Hmm it looks to me that you have an upload limit in your apache config.
> > > > > > >
> > > > > > > It might be the LimitRequestBody directive.
> > > > > > >
> > > > > > > Another thing might be that your processing is faster than the data comes
> > > > > > > in, but I doubt that
> > > > > > > Take a close look at your apache conf, and your apache error log (right after
> > > > > > > an upload or with tail -f /var/log/httpd/error_log (or where ever your
> > > > > > > logdir is)).
> > > > > > >
> > > > > > > Martijn
> > > > > > >
> > > > > > > ________________________________
> > > > > > > From: Alec Matusis [mailto:matusis at matusis.com]
> > > > > > > Sent: Sun 04.11.2007 21:46
> > > > > > > To: Martijn Moeling
> > > > > > > CC: mod_python at modpython.org
> > > > > > > Subject: RE: [mod_python] mod_python or apache scalability?
> > > > > > >
> > > > > > > > I had similair problems however, they turned out to be MySQLdb
> > > > > > > > related and not in a way you suspect
> > > > > > >
> > > > > > > Thanks! I will try separating MySQLdb connections. We do pass the connection
> > > > > > > around currently.
> > > > > > >
> > > > > > > However, how would you explain this PIL error:
> > > > > > >
> > > > > > >   File "/path/publisher/publisher.py", line 78, in handler
> > > > > > >     flow_instance.dispatch()
> > > > > > >   File "/path/scripts/updateprofile.py", line 92, in dispatch
> > > > > > >     im.save(full, self.im_format)
> > > > > > >   File "/usr/local/encap/Python-2.4.1/lib/python2.4/site-packages/PIL/Image.py", line 1272, in save
> > > > > > >
> > > > > > >   File "/usr/local/encap/Python-2.4.1/lib/python2.4/site-packages/PIL/ImageFile.py", line 192, in load
> > > > > > > IOError: image file is truncated (14432 bytes not processed)
> > > > > > >
> > > > > > > This only happened for large images, small ones uploaded fine.
> > > > > > > The code is
> > > > > > >
> > > > > > > def dispatch(self):
> > > > > > >     pix = self.form['Filedata']
> > > > > > >     try:
> > > > > > >         im = Image.open(pix.file)
> > > > > > >     except IOError:
> > > > > > >         self.req.status = apache.HTTP_NOT_ACCEPTABLE
> > > > > > >     im.save(full, self.im_format)
> > > > > > >
> > > > > > > The error is in the last line. It did not occur with prefork.
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: mod_python-bounces at modpython.org [mailto:mod_python-bounces at modpython.org] On Behalf Of Martijn Moeling
> > > > > > > > Sent: Sunday, November 04, 2007 5:16 AM
> > > > > > > > To: Graham Dumpleton
> > > > > > > > Cc: mod_python at modpython.org; Alec Matusis
> > > > > > > > Subject: Re: [mod_python] mod_python or apache scalability?
> > > > > > > >
> > > > > > > > hi,
> > > > > > > >
> > > > > > > > I use Pil without any problems, a I do with MySQLdb.
> > > > > > > >
> > > > > > > > I had similair problems however, they turned out to be MySQLdb
> > > > > > > > related and not in a way you suspect
> > > > > > > >
> > > > > > > > Never ever use a global database connection variable. Create the
> > > > > > > > database connection within your handler,
> > > > > > > > end register a cleanup for closing it. I fixed a lot of trouble by
> > > > > > > > altering my system by making all "global" stuff a method or property
> > > > > > > > of the req object.
> > > > > > > >
> > > > > > > > The cleanup procedure cleans all objects which do anything to 3rd
> > > > > > > > party external (net connected) stuff (like imap, MySQL)
> > > > > > > >
> > > > > > > > think of something like this:
> > > > > > > >
> > > > > > > > def handler_cleanup(req):
> > > > > > > >     req.db.cursor.close()
> > > > > > > >     req.db.close()
> > > > > > > >
> > > > > > > > Def handler (req):
> > > > > > > >     req.db=MySQLdb.connection(...........)
> > > > > > > >     req.cursor=req.db.cursor(...)
> > > > > > > >
> > > > > > > >     ....
> > > > > > > >     ....
> > > > > > > >     ....
> > > > > > > >     req.register_cleanup(handler_cleanup,req)
> > > > > > > >     return apache.OK
> > > > > > > >
> > > > > > > > All my application specific code are classes, and instantated as a
> > > > > > > > req.something and req is passed to all functions, which is neat since
> > > > > > > > it is passed by reference meaning stacks stay small to
> > > > > > > >
> > > > > > > > If you use MySQLdb with mod_python use the above method, my
> > > > > > > > production server is rock stable and handled over 1mln requests since
> > > > > > > > last december without any reboots or problems.
> > > > > > > >
> > > > > > > > Martijn
> > > > > > > >
> > > > > > > > On Nov 4, 2007, at 11:48 AM, Graham Dumpleton wrote:
> > > > > > > >
> > > > > > > > > Hmmm, I do remember vaguely hearing questions about PIL thread safety
> > > > > > > > > before, so it might be an issue. :-(
> > > > > > > > >
> > > > > > > > > Graham
> > > > > > > > >
> > > > > > > > > On 04/11/2007, Alec Matusis <matusis at matusis.com> wrote:
> > > > > > > > >>> BTW, are you still using mod_python 3.1.4?
> > > > > > > > >>
> > > > > > > > >> No, we went to apache 2.2.6/ mod_python 3.3.1 combination.
> > > > > > > > >>
> > > > > > > > >> I will get MySQLdb to work thread-safely, but with PIL I am not so
> > > > > > > > >> optimistic...
> > > > > > > > >>
> > > > > > > > >>> -----Original Message-----
> > > > > > > > >>> From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> > > > > > > > >>> Sent: Sunday, November 04, 2007 2:33 AM
> > > > > > > > >>> To: Alec Matusis
> > > > > > > > >>> Cc: mod_python at modpython.org
> > > > > > > > >>> Subject: Re: [mod_python] mod_python or apache scalability?
> > > > > > > > >>>
> > > > > > > > >>> BTW, are you still using mod_python 3.1.4? Older mod_python versions
> > > > > > > > >>> have various bugs and you would be much better upgrading to 3.3.1 if
> > > > > > > > >>> you haven't already.
> > > > > > > > >>>
> > > > > > > > >>> Graham
> > > > > > > > >>>
> > > > > > > > >>> On 04/11/2007, Alec Matusis <matusis at matusis.com> wrote:
> > > > > > > > >>>>> Which indicates that some third party package you are using is not
> > > > > > > > >>>>> thread safe. This can occur if you are also using PHP as various of
> > > > > > > > >>>>> its third party packages are not thread safe.
> > > > > > > > >>>>
> > > > > > > > >>>> We are not using PHP, only python with mod_python.
> > > > > > > > >>>> The only third party packages we use are MySQLdb adapter and PIL image
> > > > > > > > >>>> library.
> > > > > > > > >>>>
> > > > > > > > >>>> So I guess both PIL and MySQLdb have these problems.
> > > > > > > > >>>>
> > > > > > > > >>>>
> > > > > > > > >>>>> -----Original Message-----
> > > > > > > > >>>>> From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> > > > > > > > >>>>> Sent: Sunday, November 04, 2007 2:15 AM
> > > > > > > > >>>>> To: Alec Matusis
> > > > > > > > >>>>> Cc: mod_python at modpython.org
> > > > > > > > >>>>> Subject: Re: [mod_python] mod_python or apache scalability?
> > > > > > > > >>>>>
> > > > > > > > >>>>> On 04/11/2007, Alec Matusis <matusis at matusis.com> wrote:
> > > > > > > > >>>>>>> FWIW, I personally would try and move from prefork to worker MPM as
> > > > > > > > >>>>>>> the number of Apache child processes you are running with is to my
> > > > > > > > >>>>>>> mind excessive. Using worker would certainly drop memory usage for a
> > > > > > > > >>>>>>> start as you wouldn't need as many child processes to be started.
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> I am just following up on this, since we tried worker MPM this weekend.
> > > > > > > > >>>>>> On our dev/stage it worked perfectly.
> > > > > > > > >>>>>> On live, worker MPM freed up about 2GB of memory compared to prefork.
> > > > > > > > >>>>>> However, on live, it turned out to be unstable.
> > > > > > > > >>>>>> This is what we say in the main error log:
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> [Sun Nov 04 01:37:45 2007] [notice] child pid 20347 exit signal Segmentation fault (11)
> > > > > > > > >>>>>> [Sun Nov 04 01:37:45 2007] [notice] child pid 20460 exit signal Aborted (6)
> > > > > > > > >>>>>> [Sun Nov 04 01:37:45 2007] [notice] child pid 20515 exit signal Segmentation fault (11)
> > > > > > > > >>>>>> *** glibc detected *** double free or corruption (!prev): 0x0000000000762c50 ***
> > > > > > > > >>>>>> *** glibc detected *** double free or corruption (!prev): 0x000000000075d100 ***
> > > > > > > > >>>>>> [Sun Nov 04 01:37:46 2007] [notice] child pid 20152 exit signal Segmentation fault (11)
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> In the application log, we saw two types of errors:
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> MySQLError: Connection to database failed
> > > > > > > > >>>>>> OperationalError: (2006, 'MySQL server has gone away')
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> and
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> IOError: image file is truncated (44 bytes not processed)
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> The first type has to do with MySQLdb module, but the second one occurred
> > > > > > > > >>>>>> when large images were uploaded.
> > > > > > > > >>>>>
> > > > > > > > >>>>> Which indicates that some third party package you are using is not
> > > > > > > > >>>>> thread safe.
> > > > > > > > >>>>> This can occur if you are also using PHP as various of
> > > > > > > > >>>>> its third party packages are not thread safe. Also ensure that you are
> > > > > > > > >>>>> using the latest available Python database adapters and that they are
> > > > > > > > >>>>> compiled against thread safe reentrant libraries.
> > > > > > > > >>>>>
> > > > > > > > >>>>> Graham
> > > > > > > > >>>>>
> > > > > > > > >>>>>> We had to revert to prefork as a result of this.
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> On another note, I managed to empirically find the maximum ServerLimit for
> > > > > > > > >>>>>> prefork, before the machine dies from swapping.
> > > > > > > > >>>>>> It is 380 with 4GB RAM.
> > > > > > > > >>>>>>
> > > > > > > > >>>>>>> -----Original Message-----
> > > > > > > > >>>>>>> From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> > > > > > > > >>>>>>> Sent: Monday, October 01, 2007 6:47 PM
> > > > > > > > >>>>>>> To: Alec Matusis
> > > > > > > > >>>>>>> Cc: mod_python at modpython.org
> > > > > > > > >>>>>>> Subject: Re: [mod_python] mod_python or apache scalability?
> > > > > > > > >>>>>>>
> > > > > > > > >>>>>>> On 01/10/2007, Alec Matusis <matusis at matusis.com> wrote:
> > > > > > > > >>>>>>>> in the apache error log. We also got
> > > > > > > > >>>>>>>>
> > > > > > > > >>>>>>>> kernel: possible SYN flooding on port 80. Sending cookies.
> > > > > > > > >>>>>>>>
> > > > > > > > >>>>>>>> in /var/log/messages system log.
> > > > > > > > >>>>>>>
> > > > > > > > >>>>>>> Have you determined for certain that you aren't the target of an
> > > > > > > > >>>>>>> external SYN Flood DOS attack?
> > > > > > > > >>>>>>>
> > > > > > > > >>>>>>> Do a Google search for 'kernel: possible SYN flooding on port 80.
> > > > > > > > >>>>>>> Sending cookies' and you will find lots of stuff to read. Your running
> > > > > > > > >>>>>>> out of or having a large number of socket connections may be
> > > > > > > > >>>>>>> symptomatic of a large number of half open connections being created
> > > > > > > > >>>>>>> and then being left in TIME_WAIT. Thus perhaps do some better analysis
> > > > > > > > >>>>>>> of socket connection states using netstat. If not a SYN Flood, then
> > > > > > > > >>>>>>> possibly follow some of the other suggestions in the pages you will
> > > > > > > > >>>>>>> find when you do the search.
> > > > > > > > >>>>>>>
> > > > > > > > >>>>>>> FWIW, I personally would try and move from prefork to worker MPM as
> > > > > > > > >>>>>>> the number of Apache child processes you are running with is to my
> > > > > > > > >>>>>>> mind excessive. Using worker would certainly drop memory usage for a
> > > > > > > > >>>>>>> start as you wouldn't need as many child processes to be started. I
> > > > > > > > >>>>>>> wouldn't be concerned about running out of threads as when running
> > > > > > > > >>>>>>> worker I wouldn't suggest more than 25 threads per process as a
> > > > > > > > >>>>>>> starting point anyway.
> > > > > > > > >>>>>>> If your mod_python application was creating
> > > > > > > > >>>>>>> lots of threads, you are likely to hit the thread limit with prefork
> > > > > > > > >>>>>>> and not just worker so which MPM is used shouldn't be an issue in that
> > > > > > > > >>>>>>> case.
> > > > > > > > >>>>>>>
> > > > > > > > >>>>>>> BTW, what operating system are you using?
> > > > > > > > >>>>>>>
> > > > > > > > >>>>>>> Graham
> > > > > > > > >
> > > > > > > > > _______________________________________________
> > > > > > > > > Mod_python mailing list
> > > > > > > > > Mod_python at modpython.org
> > > > > > > > > http://mailman.modpython.org/mailman/listinfo/mod_python
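
The sys.modules problem described at the top of this thread comes down to reading a module object that another thread may still be initialising. Under mod_python itself, apache.import_module() (suggested by Graham in the thread) is the documented route. As a rough sketch only, written for modern Python 3 rather than the Python 2.4 used in the thread, a hand-rolled loader could go through the real import machinery and guard its own cache with a lock. The names load_class, _class_cache and _cache_lock are hypothetical and are not part of mod_python.

```python
import importlib
import threading

# Hypothetical helper, not part of mod_python.
# The cache is shared by all request threads, so every read-modify-write
# on it happens while holding the lock.
_class_cache = {}
_cache_lock = threading.Lock()


def load_class(module_path):
    """Import module_path and return the attribute named after its last part.

    For example, "scripts.getprofile" would return scripts.getprofile.getprofile.
    """
    class_name = module_path.rsplit(".", 1)[-1]
    with _cache_lock:
        if module_path not in _class_cache:
            # import_module goes through the normal import system instead of
            # peeking into sys.modules, so the module body has finished
            # executing before we look attributes up on it.
            module = importlib.import_module(module_path)
            try:
                _class_cache[module_path] = getattr(module, class_name)
            except AttributeError:
                raise ImportError(
                    "Module %r does not contain %r" % (module_path, class_name)
                ) from None
        return _class_cache[module_path]
```

A single global lock serialises first-time imports, which is usually acceptable because each module is only imported once per process; later calls just read the cache.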
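
Martijn's advice in the thread (open the database connection inside the handler, keep it on req, and register a cleanup that closes it) can be sketched as follows. This is an illustration under assumptions: sqlite3 stands in for MySQLdb so the snippet runs anywhere, FakeRequest is a hypothetical stand-in for mod_python's request object that only mimics register_cleanup, and the return value 0 stands in for apache.OK. The cleanup is registered before any query runs, so it is already in place if a later query raises.

```python
import sqlite3


class FakeRequest:
    """Hypothetical stand-in for mod_python's request object (illustration only)."""

    def __init__(self):
        self._cleanups = []

    def register_cleanup(self, func, data=None):
        # Remember the callback; the real server would run it after the request.
        self._cleanups.append((func, data))

    def run_cleanups(self):
        # Simulates the end of the request.
        for func, data in self._cleanups:
            func(data)


def handler_cleanup(req):
    # Close the cursor stored on req, then the connection itself.
    req.cursor.close()
    req.db.close()


def handler(req):
    # One connection per request: nothing is shared between threads.
    req.db = sqlite3.connect(":memory:")  # assumption: sqlite3 replaces MySQLdb
    req.cursor = req.db.cursor()
    req.register_cleanup(handler_cleanup, req)

    req.cursor.execute("SELECT 1 + 1")
    req.result = req.cursor.fetchone()[0]
    return 0  # stands in for apache.OK
```

Keeping the connection and cursor as attributes of the request object means no module-level global is touched, which is the property that matters under a threaded MPM.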