[mod_python] mod_python or apache scalability?

Alec Matusis matusis at yahoo.com
Tue Nov 6 22:40:24 EST 2007


After I wrote this, I realized that the problem was in 

from sys import modules
def _import( module_path ):
     runme = modules[ module_path ] 

The lookup in sys.modules was reusing already loaded modules.
I got rid of it and forced the code to actually load the module every time,
which got rid of the errors.
I wonder, however, whether forcing the modules to load comes with a CPU
penalty compared to prefork.

> -----Original Message-----
> From: mod_python-bounces at modpython.org [mailto:mod_python-bounces at modpython.org] On Behalf Of Alec Matusis
> Sent: Tuesday, November 06, 2007 6:43 PM
> To: 'Graham Dumpleton'
> Cc: mod_python at modpython.org
> Subject: RE: [mod_python] mod_python or apache scalability?
> 
> > So, what is 'modules' and '_import' and how are they used?
> 
> 'modules' is 'from sys import modules', so it's sys.modules...
> 
> '_import' is used in the publisher like so:
> 
> publisher.py:
> 
> from common._import import _import
> 
> def handler(req):
>     ...
>     dyn_path = req.uri[1:] #to get rid of '/' prefix
>     module_path = string.replace(dyn_path,'/','.')
>     runme = _import( module_path )
>     ....
> 
> 
> common/_import.py:
> 
> from sys import modules
> def _import( module_path ):
>     runme = modules[ module_path ]
>     #get the last part
>     class_name = module_path.split('.')[-1]
>     try:
>         runme = getattr(runme, class_name ) <--- INTERMITTENT ERROR here
>     except AttributeError:
>         raise Exception,'Module does not contain class '+class_name+', runme='+repr(runme)+' module_path='+repr(module_path)
> 
> > How do you
> > ensure that two concurrently executing request handlers trying to
> > import the same module don't interfere with each other?
> 
> We do not ensure that; the code was written for prefork MPM.
> What is the best way to ensure that?
> 
> 
> > -----Original Message-----
> > From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> > Sent: Tuesday, November 06, 2007 3:53 AM
> > To: Alec Matusis
> > Cc: Martijn Moeling; mod_python at modpython.org
> > Subject: Re: [mod_python] mod_python or apache scalability?
> >
> > On 06/11/2007, Alec Matusis <matusis at yahoo.com> wrote:
> > > I found that this
> > >
> > > IOError: image file is truncated (14432 bytes not processed)
> > >
> > > has to do with using input filter
> > >
> > > PythonInputFilter flashfilter FLASHFILTER
> > > SetInputFilter FLASHFILTER
> > >
> > > It somehow behaves differently in apache 2.2.6/mod_python 3.3.1 versus
> > > 2.0.54/ 3.1.4. For now, I removed this input filter (we can get away without
> > > it), and it seems to work.
> > >
> > > However, we have a bigger problem: we now have intermittent import errors
> > > (one error per about 50-80 requests).
> > >
> > > This code executed by publisher:
> > >
> > > import modules
> > > def _import( module_path ):
> > >     runme = modules[ module_path ]
> > >     #get the last part
> > >     class_name = module_path.split('.')[-1]
> > >     try:
> > >         runme = getattr(runme, class_name ) <--- INTERMITTENT ERROR here
> > >     except AttributeError:
> > >         raise Exception,'Module does not contain class '+class_name+', runme='+repr(runme)+' module_path='+repr(module_path)
> > >
> > > produces this error:
> > >
> > > Exception: Module does not contain class getprofile, runme=<module 'getprofile' from '/path/scripts/getprofile.py'> module_path='getprofile'
> > >
> > > This used to always work with prefork. Is there anything not thread safe
> > > about importing modules?
> >
> > The underlying Python and mod_python import mechanisms themselves are
> > never usually the problem. It is thread unsafe practices regarding
> > initialisation in the code around the imports.
> >
> > So, if you have built your own form of module importer or higher level
> > layer for one and you haven't properly thread protected it, it is most
> > likely the cause.
> >
> > So, what is 'modules' and '_import' and how are they used? How do you
> > ensure that two concurrently executing request handlers trying to
> > import the same module don't interfere with each other?
> >
> > Graham
> >
> > > > -----Original Message-----
> > > > > From: mod_python-bounces at modpython.org [mailto:mod_python-bounces at modpython.org] On Behalf Of Graham Dumpleton
> > > > Sent: Monday, November 05, 2007 1:29 PM
> > > > To: Martijn Moeling
> > > > Cc: mod_python at modpython.org; Alec Matusis
> > > > Subject: Re: [mod_python] mod_python or apache scalability?
> > > >
> > > > > The LimitRequestBody directive support in mod_python is broken and
> > > > > most likely would have resulted in a 500 error occurring during
> > > > > mod_python.publisher processing of the form even before the user code
> > > > > was run.
> > > > >
> > > > > My guess at the reason is that they have enabled in the new Apache
> > > > > server compression on request content. This may not work with
> > > > > mod_python because how it reads request content is broken, with it
> > > > > only reading up until original Content-Length in certain cases. If
> > > > > there is a mutating input filter such as content decompression, you
> > > > > will get truncated input data.
> > > >
> > > > https://issues.apache.org/jira/browse/MODPYTHON-240
> > > > https://issues.apache.org/jira/browse/MODPYTHON-212
> > > >
> > > > So, OP should ensure that directives to accept compressed request
> > > > content are disabled.
> > > >
> > > > Graham
> > > >
> > > > On 05/11/2007, Martijn Moeling <martijn at xs4us.nu> wrote:
> > > > >
> > > > >
> > > > >
> > > > > > Hmm it looks to me that you have an upload limit in your apache config.
> > > > > >
> > > > > > It might be the LimitRequestBody directive.
> > > > > >
> > > > > > Another thing might be that your processing is faster than the data comes
> > > > > > in, but I doubt that.
> > > > > > Take a close look at your apache conf, and your apache error log (right after
> > > > > > an upload or with tail -f /var/log/httpd/error_log (or wherever your
> > > > > > logdir is)).
> > > > >
> > > > > Martijn
> > > > >
> > > > >
> > > > >  ________________________________
> > > > > >  From: Alec Matusis [mailto:matusis at matusis.com]
> > > > > > Sent: Sun 04.11.2007 21:46
> > > > > > To: Martijn Moeling
> > > > > > CC: mod_python at modpython.org
> > > > > > Subject: RE: [mod_python] mod_python or apache scalability?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > > I had similar problems however, they turned out to be MySQLdb
> > > > > > > related and not in a way you suspect
> > > > > >
> > > > > > Thanks! I will try separating MySQLdb connections. We do pass the connection
> > > > > > around currently.
> > > > >
> > > > > However, how would you explain this PIL error:
> > > > >
> > > > >     File "/path/publisher/publisher.py", line 78, in handler
> > > > >  flow_instance.dispatch()
> > > > >  File "/path/scripts/updateprofile.py", line 92, in
> > > > > dispatch
> > > > >  im.save(full, self.im_format)
> > > > >     File
> > > > > "/usr/local/encap/Python-2.4.1/lib/python2.4/site-
> > > > packages/PIL/Image.py",
> > > > > line 1272, in save
> > > > >
> > > > >  File
> > > > > "/usr/local/encap/Python-2.4.1/lib/python2.4/site-
> > > > packages/PIL/ImageFile.py"
> > > > > , line 192, in load
> > > > >  IOError: image file is truncated (14432 bytes not processed)
> > > > >
> > > > > This only happened for large images, small ones uploaded fine.
> > > > > The code is
> > > > >
> > > > > def dispatch(self):
> > > > >     pix = self.form['Filedata']
> > > > >     try:
> > > > >         im = Image.open(pix.file)
> > > > >     except IOError:
> > > > >         self.req.status = apache.HTTP_NOT_ACCEPTABLE
> > > > >     im.save(full, self.im_format)
> > > > >
> > > > > The error is in the last line. It did not occur with prefork.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > > From: mod_python-bounces at modpython.org [mailto:mod_python-bounces at modpython.org] On Behalf Of Martijn Moeling
> > > > > > Sent: Sunday, November 04, 2007 5:16 AM
> > > > > > To: Graham Dumpleton
> > > > > > Cc: mod_python at modpython.org; Alec Matusis
> > > > >
> > > > > > Subject: Re: [mod_python] mod_python or apache scalability?
> > > > > >
> > > > > > hi,
> > > > > >
> > > > > > > I use PIL without any problems, as I do with MySQLdb.
> > > > > > >
> > > > > > > I had similar problems however, they turned out to be MySQLdb
> > > > > > > related and not in a way you suspect
> > > > > > >
> > > > > > > Never ever use a global database connection variable. Create the
> > > > > > > database connection within your handler,
> > > > > > > and register a cleanup for closing it. I fixed a lot of trouble by
> > > > > > > altering my system by making all "global" stuff a method or property
> > > > > > > of the req object.
> > > > > > >
> > > > > > > The cleanup procedure cleans all objects which do anything to 3rd
> > > > > > > party external (net connected) stuff (like imap, MySQL)
> > > > > >
> > > > > >
> > > > > > think of something like this:
> > > > > >
> > > > > >
> > > > > > > def handler_cleanup(req):
> > > > > > >       req.cursor.close()
> > > > > > >       req.db.close()
> > > > > > >
> > > > > > > def handler(req):
> > > > > > >       req.db=MySQLdb.connection(...........)
> > > > > > >       req.cursor=req.db.cursor(...)
> > > > > >
> > > > > >       ....
> > > > > >       ....
> > > > > >       ....
> > > > > >       req.register_cleanup(handler_cleanup,req)
> > > > > >       return apache.OK
> > > > > >
> > > > > > > All my application specific code are classes, and instantiated as a
> > > > > > > req.something and req is passed to all functions, which is neat since
> > > > > > > it is passed by reference meaning stacks stay small too
> > > > > > >
> > > > > > > If you use MySQLdb with mod_python use the above method, my
> > > > > > > production server is rock stable and handled over 1mln requests since
> > > > > > > last december without any reboots or problems.
> > > > > >
> > > > > > Martijn
> > > > > >
> > > > > > On Nov 4, 2007, at 11:48 AM, Graham Dumpleton wrote:
> > > > > >
> > > > > > > > Hmmm, I do remember vaguely hearing questions about PIL thread safety
> > > > > > > > before, so it might be an issue. :-(
> > > > > > >
> > > > > > > Graham
> > > > > > >
> > > > > > > On 04/11/2007, Alec Matusis <matusis at matusis.com> wrote:
> > > > > > >>> BTW, are you still using mod_python 3.1.4?
> > > > > > >>
> > > > > > >> No, we went to apache 2.2.6/ mod_python 3.3.1 combination.
> > > > > > >>
> > > > > > > >> I will get MySQLdb to work thread-safely, but with PIL I am not so
> > > > > > > >> optimistic...
> > > > > > >>
> > > > > > >>> -----Original Message-----
> > > > > > >>> From: Graham Dumpleton
> > > > > [mailto:graham.dumpleton at gmail.com]
> > > > > > >>> Sent: Sunday, November 04, 2007 2:33 AM
> > > > > > >>> To: Alec Matusis
> > > > > > >>> Cc: mod_python at modpython.org
> > > > > > >>> Subject: Re: [mod_python] mod_python or apache
> scalability?
> > > > > > >>>
> > > > > > > >>> BTW, are you still using mod_python 3.1.4? Older mod_python versions
> > > > > > > >>> have various bugs and you would be much better upgrading to 3.3.1 if
> > > > > > > >>> you haven't already.
> > > > > > >>>
> > > > > > >>> Graham
> > > > > > >>>
> > > > > > >>> On 04/11/2007, Alec Matusis <matusis at matusis.com> wrote:
> > > > > > > >>>>> Which indicates that some third party package you are using is not
> > > > > > > >>>>> thread safe. This can occur if you are also using PHP as various of
> > > > > > > >>>>> its third party packages are not thread safe.
> > > > > > > >>>>
> > > > > > > >>>> We are not using PHP, only python with mod_python.
> > > > > > > >>>> The only third party packages we use are MySQLdb adapter and PIL image
> > > > > > > >>>> library.
> > > > > > >>>>
> > > > > > >>>> So I guess both PIL and MySQLdb have these problems.
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>>> -----Original Message-----
> > > > > > >>>>> From: Graham Dumpleton
> > > > > [mailto:graham.dumpleton at gmail.com]
> > > > > > >>>>> Sent: Sunday, November 04, 2007 2:15 AM
> > > > > > >>>>> To: Alec Matusis
> > > > > > >>>>> Cc: mod_python at modpython.org
> > > > > > >>>>> Subject: Re: [mod_python] mod_python or apache
> > scalability?
> > > > > > >>>>>
> > > > > > >>>>> On 04/11/2007, Alec Matusis <matusis at matusis.com>
> wrote:
> > > > > > > >>>>>>> FWIW, I personally would try and move from prefork to worker MPM as
> > > > > > > >>>>>>> the number of Apache child processes you are running with is to my
> > > > > > > >>>>>>> mind excessive. Using worker would certainly drop memory usage for a
> > > > > > > >>>>>>> start as you wouldn't need as many child processes to be started.
> > > > > > >>>>>>
> > > > > > > >>>>>> I am just following up on this, since we tried worker MPM this weekend.
> > > > > > > >>>>>> On our dev/stage it worked perfectly.
> > > > > > > >>>>>> On live, worker MPM freed up about 2GB of memory compared to prefork.
> > > > > > > >>>>>> However, on live, it turned out to be unstable.
> > > > > > > >>>>>> This is what we saw in the main error log:
> > > > > > >>>>>>
> > > > > > > >>>>>> [Sun Nov 04 01:37:45 2007] [notice] child pid 20347 exit signal Segmentation fault (11)
> > > > > > > >>>>>> [Sun Nov 04 01:37:45 2007] [notice] child pid 20460 exit signal Aborted (6)
> > > > > > > >>>>>> [Sun Nov 04 01:37:45 2007] [notice] child pid 20515 exit signal Segmentation fault (11)
> > > > > > > >>>>>> *** glibc detected *** double free or corruption (!prev): 0x0000000000762c50 ***
> > > > > > > >>>>>> *** glibc detected *** double free or corruption (!prev): 0x000000000075d100 ***
> > > > > > > >>>>>> [Sun Nov 04 01:37:46 2007] [notice] child pid 20152 exit signal Segmentation fault (11)
> > > > > > >>>>>>
> > > > > > >>>>>> In the application log, we saw two types of errors:
> > > > > > >>>>>>
> > > > > > >>>>>> MySQLError: Connection to database failed
> > > > > > >>>>>> OperationalError: (2006, 'MySQL server has gone away')
> > > > > > >>>>>>
> > > > > > >>>>>> and
> > > > > > >>>>>>
> > > > > > > >>>>>> IOError: image file is truncated (44 bytes not processed)
> > > > > > > >>>>>>
> > > > > > > >>>>>> The first type has to do with MySQLdb module, but the second one occurred
> > > > > > > >>>>>> when large images were uploaded.
> > > > > > >>>>>
> > > > > > > >>>>> Which indicates that some third party package you are using is not
> > > > > > > >>>>> thread safe. This can occur if you are also using PHP as various of
> > > > > > > >>>>> its third party packages are not thread safe. Also ensure that you are
> > > > > > > >>>>> using the latest available Python database adapters and that they are
> > > > > > > >>>>> compiled against thread safe reentrant libraries.
> > > > > > >>>>>
> > > > > > >>>>> Graham
> > > > > > >>>>>
> > > > > > >>>>>> We had to revert to prefork as a result of this.
> > > > > > >>>>>>
> > > > > > > >>>>>> On another note, I managed to empirically find the maximum ServerLimit for
> > > > > > > >>>>>> prefork, before the machine dies from swapping.
> > > > > > > >>>>>> It is 380 with 4GB RAM.
> > > > > > >>>>>>
> > > > > > >>>>>>> -----Original Message-----
> > > > > > >>>>>>> From: Graham Dumpleton
> > > > > [mailto:graham.dumpleton at gmail.com]
> > > > > > >>>>>>> Sent: Monday, October 01, 2007 6:47 PM
> > > > > > >>>>>>> To: Alec Matusis
> > > > > > >>>>>>> Cc: mod_python at modpython.org
> > > > > > >>>>>>> Subject: Re: [mod_python] mod_python or apache
> > scalability?
> > > > > > >>>>>>>
> > > > > > >>>>>>> On 01/10/2007, Alec Matusis <matusis at matusis.com>
> > wrote:
> > > > > > >>>>>>>> in the apache error log. We also got
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> kernel: possible SYN flooding on port 80. Sending
> > cookies.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> in /var/log/messages system log.
> > > > > > >>>>>>>
> > > > > > > >>>>>>> Have you determined for certain that you aren't the target of an
> > > > > > > >>>>>>> external SYN Flood DOS attack?
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Do a Google search for 'kernel: possible SYN flooding on port 80.
> > > > > > > >>>>>>> Sending cookies' and you will find lots of stuff to read. Your running
> > > > > > > >>>>>>> out of or having a large number of socket connections may be
> > > > > > > >>>>>>> symptomatic of a large number of half open connections being created
> > > > > > > >>>>>>> and then being left in TIME_WAIT. Thus perhaps do some better analysis
> > > > > > > >>>>>>> of socket connection states using netstat. If not a SYN Flood, then
> > > > > > > >>>>>>> possibly follow some of the other suggestions in the pages you will
> > > > > > > >>>>>>> find when you do the search.
> > > > > > >>>>>>>
> > > > > > > >>>>>>> FWIW, I personally would try and move from prefork to worker MPM as
> > > > > > > >>>>>>> the number of Apache child processes you are running with is to my
> > > > > > > >>>>>>> mind excessive. Using worker would certainly drop memory usage for a
> > > > > > > >>>>>>> start as you wouldn't need as many child processes to be started. I
> > > > > > > >>>>>>> wouldn't be concerned about running out of threads as when running
> > > > > > > >>>>>>> worker I wouldn't suggest more than 25 threads per process as a
> > > > > > > >>>>>>> starting point anyway. If your mod_python application was creating
> > > > > > > >>>>>>> lots of threads, you are likely to hit the thread limit with prefork
> > > > > > > >>>>>>> and not just worker so which MPM is used shouldn't be an issue in that
> > > > > > > >>>>>>> case.
> > > > > > >>>>>>>
> > > > > > >>>>>>> BTW, what operating system are you using?
> > > > > > >>>>>>>
> > > > > > >>>>>>> Graham
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>
> > > > > > >>
> > > > > > > _______________________________________________
> > > > > > > Mod_python mailing list
> > > > > > > Mod_python at modpython.org
> > > > > > >
> > > > > http://mailman.modpython.org/mailman/listinfo/mod_python
> > > > > >
> > > > > > _______________________________________________
> > > > > > Mod_python mailing list
> > > > > > Mod_python at modpython.org
> > > > > > http://mailman.modpython.org/mailman/listinfo/mod_python
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Mod_python mailing list
> > > > > Mod_python at modpython.org
> > > > > http://mailman.modpython.org/mailman/listinfo/mod_python
> > > > >
> > > > >
> > > > _______________________________________________
> > > > Mod_python mailing list
> > > > Mod_python at modpython.org
> > > > http://mailman.modpython.org/mailman/listinfo/mod_python
> > >
> > >
> 
> _______________________________________________
> Mod_python mailing list
> Mod_python at modpython.org
> http://mailman.modpython.org/mailman/listinfo/mod_python


