[mod_python] mod_python or apache scalability?

Graham Dumpleton graham.dumpleton at gmail.com
Tue Nov 6 06:52:32 EST 2007


On 06/11/2007, Alec Matusis <matusis at yahoo.com> wrote:
> I found that this
>
> IOError: image file is truncated (14432 bytes not processed)
>
> has to do with using input filter
>
> PythonInputFilter flashfilter FLASHFILTER
> SetInputFilter FLASHFILTER
>
> It somehow behaves differently in apache 2.2.6/mod_python 3.3.1 versus
> 2.0.54/ 3.1.4. For now, I removed this input filter (we can get away without
> it), and it seems to work.
>
> However, we have a bigger problem: we now have intermittent import errors
> (one error per about 50-80 requests).
>
> This code executed by publisher:
>
> import modules
> def _import( module_path ):
>     runme = modules[ module_path ]
>     #get the last part
>     class_name = module_path.split('.')[-1]
>     try:
>         runme = getattr(runme, class_name ) <--- INTERMITTENT ERROR here
>     except AttributeError:
>         raise Exception, 'Module does not contain class '+class_name+', runme='+repr(runme)+' module_path='+repr(module_path)
>
> produces this error:
>
> Exception: Module does not contain class getprofile, runme=<module
> 'getprofile' from '/path/scripts/getprofile.py'> module_path='getprofile'
>
> This used to always work with prefork. Is there anything not thread safe
> about importing modules?

The underlying Python and mod_python import mechanisms themselves are
rarely the problem. It is usually thread-unsafe practices regarding
initialisation in the code around the imports.

So, if you have built your own module importer, or a higher-level
layer on top of one, and you haven't properly protected it against
concurrent access by multiple threads, that is most likely the cause.

So, what is 'modules' and '_import' and how are they used? How do you
ensure that two concurrently executing request handlers trying to
import the same module don't interfere with each other?
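
To illustrate the kind of protection meant here, a minimal sketch
follows. It assumes that '_import' looks up a module and then fetches
the class named after the last path component, as in the quoted code.
The names load_class, load_concurrently, _cache and _cache_lock are
hypothetical and not taken from the application in question. It also
uses importlib and the 'with' statement, which need a newer Python than
2.4; on 2.4, __import__ and acquire()/release() in try/finally would
play the same role.

```python
import importlib
import threading

# Hypothetical names for illustration; none of these come from the thread.
_cache = {}
_cache_lock = threading.RLock()


def load_class(module_path):
    """Return the attribute named after the last component of module_path.

    The whole lookup-import-getattr sequence runs under one lock, so no
    thread can observe a module that another thread is still initialising.
    """
    class_name = module_path.split('.')[-1]
    with _cache_lock:
        if module_path in _cache:
            return _cache[module_path]
        module = importlib.import_module(module_path)
        try:
            target = getattr(module, class_name)
        except AttributeError:
            raise Exception('Module does not contain class %s, module=%r'
                            % (class_name, module))
        # Only publish the result once it is fully resolved.
        _cache[module_path] = target
        return target


def load_concurrently(module_path, thread_count=8):
    """Call load_class from several threads and collect what each one got."""
    results = []

    def worker():
        results.append(load_class(module_path))

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

For example, load_concurrently("glob") should hand every thread the
same glob.glob function. A reentrant lock is used so that a module
which itself calls load_class while it is being imported does not
deadlock its own thread.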

Graham

> > -----Original Message-----
> > From: mod_python-bounces at modpython.org [mailto:mod_python-
> > bounces at modpython.org] On Behalf Of Graham Dumpleton
> > Sent: Monday, November 05, 2007 1:29 PM
> > To: Martijn Moeling
> > Cc: mod_python at modpython.org; Alec Matusis
> > Subject: Re: [mod_python] mod_python or apache scalability?
> >
> > The LimitRequestBody directive support in mod_python is broken and
> > most likely would have resulted in a 500 error occurring during
> > mod_python.publisher processing of the form even before the user code
> > was run.
> >
> > My guess at the reason is that they have enabled in the new Apache
> > server compression on request content. This may not work with
> > mod_python because how it reads request content is broken, with it
> > only reading up until original Content-Length in certain cases. If
> > there is a mutating input filter such as content decompression, you
> > will get truncated input data.
> >
> > https://issues.apache.org/jira/browse/MODPYTHON-240
> > https://issues.apache.org/jira/browse/MODPYTHON-212
> >
> > So, OP should ensure that directives to accept compressed request
> > content are disabled.
> >
> > Graham
> >
> > On 05/11/2007, Martijn Moeling <martijn at xs4us.nu> wrote:
> > >
> > >
> > >
> > > Hmm it looks to me that you have an upload limit in your apache
> > config.
> > >
> > > It might be the LimitRequestBody directive.
> > >
> > > Another thing might be that your processing is faster than the data
> > comes
> > > in, but I doubt that
> > > Take a close look at your apache conf, and your apache error log
> > (right after
> > > an upload or with tail -f /var/log/httpd/error_log (or where ever
> > your
> > > logdir is)).
> > >
> > > Martijn
> > >
> > >
> > >  ________________________________
> > >  From: Alec Matusis [mailto:matusis at matusis.com]
> > > Sent: Sun 04.11.2007 21:46
> > > To: Martijn Moeling
> > > CC: mod_python at modpython.org
> > > Subject: RE: [mod_python] mod_python or apache scalability?
> > >
> > >
> > >
> > >
> > > > I had similar problems; however, they turned out to be MySQLdb
> > > > related, and not in the way you suspect
> > >
> > > Thanks! I will try separating MySQLdb connections. We do pass the
> > connection
> > > around currently.
> > >
> > > However, how would you explain this PIL error:
> > >
> > >     File "/path/publisher/publisher.py", line 78, in handler
> > >  flow_instance.dispatch()
> > >  File "/path/scripts/updateprofile.py", line 92, in
> > > dispatch
> > >  im.save(full, self.im_format)
> > >     File
> > > "/usr/local/encap/Python-2.4.1/lib/python2.4/site-
> > packages/PIL/Image.py",
> > > line 1272, in save
> > >
> > >  File
> > > "/usr/local/encap/Python-2.4.1/lib/python2.4/site-
> > packages/PIL/ImageFile.py"
> > > , line 192, in load
> > >  IOError: image file is truncated (14432 bytes not processed)
> > >
> > > This only happened for large images, small ones uploaded fine.
> > > The code is
> > >
> > > def dispatch(self):
> > >     pix = self.form['Filedata']
> > >     try:
> > >         im = Image.open(pix.file)
> > >     except IOError:
> > >         self.req.status = apache.HTTP_NOT_ACCEPTABLE
> > >     im.save(full, self.im_format)
> > >
> > > The error is in the last line. It did not occur with prefork.
> > >
> > >
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: mod_python-bounces at modpython.org
> > > [mailto:mod_python-
> > > > bounces at modpython.org] On Behalf Of Martijn Moeling
> > > > Sent: Sunday, November 04, 2007 5:16 AM
> > > > To: Graham Dumpleton
> > > > Cc: mod_python at modpython.org; Alec Matusis
> > >
> > > > Subject: Re: [mod_python] mod_python or apache scalability?
> > > >
> > > > hi,
> > > >
> > > > I use PIL without any problems, as I do with MySQLdb.
> > > >
> > > > I had similar problems; however, they turned out to be MySQLdb
> > > > related, and not in the way you suspect
> > > >
> > > > Never ever use a global database connection variable. Create the
> > > > database connection within your handler,
> > > > and register a cleanup for closing it. I fixed a lot of trouble by
> > > > altering my system by making all "global" stuff a method or
> > property
> > > > of the req object.
> > > >
> > > > The cleanup procedure cleans all objects which do anything to 3rd
> > > > party external (net connected) stuff (like imap, MySQL)
> > > >
> > > >
> > > > think of something like this:
> > > >
> > > >
> > > > def handler_cleanup(req):
> > > >       req.cursor.close()
> > > >       req.db.close()
> > > >
> > > > def handler(req):
> > > >       req.db=MySQLdb.connection(...........)
> > > >       req.cursor=req.db.cursor(...)
> > > >
> > > >       ....
> > > >       ....
> > > >       ....
> > > >       req.register_cleanup(handler_cleanup,req)
> > > >       return apache.OK
> > > >
> > > > All my application-specific code is written as classes, instantiated
> > > > as req.something, and req is passed to all functions, which is neat
> > > > since it is passed by reference, meaning stacks stay small too.
> > > >
> > > > If you use MySQLdb with mod_python use the above method, my
> > > > production server is rock stable and handled over 1mln requests
> > since
> > > > last december without any reboots or problems.
> > > >
> > > > Martijn
> > > >
> > > > On Nov 4, 2007, at 11:48 AM, Graham Dumpleton wrote:
> > > >
> > > > > Hmmm, I do remember vaguely hearing questions about PIL thread
> > safety
> > > > > before, so it might be an issue. :-(
> > > > >
> > > > > Graham
> > > > >
> > > > > On 04/11/2007, Alec Matusis <matusis at matusis.com> wrote:
> > > > >>> BTW, are you still using mod_python 3.1.4?
> > > > >>
> > > > >> No, we went to apache 2.2.6/ mod_python 3.3.1 combination.
> > > > >>
> > > > >> I will get MySQLdb to work thread-safely, but with PIL I am not
> > so
> > > > >> optimistic...
> > > > >>
> > > > >>> -----Original Message-----
> > > > >>> From: Graham Dumpleton
> > > [mailto:graham.dumpleton at gmail.com]
> > > > >>> Sent: Sunday, November 04, 2007 2:33 AM
> > > > >>> To: Alec Matusis
> > > > >>> Cc: mod_python at modpython.org
> > > > >>> Subject: Re: [mod_python] mod_python or apache scalability?
> > > > >>>
> > > > >>> BTW, are you still using mod_python 3.1.4? Older mod_python
> > > > versions
> > > > >>> have various bugs and you would be much better upgrading to
> > 3.3.1
> > > > if
> > > > >>> you haven't already.
> > > > >>>
> > > > >>> Graham
> > > > >>>
> > > > >>> On 04/11/2007, Alec Matusis <matusis at matusis.com> wrote:
> > > > >>>>> Which indicates that some third party package you are using
> > is
> > > > not
> > > > >>>>> thread safe. This can occur if you are also using PHP as
> > > > >>>>> various of
> > > > >>>>> its third party packages are not thread safe.
> > > > >>>>
> > > > >>>> We are not using PHP, only python with mod_python.
> > > > >>>> The only third party packages we use are MySQLdb adapter and
> > PIL
> > > > >>> image
> > > > >>>> library.
> > > > >>>>
> > > > >>>> So I guess both PIL and MySQLdb have these problems.
> > > > >>>>
> > > > >>>>
> > > > >>>>> -----Original Message-----
> > > > >>>>> From: Graham Dumpleton
> > > [mailto:graham.dumpleton at gmail.com]
> > > > >>>>> Sent: Sunday, November 04, 2007 2:15 AM
> > > > >>>>> To: Alec Matusis
> > > > >>>>> Cc: mod_python at modpython.org
> > > > >>>>> Subject: Re: [mod_python] mod_python or apache scalability?
> > > > >>>>>
> > > > >>>>> On 04/11/2007, Alec Matusis <matusis at matusis.com> wrote:
> > > > >>>>>>> FWIW, I personally would try and move from prefork to
> > worker
> > > > >>> MPM as
> > > > >>>>>>> the number of Apache child processes you are running with
> > is to
> > > > >>> my
> > > > >>>>>>> mind excessive. Using worker would certainly drop memory
> > usage
> > > > >>> for
> > > > >>>>> a
> > > > >>>>>>> start as you wouldn't need as many child processes to be
> > > > >>> started.
> > > > >>>>>>
> > > > >>>>>> I am just following up on this, since we tried worker MPM
> > this
> > > > >>>>> weekend.
> > > > >>>>>> On our dev/stage it worked perfectly.
> > > > >>>>>> On live, worker MPM freed up about 2GB of memory compared to
> > > > >>> prefork.
> > > > >>>>>> However, on live, it turned out to be unstable.
> > > > >>>>>> This is what we saw in the main error log:
> > > > >>>>>>
> > > > >>>>>> [Sun Nov 04 01:37:45 2007] [notice] child pid 20347 exit
> > signal
> > > > >>>>> Segmentation
> > > > >>>>>> fault (11)
> > > > >>>>>> [Sun Nov 04 01:37:45 2007] [notice] child pid 20460 exit
> > signal
> > > > >>>>> Aborted (6)
> > > > >>>>>> [Sun Nov 04 01:37:45 2007] [notice] child pid 20515 exit
> > signal
> > > > >>>>> Segmentation
> > > > >>>>>> fault (11)
> > > > >>>>>> *** glibc detected *** double free or corruption (!prev):
> > > > >>>>> 0x0000000000762c50
> > > > >>>>>> ***
> > > > >>>>>> *** glibc detected *** double free or corruption (!prev):
> > > > >>>>> 0x000000000075d100
> > > > >>>>>> ***
> > > > >>>>>> [Sun Nov 04 01:37:46 2007] [notice] child pid 20152 exit
> > signal
> > > > >>>>> Segmentation
> > > > >>>>>> fault (11)
> > > > >>>>>>
> > > > >>>>>> In the application log, we saw two types of errors:
> > > > >>>>>>
> > > > >>>>>> MySQLError: Connection to database failed
> > > > >>>>>> OperationalError: (2006, 'MySQL server has gone away')
> > > > >>>>>>
> > > > >>>>>> and
> > > > >>>>>>
> > > > >>>>>> IOError: image file is truncated (44 bytes not processed)
> > > > >>>>>>
> > > > >>>>>> The first type has to do with MySQLdb module, but the second
> > one
> > > > >>>>> occurred
> > > > >>>>>> when large images were uploaded.
> > > > >>>>>
> > > > >>>>> Which indicates that some third party package you are using
> > is
> > > > not
> > > > >>>>> thread safe. This can occur if you are also using PHP as
> > > > >>>>> various of
> > > > >>>>> its third party packages are not thread safe. Also ensure
> > that
> > > > you
> > > > >>> are
> > > > >>>>> using the latest available Python database adapters and that
> > they
> > > > >>> are
> > > > >>>>> compiled against thread safe reentrant libraries.
> > > > >>>>>
> > > > >>>>> Graham
> > > > >>>>>
> > > > >>>>>> We had to revert to prefork as a result of this.
> > > > >>>>>>
> > > > >>>>>> On another note, I managed to empirically find the maximum
> > > > >>>>> ServerLimit for
> > > > >>>>>> prefork, before the machine dies from swapping.
> > > > >>>>>> It is 380 with 4GB RAM.
> > > > >>>>>>
> > > > >>>>>>> -----Original Message-----
> > > > >>>>>>> From: Graham Dumpleton
> > > [mailto:graham.dumpleton at gmail.com]
> > > > >>>>>>> Sent: Monday, October 01, 2007 6:47 PM
> > > > >>>>>>> To: Alec Matusis
> > > > >>>>>>> Cc: mod_python at modpython.org
> > > > >>>>>>> Subject: Re: [mod_python] mod_python or apache scalability?
> > > > >>>>>>>
> > > > >>>>>>> On 01/10/2007, Alec Matusis <matusis at matusis.com> wrote:
> > > > >>>>>>>> in the apache error log. We also got
> > > > >>>>>>>>
> > > > >>>>>>>> kernel: possible SYN flooding on port 80. Sending cookies.
> > > > >>>>>>>>
> > > > >>>>>>>> in /var/log/messages system log.
> > > > >>>>>>>
> > > > >>>>>>> Have you determined for certain that you aren't the target
> > of
> > > > >>> an
> > > > >>>>>>> external SYN Flood DOS attack?
> > > > >>>>>>>
> > > > >>>>>>> Do a Google search for 'kernel: possible SYN flooding on
> > port
> > > > >>> 80.
> > > > >>>>>>> Sending cookies' and you will find lots of stuff to read.
> > Your
> > > > >>>>> running
> > > > >>>>>>> out of or having a large number of socket connections may
> > be
> > > > >>>>>>> symptomatic of a large number of half open connections
> > being
> > > > >>>>> created
> > > > >>>>>>> and then being left in TIME_WAIT. Thus perhaps do some
> > better
> > > > >>>>> analysis
> > > > >>>>>>> of socket connection states using netstat. If not a SYN
> > Flood,
> > > > >>> then
> > > > >>>>>>> possibly follow some of the other suggestions in the pages
> > you
> > > > >>> will
> > > > >>>>>>> find when you do the search.
> > > > >>>>>>>
> > > > >>>>>>> FWIW, I personally would try and move from prefork to
> > worker
> > > > >>> MPM as
> > > > >>>>>>> the number of Apache child processes you are running with
> > is to
> > > > >>> my
> > > > >>>>>>> mind excessive. Using worker would certainly drop memory
> > usage
> > > > >>> for
> > > > >>>>> a
> > > > >>>>>>> start as you wouldn't need as many child processes to be
> > > > >>> started. I
> > > > >>>>>>> wouldn't be concerned about running out of threads as when
> > > > >>> running
> > > > >>>>>>> worker I wouldn't suggest more than 25 threads per process
> > as a
> > > > >>>>>>> starting point anyway. If your mod_python application was
> > > > >>> creating
> > > > >>>>>>> lots of threads, you are likely to hit the thread limit
> > with
> > > > >>>>> prefork
> > > > >>>>>>> and not just worker so which MPM is used shouldn't be an
> > issue
> > > > >>> in
> > > > >>>>> that
> > > > >>>>>>> case.
> > > > >>>>>>>
> > > > >>>>>>> BTW, what operating system are you using?
> > > > >>>>>>>
> > > > >>>>>>> Graham
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>
> > > > >>>>
> > > > >>
> > > > >>
> > > > > _______________________________________________
> > > > > Mod_python mailing list
> > > > > Mod_python at modpython.org
> > > > >
> > > http://mailman.modpython.org/mailman/listinfo/mod_python

