[mod_python] Apache/mod_python process sizes. Help wanted.

Graham Dumpleton grahamd at dscpl.com.au
Sat Mar 3 22:05:30 EST 2007


On 04/03/2007, at 11:36 AM, Graham Dumpleton wrote:

>
> On 04/03/2007, at 11:31 AM, Jim Gallacher wrote:
>
>>> Part of the jump is the imports in mod_python.apache:
>>> import sys
>>> import traceback
>>> import time
>>> import os
>>> import pdb
>>> import stat
>>> import imp
>>> import types
>>> import cgi
>>> import _apache
>>> import threading
>>> I had been thinking they would have been done when mod_python loaded
>>> for the first interpreter, but they aren't done until the time of the
>>> first request.
>>> If I import the same modules into the mod_wsgi example, then get
>>> (pid 429):
>>>   432 httpd        0.0%  0:00.00   1     8    39   212K  2.87M   904K  30.4M
>>>   430 httpd        0.0%  0:00.00   1     8    39   212K  2.87M   776K  30.4M
>>>   429 httpd        0.0%  0:00.18   1    10    67  1.95M  2.88M  3.33M  31.5M
>>>   428 httpd        0.0%  0:00.00   1     8    39   216K  2.87M   780K  30.4M
>>>   427 httpd        0.0%  0:00.00   1     8    39   216K  2.87M   780K  30.4M
>>>   426 httpd        0.0%  0:00.00   1     8    39   212K  2.87M   780K  30.4M
>>>   425 httpd        0.0%  0:00.07   1    11    39    32K  2.87M  1.34M  30.4M
>>> Thus these modules alone cause it to jump up to 1.95M. This works out
>>> to about an extra 1.4MB. Because mod_wsgi is all C code, it isn't
>>> relying on any Python modules, so its base overhead is less.
>>> Interesting. How many of these modules could be avoided in mod_python
>>> if we tried hard?
>>
>> I have no idea, but it's worth a look.
>
> It is quite easy to get rid of 'cgi', as all that is used from it is
> cgi.escape(). We can just duplicate that one function:
>
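> def escape(s, quote=None):
>   # '&' must be replaced first so the entities added below are not re-escaped.
>   s = s.replace("&", "&amp;")
>   s = s.replace("<", "&lt;")
>   s = s.replace(">", "&gt;")
>   if quote:
>     # Double quotes are only escaped on request, as with cgi.escape().
>     s = s.replace('"', "&quot;")
>   return s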
>
> We also do not need to load 'pdb' at global scope. We can instead do it
> only at the point where it is required, i.e., if the PythonEnablePdb
> directive is enabled.
>
> Those two alone drop my 1.95M figure down to 0.8MB.
>
>   510 httpd        0.0%  0:00.00   1     8    39   212K  2.87M   904K  30.4M
>   509 httpd        0.0%  0:00.00   1     8    39   212K  2.87M   792K  30.4M
>   508 httpd        0.0%  0:00.00   1     8    39   216K  2.87M   796K  30.4M
>   507 httpd        0.0%  0:00.00   1     8    39   216K  2.87M   796K  30.4M
>   506 httpd        0.0%  0:00.00   1     8    39   216K  2.87M   796K  30.4M
>   505 httpd        0.0%  0:00.04   1    10    42   812K  2.80M  2.01M+ 30.5M
>   504 httpd        0.0%  0:00.06   1    11    39    32K  2.87M  1.39M  30.4M
>
> The rest probably can't be eliminated, and the gains would be a lot
> less. The 'cgi' module is always a big memory hog and slow to load, so
> if we can get rid of that, all the better. It would need to go from the
> 'apache', 'importer' and 'psp' mod_python modules.
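
A minimal sketch of the deferred 'pdb' import idea quoted above. The names
get_debugger, call_handler and the enable_pdb flag are hypothetical and only
stand in for however the PythonEnablePdb setting is actually checked; this is
not the mod_python code itself.

```python
def get_debugger(enable_pdb):
    """Return the pdb module only when debugging is enabled, else None."""
    if not enable_pdb:
        return None
    # Deferred import: the cost of loading pdb is paid only on this path.
    import pdb
    return pdb


def call_handler(handler, req, enable_pdb=False):
    """Call handler(req), running it under pdb only when requested."""
    debugger = get_debugger(enable_pdb)
    if debugger is None:
        return handler(req)
    # pdb.runcall() runs the function under debugger control (interactive).
    return debugger.runcall(handler, req)
```

With enable_pdb left off, 'pdb' is never imported by this code, so a process
that never debugs does not carry it.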

On mod_python itself, after eliminating 'cgi' module use from 'apache',
'psp' and 'importer', plus 'urllib2', 'rfc822', 'calendar' and 'weakref'
from 'cache', and deferring import of 'pdb', have gone from my original
mod_python figure of:

   410 httpd        0.0%  0:00.00   1     8    39   212K  2.87M   888K  30.4M
   408 httpd        0.0%  0:00.00   1     8    39   212K  2.87M   764K  30.4M
   407 httpd        0.0%  0:00.00   1     8    39   216K  2.87M   768K  30.4M
   406 httpd        0.0%  0:00.00   1     8    39   216K  2.87M   768K  30.4M
   405 httpd        0.0%  0:00.37   1    10    86  2.40M  2.75M  3.94M  32.0M
   404 httpd        0.0%  0:00.00   1     8    39   212K  2.87M   768K  30.4M
   403 httpd        0.0%  0:00.06   1    11    39    32K  2.87M  1.30M  30.4M

to:

  3355 httpd        0.0%  0:00.00   1     8    39   212K  3.09M   900K  30.4M
  3353 httpd        0.0%  0:00.00   1     8    39   212K  3.09M   768K  30.4M
  3352 httpd        0.0%  0:00.00   1     8    39   212K  3.09M   768K  30.4M
  3351 httpd        0.0%  0:00.00   1     8    39   212K  3.09M   768K  30.4M
  3350 httpd        0.0%  0:00.14   1    10    67  1.36M  3.09M  2.70M  31.0M
  3349 httpd        0.0%  0:00.00   1     8    39   212K  3.09M   772K  30.4M
  3348 httpd        0.0%  0:00.08   1    11    39    36K  3.09M  1.32M  30.4M

Thus have eliminated 1.0MB from base Apache child process size.
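
The 1.0MB figure can be checked from the eighth column of the two listings
(the same column the 1.95M and 0.8MB figures were read from). A rough sketch,
assuming 'M' means 1024K; the helper name to_kb is made up for illustration.

```python
def to_kb(size):
    """Convert a size string such as '812K' or '2.40M' to kilobytes."""
    size = size.rstrip("+")           # drop a trailing '+' as in '2.01M+'
    units = {"K": 1, "M": 1024}       # assumption: 1M == 1024K
    return float(size[:-1]) * units[size[-1]]


before = to_kb("2.40M")   # pid 405, original mod_python
after = to_kb("1.36M")    # pid 3350, after trimming the imports
saved_mb = (before - after) / 1024
print("saved about %.2fMB" % saved_mb)   # saved about 1.04MB
```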

Obviously if an application is using any of these modules anyway, they'll
just get loaded back again later.
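
That an import pulls a module (and whatever it depends on) back into the
process can be seen through sys.modules. A small sketch, written for a
current Python rather than the one used above; modules_loaded_by is a
hypothetical helper name, and 'calendar' is just one of the modules listed
earlier.

```python
import importlib
import sys


def modules_loaded_by(name):
    """Import `name` and return the set of module names that import added."""
    before = set(sys.modules)
    importlib.import_module(name)
    return set(sys.modules) - before


first = modules_loaded_by("calendar")
second = modules_loaded_by("calendar")
print(sorted(first))   # whatever 'calendar' dragged in, if not already loaded
print(second)          # nothing new: the module is already cached
```

The first call shows the extra modules an application would bring back in;
the second shows that once loaded they stay loaded for the life of the
process.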

To drop some of the stuff from 'cache' I had to comment out things like
HTTPCache. Those bits of that file aren't used by mod_python though, and
were only in there to preserve the full content of the original file when
it was borrowed from elsewhere. Once the full cutover to the new module
importer was done, that file was going to disappear anyway.

All well and good, but this is distracting from my original goal of the
static versus shared library issue. :-)

Graham

