[mod_python] Re: Mod_python Digest, Vol 55, Issue 19

Graham Dumpleton graham.dumpleton at gmail.com
Mon Oct 15 22:09:06 EDT 2007


On 16/10/2007, Thimios Katsoulis <thkatsou at yahoo.gr> wrote:
> On 15/10/2007, Thimios Katsoulis <thkatsou at yahoo.gr> wrote:
> >
> > >Hmm, I am not sure if this is an answer to your own question or you are
> > >replying to a really old post, as I don't remember the question.
> >
> > >Either way, the code being presented as an example on the way to solve
> > >this is unsafe for a number of reasons.
> >
> > >The first reason is that it isn't thread protected and thus isn't safe
> > >in a multithread process such as when Apache UNIX worker MPM or
> > >Windows winnt MPM is used. This is because multiple handlers could be
> > >executing concurrently and both decide the global database handle
> > >needs to be created.
> >
> >
> > What do you mean by multiple handlers? A multithreaded Apache that has
> > multiple threads in each process? So in this case global variables are
> > shared between different threads in the same process, right?
>
> >Correct, and if code running in multiple threads tries to set or update
> >the global data, the threads can interfere with each other. Standard
> >multithread programming issues.
>
> By the way, is this the default behavior of Apache and mod_python? Meaning:
>   + multiple mod_python threads that each run a single application (mod_python handler)
> or maybe
>   + multiple Apache processes plus multiple mod_python threads, even more than one per application (virtual host)?

You only have threads when using an Apache MPM that uses multithreading. See:

  http://www.dscpl.com.au/wiki/ModPython/Articles/TheProcessInterpreterModel

or for an updated version, albeit focused around mod_wsgi, read:

  http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading

> Is there a way that we can determine how we are running?
> Perhaps through apache.mpm_query?
> I noticed this among the mpm_query constants:
> AP_MPMQ_IS_THREADED        = 2  # MPM can do threading

For an example of how the WSGI environment flags are calculated in a
mod_python WSGI adapter see:

  http://www.aminus.org/blogs/index.php/fumanchu/2005/11/06/wsgi_wrapper_for_mod_python

So yes, AP_MPMQ_IS_THREADED does come into play.
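
As a rough sketch of that calculation (not the adapter's actual code):
the function below takes the query function as an argument, so it can be
tried outside Apache. AP_MPMQ_IS_THREADED = 2 is the value quoted above;
AP_MPMQ_IS_FORKED = 3 and the "non-zero means supported" reading of the
result are assumptions taken from Apache's ap_mpm.h, and the names
wsgi_flags and fake_worker_mpm are made up for illustration.

```python
# Query codes; the first is quoted above, the second is an assumption.
AP_MPMQ_IS_THREADED = 2
AP_MPMQ_IS_FORKED = 3


def wsgi_flags(mpm_query):
    """Derive the WSGI concurrency flags from an mpm_query-style callable."""
    threaded = mpm_query(AP_MPMQ_IS_THREADED)
    forked = mpm_query(AP_MPMQ_IS_FORKED)
    return {
        "wsgi.multithread": threaded > 0,   # assumed: non-zero means "yes"
        "wsgi.multiprocess": forked > 0,
    }


def fake_worker_mpm(code):
    """Hypothetical stand-in that answers like a threaded, forking MPM."""
    return {AP_MPMQ_IS_THREADED: 1, AP_MPMQ_IS_FORKED: 1}.get(code, 0)


flags = wsgi_flags(fake_worker_mpm)
```

Inside a mod_python handler the call would presumably be
wsgi_flags(apache.mpm_query).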

> Does this mean we run in multiple threads per apache process mode?

Yes, a single handler may have multiple threads running concurrently
within it for different requests.
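
Since several requests can be inside the handler at once, creation of
any shared handle needs a lock around it. A minimal sketch, assuming a
hypothetical zero-argument factory called connect (for example a small
wrapper around your database driver's connect call); the names get_conn,
_conn_lock and fake_connect are illustrative only:

```python
import threading

_conn = None
_conn_lock = threading.Lock()


def get_conn(connect):
    """Return the shared handle, creating it at most once per process."""
    global _conn
    if _conn is None:               # fast path once the handle exists
        with _conn_lock:
            if _conn is None:       # re-check: another thread may have won the race
                _conn = connect()
    return _conn


# Demonstration with a fake factory that records how often it is called.
calls = []


def fake_connect():
    calls.append(1)
    return object()


threads = [threading.Thread(target=get_conn, args=(fake_connect,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whether the single handle may then be used by several threads at the
same time depends on the database driver, so check its documentation.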

> And asking again: does running in multiple threads mean more than one
> thread per mod_python virtual host, or just one thread for each virtual
> host (that is handled by a mod_python handler)?

The Apache thread pool is shared across all requests regardless of
what VirtualHost definitions exist or how many mod_python handlers may
be used.

> And something else: do multiple threads share only global variables, or also
> other kinds of variables, class members for example?

It is all exactly the same as if you were running a standard Python
program which used multiple threads.
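
To illustrate with plain Python, nothing mod_python specific: a class
attribute is shared by every thread in the process just like a module
global, whereas data stored on a threading.local object is private to
the thread that set it. The names Registry, per_thread and worker are
made up for this sketch.

```python
import threading


class Registry(object):
    shared = []                     # class attribute: one list for the whole process
    lock = threading.Lock()         # guards updates to the shared list


per_thread = threading.local()      # attributes set on this are private to each thread


def worker(n):
    per_thread.value = n            # only visible inside this thread
    with Registry.lock:
        Registry.shared.append(n)   # visible to every thread


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

seen = sorted(Registry.shared)
main_has_value = hasattr(per_thread, "value")   # the main thread never set it
```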

> > >This could result in wasted resources.
> >
> > Wasted resources? Pls explain.
>
> >If the Python destructor of a class object instance doesn't release
> >resources correctly, for example, doesn't close off database
> >connections, then they will become unreachable and just consume and
> >thus reduce the maximum number of connections you could use. Even if
> >the destructor does do the correct thing, if there is an object
> >reference count cycle, Python may not be able to garbage collect it
> >and so it may never be destroyed anyway and release the resources.
>
> "object reference count cycle"?.
> From what I've noticed, on module reloading the global connection object is recreated.
> So the number of open connections to the DB (postgres in my case) increases by 1, but after a while the old connection closes (Python garbage collection? postgresql?) and so there is no problem.
>
> Of course it will still be a problem if multiple processes are created and the number of open connections reaches the database connection limit.

That is not the same kind of problem: connections held by multiple
processes are there by design, whereas the resource leakage isn't.

> I just have to note that keeping the connection open speeds up a mod_python/psycopg2/postgresql web app very much.
> In my case I noticed speedups of up to 300%.
>
> Anyway, I think a tutorial about "A global's life in mod_python",
> covering topics like the global variable context, would be very helpful.

For a start, see the document on the process/interpreter model. Other
than that, it is not much different to standard Python programs.

> > >A second possible problem which can occur is where the global database
> > >handle and the code for creating it, is placed in a code module which
> > >is the subject of automatic reloading.
> >
> > I have now realized that on every reload of the file the global variable
> > is re-created, even if it was created in a previous invocation.
> > But why? Once it's created as a global, shouldn't it exist after the first
> > invocation of the module and thus not be recreated?
>
> >Because that is the consequence of reloading modules, you either
> >reload on top of the existing module and thus replace the existing
> >value,
>
>
> Sorry, I don't get you. An example please.
> Is a global variable's context the module where it is created?
> If you reload this module, does the global variable become unreachable?

Consider:

  a = Class()
  a = Class()

The first instance of 'a' is no longer reachable.

If there was a reference count cycle then it may not be deleted
immediately and will be dependent on a garbage collection cycle
kicking in and attempting to break the cycle before it can be deleted.
This may not occur immediately and thus the unreachable object
instance may be using up resources. In worst case, garbage collector
may not be able to break cycle and it will never be destroyed and so
resources will never be released.
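
A small sketch of this in standard CPython. The class Handle is a
hypothetical stand-in for an object that owns a resource; it refers to
itself to create the cycle, and automatic collection is switched off
only so that the result is repeatable. Interpreters before CPython 3.4
would not collect a cycle containing objects that define __del__, which
is the worst case described above.

```python
import gc
import weakref


class Handle(object):
    """Hypothetical stand-in for an object that owns a resource."""

    def __init__(self):
        self.owner = self           # the instance refers to itself: a reference cycle


def demo():
    gc.disable()                    # no automatic collection during the demonstration
    try:
        a = Handle()
        probe = weakref.ref(a)      # observe the first instance without keeping it alive
        a = Handle()                # rebinding: the first instance is now unreachable
        alive_before = probe() is not None
        gc.collect()                # the cycle collector finds and frees the first instance
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        gc.enable()


alive_before, alive_after = demo()
```

Here alive_before is true even though nothing can reach the first
instance any more; it only goes away once the collector has run.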

> >or you reload the module into a fresh module and discard the
> >old.
>
> Are you pointing to a technique here?

Yes, and this is the one used by mod_python. The reasons were explained
in the document on why mod_python module importing was broken in older
versions.
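
The difference between the two approaches can be sketched in plain
Python. This is only an illustration of the idea, not how mod_python's
importer is implemented; load_fresh, reload_in_place and SOURCE are
made-up names, and object() stands in for an expensive resource such as
a database connection.

```python
import types

SOURCE = "handle = object()  # stands in for an expensive resource\n"


def load_fresh(name, source):
    """Execute the source in a brand new module object."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


def reload_in_place(module, source):
    """Execute the source again on top of an existing module object."""
    exec(source, module.__dict__)
    return module


first = load_fresh("example", SOURCE)
old_handle = first.handle

# Reload on top of the existing module: same module, global replaced.
same = reload_in_place(first, SOURCE)
in_place_same_module = same is first
in_place_handle_replaced = first.handle is not old_handle

# Reload into a fresh module: new module object, new global.
second = load_fresh("example", SOURCE)
fresh_is_new_module = second is not first
fresh_handle_is_new = second.handle is not first.handle
```

Either way the handle created by the earlier load is not carried over,
which is why the global data has to be transferred explicitly or kept
in a module that is not reloaded.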

Graham

> >How are you going to do it so you retain the global data in a
> >generic way?
>
> >Graham
>
> > > So, once threading issues are
> > >dealt with then okay, but do not go putting it in the same code file
> > >as your mod_python handlers, or in general anywhere in the document
> > >tree. Instead put it in a module somewhere else on sys.path outside of
> > >the document tree. For an example of why this can be a problem see:
> >
> > >  http://groups.google.com/group/sqlalchemy/browse_thread/thread/5193bc7598f045fb#
> >
> > >Also ensure you read documentation for import_module() in:
> >
> > >  http://www.modpython.org/live/current/doc-html/pyapi-apmeth.html
> >
> > >It mentions a bit about resource leakage and transferring data from
> > >old module to new module. Ensure you are using mod_python 3.3.1
> > >though, as older versions were a bit less predictable.
> >
> > >  http://www.dscpl.com.au/wiki/ModPython/Articles/ModuleImportingIsBroken
> >
> > Graham
> >
> > On 14/10/2007, Thimios Katsoulis <thkatsou at yahoo.gr> wrote:
> > > Hello.
> > >
> > > Sorry for my poor English. Here is my mod_python code. This is a simple
> > > URL shortener:
> > >
> > > from mod_python import apache, util
> > > import psycopg2 as psycopg
> > >
> > > def handler(req):
> > >      req.content_type = "text/plain"
> > >      url_id = req.args
> > >      connection = psycopg.connect("dbname=my_db")
> > >      cursor = connection.cursor()
> > >      cursor.execute("""SELECT myurl FROM urls WHERE myid=%s""", (url_id,))
> > >      original_url = cursor.fetchone()[0]
> > >      connection.close()
> > >      util.redirect(req,original_url)
> > >      req.status = apache.DONE
> > >      return apache.DONE
> > >
> > > This script connects to the database on every request
> > > (psycopg.connect("dbname=my_db")), but I need it to stay connected
> > > continuously. Is it possible to do that in mod_python? I have
> > > mod_python 3.2.10.
> > >
> > > Thanks in advance.
> > > rdn
> > > ---------------
> > >
> > > Yes, mod_python can maintain global variables per interpreter, so you can open the connection once
> > > and all subsequent requests will use the opened connection.
> > > You have to declare your connection variable as global, e.g.:
> > >
> > >     def getConn(self):
> > >         # Return the shared connection, opening it on first use.
> > >         try:
> > >             if _conn is None:
> > >                 self.openConn()
> > >         except NameError:
> > >             # _conn has not been defined in this interpreter yet.
> > >             self.openConn()
> > >         return _conn
> > >
> > >     def openConn(self):
> > >         # Bind the module-level name so later requests can reuse it.
> > >         global _conn
> > >         _conn = connect(self.ConnStr)
> > >
> > > So when you want to access it you call self.getConn().
> > > The _conn global variable will be instantiated once for each mod_python Apache process.
> > > You can include some apache.log_error calls in the code above to watch
> > > for yourself when the _conn variable gets instantiated and when it is retrieved as a global.
> > > Please note that, from what I have observed, while you maintain open connections to the DB (postgresql too in my case)
> > > you cannot change the structure of the tables etc. in the DB.
> > > Of course, if you are using modules and not objects, you have to alter the code to suit modules (removing self ...).


