[mod_python] Apache, Threading and Multi-Processing Modules

kevin douglas fitnah55 at hotmail.com
Wed Jun 11 15:40:16 EST 2003


As far as I understand, Apache 2 was supposed to fix the memory pooling
issue, or at least make pooling possible. I don't think an MMDBMS with
an effective Apache interface is a reality yet, but that's certainly
the intent.

For connection pooling and the like you obviously have to design and
write proper multiplexing code to allocate your resources effectively.
As far as Windows goes, I wouldn't have a clue and wouldn't want to.

I do believe that there is one main interpreter in mod_python and each
thread has sub-interpreters, although with the new worker model it
might be one main interpreter per child and one sub-interpreter per
thread. Regardless, you can munge it to work all in one interpreter, or
a couple of other ways, with the config options (the PythonInterpreter
directive, for one), which should allow memory pooling to some extent.

MySQL just got $15M in financing, though, so I don't think an MMDBMS
module from them is going to be free, if it ever becomes available at
all.

Python mapping objects are pretty easy to cache, though, as long as you
don't have to manage consistency across disparate processes.
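
A minimal sketch of that kind of per-process cache, assuming a plain
module-level dict is good enough. The names `_cache` and `cached` are
made up for illustration and are not part of mod_python. Each Apache
child (and each sub-interpreter) would end up with its own copy, which
is exactly the consistency caveat above.

```python
import threading

# Module-level state lives as long as the interpreter that imported the
# module, so it survives across requests served by the same process.
_cache = {}
_cache_lock = threading.Lock()  # only matters under a threaded MPM


def cached(key, compute):
    """Return the cached value for key, calling compute() once on a miss."""
    with _cache_lock:
        if key not in _cache:
            _cache[key] = compute()
        return _cache[key]
```

A handler would call something like `cached("settings", load_settings)`,
where `load_settings` is whatever expensive lookup you want to avoid
repeating.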

just my two cents :p




-----Original Message-----
From: mod_python-bounces at modpython.org
[mailto:mod_python-bounces at modpython.org] On Behalf Of Jonathan Gardner
Sent: Wednesday, June 11, 2003 2:53 PM
To: Paul Robinson; Mod_python at modpython.org
Subject: Re: [mod_python] Apache, Threading and Multi-Processing Modules


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wednesday 11 June 2003 09:45, Paul Robinson wrote:
> Apache has a number of modes of operation when it comes to threading 
> and forking, I would like to understand how these things interact with
> Python subinterpreters
> [http://www.modpython.org/live/current/doc-html/pyapi-interps.html] 
> and issues such as the Python global interpreter lock (GIL) 
> [http://www.python.org/doc/current/api/threads.html].
>

First off, think of each child process as an entirely separate process.
There is *no* *way* for processes to communicate with each other except
through shared memory or pipes. I am no expert on the inner workings of
mod_python, but from reading the documentation it sounds like each
process is entirely independent of the others. Each process can have a
number of "subinterpreters" based on the configuration, but these
subinterpreters are isolated from one another as well.

As far as the GIL is concerned, you really shouldn't worry about it at
all. It is there just to ensure that no thread is caught with its pants
down. Or, in more technical terms, that the state of the Python
interpreter and its associated data is always consistent whenever the
lock is released.

> For example, on a Windows platform where there is a single 
> multi-threaded Apache process (mpm_winnt
> [http://httpd.apache.org/docs-2.0/mod/mpm_winnt.html]) is it correct 
> to say that mod_python would not be able to take advantage of a 
> multi-processor machine due to the GIL?
>

I don't know the details of how Windows machines handle threads, but I
do know that threads are like "lightweight" processes. They can and
will be run on separate processors on a normal OS.

As for whether threads can communicate with each other, the impression
I get from the documentation is that they cannot. It sounds like each
thread will have its own main interpreter, and a number of
sub-interpreters depending on the configuration. This means that there
is no way to communicate among threads via Python, as the Python main
interpreters are separate.

> In another, given Apache running in the prefork MPM
> [http://httpd.apache.org/docs-2.0/mod/prefork.html] - is it a) possible
> or b) useful to have a global, per-Apache-process persistent data
> structure sharing a pool of (threadsafe) database connections. I
> would say not useful since that process will only ever be running a 
> single mod_python request at a time - hence more than one item in the 
> pool would be useless. Given the "worker MPM" 
> [http://httpd.apache.org/docs-2.0/mod/worker.html] however it may be 
> useful but it's not clear to me if it would be possible.
>

I don't think this is possible.

> Taking the specific example of database connections (let me note I 
> have read and believe I understand FAQ 3.3) is it ever useful or 
> possible to share a pool of database connectors, rather than a single 
> connector in the global namespace. I assume that code such as that in 
> FAQ 3.3 would require additional locking mechanisms in order to 
> function correctly in a multi-threaded Apache environment?
>

Within a single Apache thread and process, yes, you can share database
connections. If your handler decides to spawn threads while processing
a request, those threads can share the same database connections within
that Apache thread.

However, I don't think what you really want (independent processes or
threads sharing connections) is possible.
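
For the part that does work (threads inside one process sharing
connections), a rough sketch might look like the following. This is an
illustration only: `ConnectionPool` and `connect` are names I made up,
and sqlite3 stands in for whatever database driver you actually use.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """A fixed-size pool of connections shared by the threads of one process."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks while all are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


def connect():
    # Stand-in for a real database driver's connect() call.
    return sqlite3.connect(":memory:", check_same_thread=False)


pool = ConnectionPool(connect, size=2)

with pool.connection() as conn:
    print(conn.execute("SELECT 1 + 1").fetchone()[0])
```

The queue does the locking, so no two threads ever hold the same
connection at once. The pool still lives in one process; other Apache
children would each build their own.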

> I bet there must be some code in existing projects that does stuff 
> like this. Any pointers?
>

Sorry, I looked into this on my own, both with mod_perl and mod_python,
and there is nothing out there that I could see.

The best solution is to keep the connection alive, and reuse it for new
incoming requests. If the database doesn't like having so many open and
inactive connections, you can just connect at the beginning of the
request and hang up at the end. Some databases have more connection
overhead than others.
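
A minimal sketch of the keep-alive approach, assuming one connection
per process is enough (as it would be under prefork). `get_connection`,
`connect` and `handle_request` are hypothetical names, sqlite3 is only a
stand-in driver, and a real mod_python handler would take a request
object.

```python
import sqlite3

_conn = None  # one connection per process, created on first use


def connect():
    # Stand-in for the real driver's connect() call.
    return sqlite3.connect(":memory:")


def get_connection():
    """Return the process-wide connection, reconnecting if it has gone away."""
    global _conn
    if _conn is not None:
        try:
            _conn.execute("SELECT 1")  # cheap liveness check
        except sqlite3.Error:
            _conn = None  # stale or closed; fall through and reconnect
    if _conn is None:
        _conn = connect()
    return _conn


def handle_request():
    # Hypothetical request handler that reuses the kept-alive connection.
    return get_connection().execute("SELECT 'hello'").fetchone()[0]
```

Under a threaded MPM the global would need a lock around it, or a pool
in its place.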

Remember I said that the only way to talk between processes is via
shared memory or pipes. Shared memory isn't supported well (if at all)
in Python. Pipes are something you are already familiar with -- TCP
sockets are pipes between two processes that can be located on
different servers.

So another solution that I have thought of, but have had no reason to
implement, is a database connection pool server. In this scenario, you
would get a connection to the database server by connecting to the
connection pool server. After the initial connection, the connection
server just relays your commands word for word to the database. When
you disconnect, it puts the connection back into the pool.
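
Purely as a sketch of the idea, and not something I have run against a
real database, the relay could look roughly like this. It assumes the
backend protocol needs no per-client handshake or session state, which
is not true of most real databases. `make_pool_server` and `relay` are
made-up names.

```python
import queue
import selectors
import socket
import socketserver


def relay(client, backend):
    """Copy bytes both ways. Return True if the client hung up first."""
    with selectors.DefaultSelector() as sel:
        sel.register(client, selectors.EVENT_READ, backend)
        sel.register(backend, selectors.EVENT_READ, client)
        while True:
            for key, _ in sel.select():
                data = key.fileobj.recv(4096)
                if not data:
                    return key.fileobj is client
                key.data.sendall(data)  # forward to the other side


def make_pool_server(listen_addr, backend_addr, size):
    """Build a server that lends out `size` pre-opened backend connections."""
    idle = queue.Queue()
    for _ in range(size):
        idle.put(socket.create_connection(backend_addr))

    class Handler(socketserver.BaseRequestHandler):
        def handle(self):
            backend = idle.get()  # wait for a free pooled connection
            try:
                reusable = relay(self.request, backend)
            except OSError:
                reusable = False
            if not reusable:
                # The backend side broke; replace it so the pool stays full.
                backend.close()
                backend = socket.create_connection(backend_addr)
            idle.put(backend)

    class Server(socketserver.ThreadingTCPServer):
        allow_reuse_address = True
        daemon_threads = True

    return Server(listen_addr, Handler)
```

You would start it with something like
`make_pool_server(("127.0.0.1", 6000), ("127.0.0.1", 5432), 4).serve_forever()`,
where both addresses are placeholders. A client that disconnects in the
middle of a reply would leave stale bytes on the pooled connection, and
this sketch makes no attempt to deal with that.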

This isn't too different from a session server, or other kinds of
meta-servers. The main gripe I have with these is that servers are a
pain in the butt to write right, and they are always a nightmare to
manage. And you always have to have a plan for scalability, or it will
eventually bite you.

> Maybe I'm confusing myself at the moment - maybe some other people as 
> well ;-)
>

I found your message to be extremely precise in its wording, with plenty
of useful references. That was both helpful and refreshing.

- -- 
Jonathan Gardner <jgardner at jonathangardner.net>
(was jgardn at alumni.washington.edu)
Live Free, Use Linux!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE+53pvWgwF3QvpWNwRAr5nAKDNvpjSXZ4+0GSWQWh11V2EdbhvjACgyAmP
kvdSO3JZYSfwDGo1XI3JOvY=
=IQH6
-----END PGP SIGNATURE-----
