[mod_python] Storing large amounts of data in RAM

Graham Dumpleton graham.dumpleton at gmail.com
Wed Oct 24 00:14:20 EDT 2007


On 24/10/2007, Aaron Gallagher <habnabit at gmail.com> wrote:
>
> I'm working on a web-application port of a text game, and I'm just wondering
> what the best way for me to implement this is.
>
> Game data is stored in a single python object which is pickleable. I'm not
> sure of the exact size of the data in memory, but the pickled files are
> between 600 and 900k uncompressed. I would like to store each object in
> memory rather than saving and loading the object for each request. So, what
> is the best way to do this? I'm pretty sure I remember from earlier posts
> that it's not possible to have memory that's shared between multiple
> interpreters. Is it possible to only use one interpreter and keep all of the
> data objects in that interpreter?

I think you mean shared between instances of the same named interpreter
in different processes, such as Apache creates on UNIX systems. On
Windows, Apache runs a single multithreaded process, so the problem
doesn't arise there.

So, unless you are targeting Windows only, the multiprocess nature of
Apache will cause you issues.

One option, depending on the complexity of your data, may be to use
memcached and store the data in a cache daemon accessible to all the
processes.
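A minimal sketch of that approach. The GameStore class and its key
scheme are illustrative, not an existing API; the injected cache object
is anything with get/set methods, e.g. memcache.Client(['127.0.0.1:11211'])
from the python-memcached client:

```python
import pickle


class GameStore:
    """Keeps one pickled game object per player in a shared cache."""

    def __init__(self, cache):
        # cache: any object with get(key) and set(key, value), e.g.
        # memcache.Client(['127.0.0.1:11211']) -- injected so the
        # storage backend can be swapped out.
        self.cache = cache

    def save(self, player_id, game):
        # Pickle explicitly so any get/set-style backend can hold it.
        self.cache.set('game:%s' % player_id, pickle.dumps(game))

    def load(self, player_id):
        data = self.cache.get('game:%s' % player_id)
        return pickle.loads(data) if data is not None else None
```

Note memcached is a separate daemon which all the Apache child
processes talk to over a socket, so they all see the same copy of the
data, but it is a cache: entries can be evicted, so you still need to
persist anything you can't afford to lose.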

> Also, other than having a cron job send an HTTP request to the server
> intermittently, is there a way to have data objects save themselves to a
> file if their player hasn't made a request in a certain amount of time?

You could use a distinct thread, created the first time a request comes
in for the application. This too is problematic in a multiprocess web
server: although one process may not have had a request for a while,
others may have.
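Within a single process, such a thread could look like the following
sketch. The IdleSaver class, timing values and save_func callback are
all made up for illustration; save_func would be where you pickle the
player's object to disk:

```python
import threading
import time


class IdleSaver:
    """Background thread that saves players idle longer than `timeout`."""

    def __init__(self, save_func, timeout=300.0, interval=60.0):
        self.save_func = save_func      # called with each idle player_id
        self.timeout = timeout          # seconds of inactivity before save
        self.interval = interval        # how often to check
        self.last_seen = {}             # player_id -> time of last request
        self.lock = threading.Lock()
        self._stop = threading.Event()

    def touch(self, player_id):
        # Call this from the request handler on every request.
        with self.lock:
            self.last_seen[player_id] = time.time()

    def _run(self):
        # wait() doubles as the sleep and the stop signal.
        while not self._stop.wait(self.interval):
            now = time.time()
            with self.lock:
                idle = [p for p, t in self.last_seen.items()
                        if now - t > self.timeout]
                for p in idle:
                    del self.last_seen[p]
            for p in idle:
                self.save_func(p)       # e.g. pickle the object to disk

    def start(self):
        t = threading.Thread(target=self._run)
        t.daemon = True                 # don't block process shutdown
        t.start()

    def stop(self):
        self._stop.set()
```

Remember that each Apache child process would run its own copy of this
thread, each seeing only the requests that landed in that process.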

If you really need everything to be done in one process, one option is
to run a backend Python server process which embeds an XML-RPC server,
and have your web interface make XML-RPC requests to the game engine
running in it.
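A sketch of such a backend, using the standard library XML-RPC server
(shown with the modern xmlrpc.server module name; in Python 2 it was
SimpleXMLRPCServer). The GameEngine class, its move method and the port
are hypothetical:

```python
from xmlrpc.server import SimpleXMLRPCServer


class GameEngine:
    """All game state lives here, in one process, so nothing is shared."""

    def __init__(self):
        self.games = {}     # player_id -> game state

    def move(self, player_id, command):
        # Every registered public method becomes an XML-RPC method.
        game = self.games.setdefault(player_id, {'log': []})
        game['log'].append(command)
        return 'ok: %s' % command


def serve(port=8888):
    server = SimpleXMLRPCServer(('127.0.0.1', port), logRequests=False)
    server.register_instance(GameEngine())
    server.serve_forever()
```

The mod_python handlers then become thin clients, calling methods via
xmlrpc.client.ServerProxy('http://127.0.0.1:8888') instead of touching
the game objects directly.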

Another option is to not use mod_python, and instead base your web
application on a framework capable of being hosted on any WSGI server.
Then run it in a standalone Python server process and use Apache to
proxy requests to it.
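Because a standalone WSGI process is the only process, module-level
state is safe to keep in memory. A toy sketch (the URL scheme and
per-player dict are invented for illustration):

```python
from wsgiref.simple_server import make_server

# In-process game state: safe only because a single process serves
# every request.
GAMES = {}


def application(environ, start_response):
    # Treat the path as the player name, e.g. /alice
    player = environ.get('PATH_INFO', '/').lstrip('/') or 'anonymous'
    game = GAMES.setdefault(player, {'moves': 0})
    game['moves'] += 1
    body = ('player %s, move %d' % (player, game['moves'])).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


if __name__ == '__main__':
    # Apache would proxy to this, e.g. ProxyPass / http://127.0.0.1:8000/
    make_server('127.0.0.1', 8000, application).serve_forever()
```

wsgiref is fine for development; for production you would put a more
robust WSGI server behind the Apache proxy.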

Basing it on WSGI, you could also then use the daemon mode of mod_wsgi,
but run with a single managed process. That way you don't have the
hassle of working out how to run up the backend process yourself, as
Apache will manage it for you.
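The key mod_wsgi directives for that are WSGIDaemonProcess and
WSGIProcessGroup; the paths and names below are placeholders:

```apache
# One managed daemon process, many threads, so all requests
# see the same in-memory game objects.
WSGIDaemonProcess gamesite processes=1 threads=15
WSGIProcessGroup gamesite
WSGIScriptAlias / /usr/local/www/game/app.wsgi
```

Since all threads share the one interpreter, you still need locking
around the game objects, but there is no cross-process sharing problem.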

For further information on data sharing issues in relation to
process/threading model of Apache see:

  http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading

This talks about it in the context of mod_wsgi, but is more up to date
and accurate than the older version of the document I did for mod_python.

Graham
