[mod_python] Session class or Memcache to reduce load?

Graham Dumpleton graham.dumpleton at gmail.com
Tue Aug 7 04:42:12 EDT 2007


On 07/08/07, Alec Matusis <matusis at matusis.com> wrote:
> Hi Colin, thanks for the tip.
>
> I converted to req.sendfile, and then we run into issues with browser
> caching of these thumbnail images.

The problem may be that if your URL actually maps to a physical file
which is the handler, then Apache may be using attributes of that
handler file in the response headers.

If you are having issues with caching occurring when you don't want it
to happen, set:

  req.no_cache = 1

This should cause Apache to set appropriate headers to prevent caching
by intermediaries or the browser.

> When I use req.sendfile, is there a way to retrieve the last modification
> time of the image file automatically (without using os.stat(image_path))

No way of avoiding the stat() call when using req.sendfile().

> and
> send it to the client in Last-Modified header?

In preference to setting the header directly, see req.update_mtime() and
req.set_last_modified(). Also look at req.meets_conditions() in case
it is useful. These methods just wrap the Apache API functions, i.e.,
ap_update_mtime(), ap_set_last_modified() and ap_meets_conditions(), so
you may have to find some documentation on Apache itself to work out
how to use them, as the mod_python documentation is pretty sparse in
that respect. All the same, make sure you look through:

  http://www.modpython.org/live/current/doc-html/pyapi-mprequest.html

> It looks like I would have to do this additional os.stat(image_path) per
> thumbnail retrieval to make correct use of If-Modified-Since client header
> to keep the browser cache current.

The past mailing list post that was referenced, about setting
req.filename and req.finfo and then returning apache.DECLINED, should
allow Apache to do a lot of this stuff for you. So perhaps give it a
try and see what it does.

Graham

> > -----Original Message-----
> > From: Colin Bean [mailto:ccbean at gmail.com]
> > Sent: Monday, August 06, 2007 2:45 PM
> > To: Alec Matusis
> > Cc: mod_python at modpython.org
> > Subject: Re: [mod_python] Session class or Memcache to reduce load?
> >
> > Hey Alec,
> >
> > For starters, you should use req.sendfile instead of manually reading
> > / sending the file in Python.
> > I believe you'd still have to set the content type manually, so you'd
> > still have to determine that.  This looks like it would be an easy
> > change to implement, and hopefully will help your performance
> > immediately.
> >
> > Graham described a potentially faster method here (with the warning
> > that it's "slightly theoretical" -- I've not tried it, but reading the
> > thread might be helpful)
> > http://www.modpython.org/pipermail/mod_python/2007-July/024061.html
> >
> > -Colin
> >
> >
> > On 8/6/07, Alec Matusis <matusis at matusis.com> wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Graham Dumpleton [mailto:graham.dumpleton at gmail.com]
> > > > Sent: Monday, August 06, 2007 2:25 AM
> > > > > Hello, I am sorry if my question is too basic. I would like to
> > reduce
> > > > the
> > > > > load on apache 2.0 (on 2.6.9 linux) that is running with prefork
> > MPM.
> > > > > There are two main things that are causing the load:
> > > >
> > > > How do you know that these two things are causing the load?
> > > >
> > >
> > > Because all this apache server (this physical machine) does is
> > serving
> > > thumbnails.
> > >
> > > > > a) Thumbnail images that are requested repeatedly
> > > >
> > > > Are you serving up the thumbnails from Python code dynamically or
> > > > allowing Apache to serve them up from a static file?
> > >
> > > Dynamically.
> > >
> > > >
> > > > > b) A simple DB query is necessary to locate an image file after
> > the
> > > > request.
> > > > > The result of the query does not change. DB is located on another
> > > > machine.
> > > > >
> > > > > My first question, do we need to cache thumbnail images at all,
> > > > > or is the file-caching by the OS sufficient?
> > > >
> > > > OS file caching probably will not make much difference. Where one
> > can
> > > > waste a lot of cycles though is by using Python code to serve up
> > the
> > > > images. Thus, if Python is involved in serving up the images, how
> > is
> > > > it being done?
> > > >
> > >
> > > The URI in the client's request contains an image code. Python
> > queries the
> > > DB (on a separate machine) to convert the URI into the image file
> > location,
> > > and then uses
> > >
> > >         f =  open(file_path)
> > >         im_data = f.read()
> > >         f.close()
> > >
> > > then it determines image type using PIL
> > >
> > >           im = Image.open(file_path)
> > >           im_type = im.format.lower()
> > >
> > > (Image.open() is a lazy operation in PIL, so I think it does not have
> > to do
> > > much to determine the image format)
> > > Then python writes the data to the client
> > >
> > >         self.req.content_type = 'image/' + im_type
> > >         self.req.headers_out['Content-Length'] = str(len(im_data))
> > >         self.dict['bin_data'] = im_data
> > >
> > >
> > > > > Second question, to cache the results of the query, should we use
> > > > > mod_python's Session class (which will use DbmSession since we
> > > > > are using prefork MPM), or memcache?
> > > >
> > > > Traditionally people use memcached for this.
> > > >
> > > Why? For performance reasons?
> > >
> > > > > For certain reasons in the application logic we cannot use
> > apache's
> > > > > mod_cache.
> > > > What reasons?
> > > >
> > >
> > > Because sometimes the images are dynamically banned (flagged) by
> > users, and
> > > in that case we need to render a special image. The DB query that we
> > use
> > > gives None for the image file path when it's banned. So when the
> > image
> > > status changes, we will need to dynamically purge the file path of
> > > the image that is cached.
> > >
> > > > Graham
> > >
> > > _______________________________________________
> > > Mod_python mailing list
> > > Mod_python at modpython.org
> > > http://mailman.modpython.org/mailman/listinfo/mod_python
> > >