Graham Dumpleton
grahamd at dscpl.com.au
Fri Jul 21 19:28:13 EDT 2006
On 22/07/2006, at 6:53 AM, Rob Miller wrote:
> hi,
>
> i know python pretty well, but i'm a mod_python (mod_*, really)
> beginner, and i'm having some trouble getting an output filter to
> work for me. i'm trying to apply a filter to content that is being
> served from a CMS and proxied through apache using RewriteRules.
> i've got it working, sort of, but there are a couple of major snags
> i've hit:
>
> - the content coming from the other system doesn't necessarily have
> a file extension. i can use 'SetOutputFilter" to apply the filter
> to everything, but i don't WANT to apply it to images or other
> binary data. is there a way to use "AddOutputFilter" for files
> with no extension? or to use it as an extension blacklist, rather
> than a whitelist? or would this have to be done as logic in the
> filter itself?
It is probably easier to make the determination in the filter itself.
Rather than:
    try:
        streambuffer = filter.req.streambuffer
    except AttributeError:
        filter.req.streambuffer = StringIO()
        streambuffer = filter.req.streambuffer
have something like:
    try:
        streambuffer = filter.req.streambuffer
    except AttributeError:
        # first time into the filter for this request
        # pass on stuff we don't want to deal with
        if filter.req.notheme:
            # pass on if no theme
            filter.pass_on()
            return
        elif not filter.req.headers_out.has_key("content-type"):
            # pass on if no content type specified
            filter.pass_on()
            return
        elif not filter.req.headers_out["content-type"].startswith("text/html"):
            # pass on if not HTML
            filter.pass_on()
            return
        filter.req.streambuffer = StringIO()
        streambuffer = filter.req.streambuffer
Later on, you can also change:
    if filter.req.notheme:
        filter.write(streambuffer.getvalue())
    else:
        filter.write(appmap.publish(streambuffer.getvalue()))
to just:
    filter.write(appmap.publish(streambuffer.getvalue()))
as you have already passed on control to the next filter in the chain
for stuff you do not want.
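The startswith() test above is case sensitive, so a header such as
"Text/HTML; charset=utf-8" would be passed on unfiltered. If that
matters for your CMS, the check could be pulled out into a small helper.
This is only a sketch: is_html is a made-up name, and it takes the
header value as a plain string so it can be tried outside Apache.

```python
def is_html(content_type):
    """Return True when a Content-Type header value names text/html.

    Hypothetical helper, not part of mod_python.
    """
    if not content_type:
        # Missing or empty header: treat as "not HTML" and pass it on.
        return False
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/html"
```

In the filter, the value you would hand to it is whatever you read from
filter.req.headers_out, as in the code above.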
> - when i apply the filter to static content coming from a hard
> drive, it works very well. when i apply it to content from the
> CMS, however, it is extremely slow. a single page can take
> anywhere from 15 to 45 seconds to return. (note that if i browse
> directly to the CMS the page returns are also quite fast.) it seems
> like a lot of information comes down right away but firefox churns
> as though it's still waiting. when i use wget, the page seems to
> get requested over and over again, with wget never realizing it's
> done. my guess is it has something to do w/ the content-length
> header, but i've deleted it from request before writing to the
> filter object as shown in the examples i've found.
Add logging in your filter to track how long different things take.
I.e., import the time module and sprinkle in:
    filter.req.log_error("timestamp %d %f" %
            (filter.req.connection.id, time.time()))
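If you end up with a lot of these, it may help to tag each one so the
log lines can be told apart. A rough sketch, where timestamp_message and
the label argument are my own invention and the result is just the
string you would pass to filter.req.log_error():

```python
import time


def timestamp_message(label, connection_id, now=None):
    """Build one log line for a timing checkpoint.

    Hypothetical helper: 'label' names the checkpoint, and 'now' can be
    supplied explicitly (defaults to the current time).
    """
    if now is None:
        now = time.time()
    return "timestamp %s %d %f" % (label, connection_id, now)
```

For example, log timestamp_message("before publish", ...) and
timestamp_message("after publish", ...) around the appmap.publish()
call, then subtract the two times for the same connection id.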
If you want the content length put back, set up Apache to pass the
output through the "CONTENT_LENGTH" filter as well. Because you
accumulate all the data into one block, it will work. You could also
just calculate it yourself and add it back.
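Calculating it yourself amounts to taking the length of the final
output in bytes. A minimal sketch, using a plain dict as a stand-in for
the response headers; with_content_length is a made-up name, and the
UTF-8 encoding of text input is an assumption about your content:

```python
def with_content_length(headers, body):
    """Return a copy of 'headers' with Content-Length set for 'body'.

    Hypothetical helper. 'headers' is a plain dict standing in for the
    response headers; 'body' is the complete transformed output.
    """
    if isinstance(body, str):
        # Assumption: text is sent as UTF-8. The header counts bytes,
        # not characters, so encode before measuring.
        body = body.encode("utf-8")
    updated = dict(headers)
    updated["Content-Length"] = str(len(body))
    return updated
```

In the real filter I would expect the header to need setting on
filter.req.headers_out before the first filter.write() call, since the
headers go out with the first data, but check that against your setup.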
> i'm using mod_python 3.1.4 and apache 2.0.55 from ubuntu dapper.
> the code of my filter is here: http://codespeak.net/svn/z3/
> deliverance/branches/namespaced/mpfilter.py
>
> anyone have suggestions, or pointers to docs, that might help me?
Reading the general Apache filter FAQ may or may not be useful. It
isn't mod_python specific, but explains how it works underneath.
http://www.projectcomputing.com/resources/apacheFilterFAQ/
Graham