Graham Dumpleton
grahamd at dscpl.com.au
Fri Jul 21 19:28:13 EDT 2006
On 22/07/2006, at 6:53 AM, Rob Miller wrote:

> hi,
>
> i know python pretty well, but i'm a mod_python (mod_*, really)
> beginner, and i'm having some trouble getting an output filter to
> work for me.  i'm trying to apply a filter to content that is being
> served from a CMS and proxied through apache using RewriteRules.
> i've got it working, sort of, but there are a couple of major snags
> i've hit:
>
> - the content coming from the other system doesn't necessarily have
>   a file extension.  i can use "SetOutputFilter" to apply the filter
>   to everything, but i don't WANT to apply it to images or other
>   binary data.  is there a way to use "AddOutputFilter" for files
>   with no extension?  or to use it as an extension blacklist, rather
>   than a whitelist?  or would this have to be done as logic in the
>   filter itself?

It is probably easier to make the determination in the filter itself.
Rather than:

    try:
        streambuffer = filter.req.streambuffer
    except AttributeError:
        filter.req.streambuffer = StringIO()
        streambuffer = filter.req.streambuffer

have something like:

    try:
        streambuffer = filter.req.streambuffer
    except AttributeError:
        # first time into the filter for this request
        # pass on stuff we don't want to deal with
        if filter.req.notheme:
            # pass on if no theme
            filter.pass_on()
            return
        elif not filter.req.headers_out.has_key("content-type"):
            # pass on if no content type specified
            filter.pass_on()
            return
        elif not filter.req.headers_out["content-type"].startswith("text/html"):
            # pass on if not HTML
            filter.pass_on()
            return
        filter.req.streambuffer = StringIO()
        streambuffer = filter.req.streambuffer

Later on, you can also change:

    if filter.req.notheme:
        filter.write(streambuffer.getvalue())
    else:
        filter.write(appmap.publish(streambuffer.getvalue()))

to just:

    filter.write(appmap.publish(streambuffer.getvalue()))

as you have already passed control on to the next filter in the chain
for anything you do not want to handle.
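The three pass-on checks above can also be pulled out into one small function, which makes them easy to try outside Apache. This is only a sketch: `should_pass_on` is a made-up name, and a plain dict stands in for `filter.req.headers_out`, on the assumption that the real table object supports dict-style `.get()`.

```python
def should_pass_on(notheme, headers_out):
    """Return True when the response should go to the next filter untouched.

    notheme     -- stands in for filter.req.notheme
    headers_out -- stands in for filter.req.headers_out (a plain dict here)
    """
    if notheme:
        # theming is switched off for this request
        return True
    content_type = headers_out.get("content-type")
    if content_type is None:
        # no content type specified, so we cannot tell what the body is
        return True
    # only HTML is themed; images and other binary data are passed on
    return not content_type.startswith("text/html")
```

Inside the filter, the `except AttributeError:` branch would then presumably shrink to a single test followed by `filter.pass_on()` and `return`.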
> - when i apply the filter to static content coming from a hard
>   drive, it works very well.  when i apply it to content from the
>   CMS, however, it is extremely slow.  a single page can take
>   anywhere from 15 to 45 seconds to return.  (note that if i browse
>   directly to the CMS the page returns are also quite fast.)  it seems
>   like a lot of information comes down right away but firefox churns
>   as though it's still waiting.  when i use wget, the page seems to
>   get requested over and over again, with wget never realizing it's
>   done.  my guess is it has something to do w/ the content-length
>   header, but i've deleted it from request before writing to the
>   filter object as shown in the examples i've found.

Add logging in your filter to track how long different things take.
I.e., sprinkle in:

    # needs "import time" at the top of the module
    filter.req.log_error("timestamp %d %f" %
            (filter.req.connection.id, time.time()))

If you want the content length put back, set up Apache to pass the
output through the "CONTENT_LENGTH" filter as well. Because you
accumulate all the data into one block, it will work. You could also
just calculate it yourself and add it back.

> i'm using mod_python 3.1.4 and apache 2.0.55 from ubuntu dapper.
> the code of my filter is here:
> http://codespeak.net/svn/z3/deliverance/branches/namespaced/mpfilter.py
>
> anyone have suggestions, or pointers to docs, that might help me?

Reading the general Apache filter FAQ may or may not be useful. It
isn't mod_python specific, but it explains how filters work underneath.

  http://www.projectcomputing.com/resources/apacheFilterFAQ/

Graham
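On calculating the content length yourself: since the filter already accumulates the whole response before writing it, the length is simply the size of the transformed body in bytes. A minimal sketch, not mod_python code: `finish_response` and its arguments are hypothetical names, and `transform` stands in for whatever does the theming (such as `appmap.publish`).

```python
def finish_response(chunks, transform):
    """Join the buffered chunks, transform them, and compute the new length.

    chunks    -- byte strings collected while reading from the filter
    transform -- callable taking and returning a byte string (hypothetical)
    """
    body = transform(b"".join(chunks))
    # Content-Length counts bytes of the final body, not of the original
    headers = {"Content-Length": str(len(body))}
    return body, headers
```

In the real filter the computed value would presumably be assigned to `filter.req.headers_out["Content-Length"]` before the first `filter.write()`; that placement is an assumption worth checking against the timestamps logged above.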