Julien Cigar
jcigar at ulb.ac.be
Wed Mar 29 02:21:47 EST 2006
Many thanks for all those answers ! Graham Dumpleton wrote: > A few comments on your handler. > > Julien Cigar wrote .. > >> Hello, >> >> I'm writing my first home-made handler (inspired by mod_publisher). >> > > Writing a publisher like handler which allows arbitrary traversal of objects > is non trivial. Get it wrong and you can expose yourself to potential security > holes. Be warned, don't attempt it unless you appreciate what the issues > are. ;-) > > I forgot to say that I don't want traversal of objects. > I also think you mean't mod_python.publisher. The Apache mod_publisher > module is used to perform substitutions on XML/HTML as well as > implementing an alternative to SSI of mod_include. > > >> I wanted to do a "ruby on rails"-like dispatcher, so when I call : >> http://somehost.com/foo/bar it should call the bar() method of the class >> (controller) foo which lives under controllers/ >> Please note that it is not at all complete, I'm just playing with it at >> the moment .. >> > > Why cant you just use mod_python.publisher? > > I could (I've used it for a lot of projects here). In fact I wanted to understand mod_python more deeply by writing my own handler. >> ... >> >> Here is the code of my handler: >> >> #!/usr/bin/python >> >> import sys >> >> from mod_python import apache, util >> >> sys.path.append('/home/jcigar/public_html/invasive_species/application') >> > > Modifying sys.path from handlers, even from global scope in module is > bad. It can cause problems in a multithreaded MPM if mod_python tries to > update sys.path at the same time. > > Further, if module reloading is enabled, everytime the module is > reloaded the directory will be added to sys.path meaning it will grow > and grow and grow with multiple occurrences of the same directory > appearing. > > right... I forgot that >> def handler(req): >> >> req.allow_methods(["GET", "POST"]) >> >> if req.method not in (["GET", "POST"]): >> raise apache.SERVER_RETURN, apache.HTTP_METHOD_NOT_ALLOWED >> >> (controller, method) = ("default", "index") >> >> if req.path_info: >> path_info = req.path_info.strip('/').split('/') >> >> if path_info[0].isalpha(): >> controller = path_info[0] >> >> if len(path_info) > 1 and path_info[1].isalpha(): >> method = path_info[1] >> > > Like mod_python.publisher, you have gone down the path of allowing > defaults and similarly don't deal with trailing slashes in a reasonable > way. This causes various problems due to the fact that different URLs > can be used to address the same resource. The different URLs can > have varying numbers of slashes in them, which makes calculating > relative URLs to another resource fiddly and error prone. > Is there a "good" way to handle relative urls ? > >> config = req.get_config() >> (autoreload, log) = (int(config.get("PythonAutoReload", 1)), >> int(config.get("PythonDebug", 0))) >> >> try: >> module = apache.import_module('controllers.%s' % (controller, ), >> autoreload=autoreload, log=log) >> > > Hope you don't expect autoreload to work on Python packages. It can be > unreliable at best in publically released versions of mod_python and new > module importer in future version will not support it. But then there > are better ways of doing what you are doing when new importer is > available as you will be able to load module by full path. > > so apache.import_module should not be used .. ? >> except ImportError: >> req.log_error('Cannot import controller %s' % (controller, )) >> >> object = getattr(module, method) >> > > You don't reraise the ImportError exception or return any other sort > of explicit error status. Thus, when module can't be imported, it will > still try and access the method from the module, but module variable > will not exist and so it will die. > > Yes, this handler is not complete at all (I was just playing a bit with) > Luckily you don't allow arbitrary traversal and only allow method > name to start with alphabetic character, so no issue with notional > private stuff prefixed with underscore. > > >> if callable(object): >> req.params = util.FieldStorage(req, keep_blank_values=1) >> > > It is traditional that if storing FieldStorage instance in request object > that it be stored as req.form. > > ok :) >> result = util.apply_fs_data(object, req.params, req=req) >> else: >> req.log_error('%s is not callable' % (object, )) >> return apache.HTTP_INTERNAL_SERVER_ERROR >> > > Okay, so only allowing access to callable and not acess to data. > More safe than mod_python.publisher in that respect. > > Yes, only to the function of the "controller" in fact. In the futur I'd like to do like Ruby on Rails or Django, with some regex patterns (see http://www.djangoproject.com/documentation/url_dispatch/) >> if result or req.bytes_sent or req.next: >> > > Do you understand what req.next attribute is? It probably isn't > relevant to you and how you use it is probably wrong in this > case anyway. I should probably check that how mod_python.publisher > even uses it is sensible. > > >> req.content_type = 'text/html' >> > > If data has already been sent, ie., req.bytes_sent, is non zero, > then setting content length here will not do anything as it has > to be set before first write of data. > > >> req.write(result) >> > > Your callables must always ensure they return string objects. > In mod_python.publisher, it will at least convert non string objects > to strings and attempts to treat Unicode strings specially as well. > > Problem with this code is if bytes were sent and result is None, > it will try and write result and die because it isn't a string. > > ok, so req.write(str(result)) >> return apache.OK >> else: >> req.log_error('Nothing returned') >> return apache.HTTP_INTERNAL_SERVER_ERROR >> >> >> In advance, thanks for support ! >> > > Overall, I don't see why you couldn't use mod_python.publisher. > Yes, publisher provides more features, but if you set your code > up correctly, you would still get same result as what you are > aiming for without having to create your own handler. > > Graham > > -- Julien Cigar Belgian Biodiversity Platform http://www.biodiversity.be Université Libre de Bruxelles Campus de la Plaine CP 257 Bâtiment NO, Bureau 4 N4 115C (Niveau 4) Boulevard du Triomphe, entrée ULB 2 B-1050 Bruxelles Work: jcigar at ulb.ac.be Personal: mage at mordor.ath.cx
|