Graham Dumpleton
grahamd at dscpl.com.au
Wed Oct 20 03:54:40 EDT 2004
On Oct 19 13:56, Johannes Erdfelt <johannes at erdfelt.com> wrote: > > Subject: [patch] vampire 1.1 remove unknown query variables > > I've been using vampire the last couple of days and I like it, but it > has some problems some code I originally wrote didn't have. > > This patch fixes a problem where unknown query variables (form > variables) could cause an internal server error. > > Generally, this shouldn't happen and on a well designed site, it's the > users error for causing this to occur, but it causes some undue alarm > when looking at the logs, so I wrote up this patch to remove unknown > query variables before applying it to the called handler. > > It also prints out an error message if variables are required by the > handler, but aren't given by the client (it doesn't have a default) > > It applies to vampire 1.1 > > JE > > diff -ur vampire-1.1-20041009.orig/packages/vampire/apache.py vampire-1.1-20041009/packages/ vampire/apache.py > --- vampire-1.1-20041009.orig/packages/vampire/apache.py 2004-10-08 18:31:50.000000000 -0700 > +++ vampire-1.1-20041009/packages/vampire/apache.py 2004-10-19 13:55:48.000000000 -0700 > @@ -171,4 +171,40 @@ > > # Execute the content handler. > > - return apply(function,(req,),args) > + # Match up the arguments given by the client to the expected arguments > + # from the method. We only remove non expected names and don't check for > + # expected because the argument may have a default if not set. We use > + # exceptions to catch the case where an argument does not have a default. > + fc = function.func_code > + expected = fc.co_varnames[0:fc.co_argcount] > + > + # Silently remove any unexpected arguments if we need to > + if not fc.co_flags & 0x000C: # CO_VARARGS | CO_VARKEYWORDS > + for name in args.keys(): > + if name not in expected: > + del args[name] > + > + try: > + return apply(function,(req,),args) > + except TypeError, vars: > + missing = [] > + > + # Don't worry about the arguments with defaults > + argcount = fc.co_argcount > + if function.func_defaults: > + argcount = argcount - len(function.func_defaults) > + # Skip the first argument, which is the req > + for name in fc.co_varnames[1:argcount]: > + if name not in args: > + missing.append(name) > + > + if not len(missing): > + raise > + > + # We definately had some missing variables, let's let the user know > + req.content_type = "text/plain" > + req.status = apache.HTTP_INTERNAL_SERVER_ERROR > + req.send_http_header() > + req.write("Call is missing these variables: %s\n" % ", ".join(missing)) > + > + return apache.OK After looking at this and playing with it a bit, I am actually sort of inclined to do the check even before the call of the handler even though it introduces a little bit more overhead. Ie., something like: @@ -168,6 +169,34 @@ if len(args[arg]) == 1: args[arg] = args[arg][0] + fc = function.func_code + + expected = fc.co_varnames[0:fc.co_argcount] + + if not fc.co_flags & 0x08: + for name in args.keys(): + if name not in expected: + del args[name] + + # Following identifies missing form parameters + # and returns a HTTP_BAD_REQUEST instead of a + # HTTP_INTERNAL_SERVER error. This at least gives + # a bit of an indication it was because of the + # request and not a problem with the server. + + missing = [] + + argcount = fc.co_argcount + if function.func_defaults: + argcount = argcount - len(function.func_defaults) + + for name in fc.co_varnames[1:argcount]: + if name not in args: + missing.append(name) + + if missing: + return apache.HTTP_BAD_REQUEST + # Execute the content handler. return apply(function,(req,),args) BTW, the code which drops out form parameters that weren't expected should probably only look for CO_VARKEYWORDS and not CO_VARARGS as well. Doing it the above way does mean that all one gets back is HTTP_BAD_REQUEST, but this might at least be better than HTTP_INTERNAL_SERVER_ERROR as it indicates that it is a problem with the request rather than the server. But then, if the HTML page with the bad form post was on the server, one might say it is a server error anyway as the user didn't formulate the request, your HTML page did and thus it is your fault and not the users. If this is the case, no point having the check at all. One other thought I had was that the above forms processing code in total could be a fallback in some way. In other words, only done if the code file containing the content handler didn't provide some form of form handler or validation routine. Think about how mod_python already has the concept of different phases of processing. At the moment the one just before the content handler being executed is defined by PythonFixupHandler. What could perhaps be done with Vampire is between the fixup handler and the content handler introduce a new phase of processing for dealing with forms. This would be realised by the same code file as contains the content handler defining a formhandler() method. This new method could be passed the req object as with other handlers and its job would be to use whatever mechanism it wants to validate the form data and then put it into an appropriate way for supplying as arguments to the actual content handler. These arguments would be passed back to Vampire for use within the req object. Thus, where Vampire finally calls: return apply(function,(req,),args) the "args" would be what the formhandler() had prepared and passed back through req. When the formhandler() doesn't like the data, can generate appropriate error respones if required and for example give back a list of missing fields in a page if desired. Hmmm, maybe this is a silly idea. The only thing it gives you, is that if you want to have form parameters passed as arguments to the content handler, is a way of generating a better error response if any are missing in the form. All the other things can still be done as an initial phase of the content handler anyway. Based on other emails, it seems that if you want a robust forms processing system, you are better off just accepting a req object and then use some sort of form validation suite. Thus, any support for automatically decoding form parameters in the server-framework (as someone else called it), is redundant really. I guess that if you really wanted all the form parameters to appear as normal variables within the scope of the content handler, rather than doing it as function arguments, the forms validation code could be passed locals() and it could stick them straight into the scope of the function. Ie., def validator(req,locals): # ... locals["user"] = req.form["user'] def handler(req): validator(req,locals()) if user = "...": ... Anyone do anything like this in their form validator, or is it simpler to still access it out of the form? Anyway, people are probably getting sick of my rambling by now. I think I will just add the kwargs support I need to add for now and leave it be for a while. There are so many ways of still doing this within the context of the content handler that any special support isn't really required. Any use of form values passed as actual arguments to a content handler seems only to be useful for simple stuff anyway. -- Graham Dumpleton (grahamd at dscpl.com.au)
|