[mod_python] Re: [patch] vampire 1.1 remove unknown query variables

Graham Dumpleton grahamd at dscpl.com.au
Wed Oct 20 03:54:40 EDT 2004


On Oct 19 13:56, Johannes Erdfelt <johannes at erdfelt.com> wrote:
>
> Subject: [patch] vampire 1.1 remove unknown query variables
>
> I've been using vampire the last couple of days and I like it, but it
> has some problems some code I originally wrote didn't have.
> 
> This patch fixes a problem where unknown query variables (form
> variables) could cause an internal server error.
> 
> Generally, this shouldn't happen and on a well designed site, it's the
> users error for causing this to occur, but it causes some undue alarm
> when looking at the logs, so I wrote up this patch to remove unknown
> query variables before applying it to the called handler.
> 
> It also prints out an error message if variables are required by the
> handler, but aren't given by the client (it doesn't have a default)
> 
> It applies to vampire 1.1
> 
> JE
> 
> diff -ur vampire-1.1-20041009.orig/packages/vampire/apache.py vampire-1.1-20041009/packages/
vampire/apache.py
> --- vampire-1.1-20041009.orig/packages/vampire/apache.py
	2004-10-08 18:31:50.000000000 -0700
> +++ vampire-1.1-20041009/packages/vampire/apache.py	2004-10-19 13:55:48.000000000 -0700
> @@ -171,4 +171,40 @@
>  
>    # Execute the content handler.
>  
> -  return apply(function,(req,),args)
> +  # Match up the arguments given by the client to the expected arguments
> +  # from the method. We only remove non expected names and don't check for
> +  # expected because the argument may have a default if not set. We use
> +  # exceptions to catch the case where an argument does not have a default.
> +  fc = function.func_code
> +  expected = fc.co_varnames[0:fc.co_argcount]
> + 
> +  # Silently remove any unexpected arguments if we need to
> +  if not fc.co_flags & 0x000C:  # CO_VARARGS | CO_VARKEYWORDS
> +    for name in args.keys():
> +      if name not in expected:
> +        del args[name]
> +
> +  try:
> +    return apply(function,(req,),args)
> +  except TypeError, vars:
> +    missing = []
> +
> +    # Don't worry about the arguments with defaults
> +    argcount = fc.co_argcount
> +    if function.func_defaults:
> +      argcount = argcount - len(function.func_defaults)
> +    # Skip the first argument, which is the req
> +    for name in fc.co_varnames[1:argcount]:
> +      if name not in args:
> +        missing.append(name)
> +
> +    if not len(missing):
> +      raise
> +
> +    # We definately had some missing variables, let's let the user know
> +    req.content_type = "text/plain"
> +    req.status = apache.HTTP_INTERNAL_SERVER_ERROR
> +    req.send_http_header()
> +    req.write("Call is missing these variables: %s\n" % ", ".join(missing))
> +
> +    return apache.OK

After looking at this and playing with it a bit, I am actually sort of inclined
to do the check even before the call of the handler even though it introduces
a little bit more overhead. Ie., something like:

@@ -168,6 +169,34 @@
       if len(args[arg]) == 1:
         args[arg] = args[arg][0]
 
+    fc = function.func_code
+
+    expected = fc.co_varnames[0:fc.co_argcount]
+
+    if not fc.co_flags & 0x08:
+      for name in args.keys():
+       if name not in expected:
+         del args[name]
+
+    # Following identifies missing form parameters
+    # and returns a HTTP_BAD_REQUEST instead of a
+    # HTTP_INTERNAL_SERVER error. This at least gives
+    # a bit of an indication it was because of the
+    # request and not a problem with the server.
+
+    missing = []
+
+    argcount = fc.co_argcount
+    if function.func_defaults:
+      argcount = argcount - len(function.func_defaults)
+
+    for name in fc.co_varnames[1:argcount]:
+      if name not in args:
+       missing.append(name)
+
+    if missing:
+      return apache.HTTP_BAD_REQUEST
+
   # Execute the content handler.
 
   return apply(function,(req,),args)

BTW, the code which drops out form parameters that weren't expected
should probably only look for CO_VARKEYWORDS and not CO_VARARGS
as well.

Doing it the above way does mean that all one gets back is HTTP_BAD_REQUEST,
but this might at least be better than HTTP_INTERNAL_SERVER_ERROR as
it indicates that it is a problem with the request rather than the server. But
then, if the HTML page with the bad form post was on the server, one might
say it is a server error anyway as the user didn't formulate the request, your
HTML page did and thus it is your fault and not the users. If this is the case,
no point having the check at all.

One other thought I had was that the above forms processing code in total
could be a fallback in some way. In other words, only done if the code file
containing the content handler didn't provide some form of form handler 
or validation routine.

Think about how mod_python already has the concept of different phases of
processing. At the moment the one just before the content handler being
executed is defined by PythonFixupHandler.

What could perhaps be done with Vampire is between the fixup handler
and the content handler introduce a new phase of processing for dealing
with forms. This would be realised by the same code file as contains the
content handler defining a formhandler() method.

This new method could be passed the req object as with other handlers and
its job would be to use whatever mechanism it wants to validate the form
data and then put it into an appropriate way for supplying as arguments
to the actual content handler. These arguments would be passed back to
Vampire for use within the req object.

Thus, where Vampire finally calls:

  return apply(function,(req,),args)

the "args" would be what the formhandler() had prepared and passed back
through req.

When the formhandler() doesn't like the data, can generate appropriate error
respones if required and for example give back a list of missing fields in a
page if desired.

Hmmm, maybe this is a silly idea. The only thing it gives you, is that if you
want to have form parameters passed as arguments to the content handler,
is a way of generating a better error response if any are missing in the form.
All the other things can still be done as an initial phase of the content handler
anyway.

Based on other emails, it seems that if you want a robust forms processing
system, you are better off just accepting a req object and then use some
sort of form validation suite. Thus, any support for automatically decoding
form parameters in the server-framework (as someone else called it), is
redundant really.

I guess that if you really wanted all the form parameters to appear as normal
variables within the scope of the content handler, rather than doing it as
function arguments, the forms validation code could be passed locals() and
it could stick them straight into the scope of the function. Ie.,

  def validator(req,locals):
    # ...

    locals["user"] = req.form["user']

  def handler(req):
    validator(req,locals())
    if user = "...":
      ...

Anyone do anything like this in their form validator, or is it simpler to still access
it out of the form?

Anyway, people are probably getting sick of my rambling by now. I think I will
just add the kwargs support I need to add for now and leave it be for a while.
There are so many ways of still doing this within the context of the content
handler that any special support isn't really required. Any use of form values
passed as actual arguments to a content handler seems only to be useful for
simple stuff anyway.

--
Graham Dumpleton (grahamd at dscpl.com.au)


More information about the Mod_python mailing list