[mod_python] Sanitizing user input... but not totally.

Tue Nov 22 12:59:54 EST 2005

On 11/22/05, Anthony L. <anthony at ataribaby.org> wrote:
> Does mod_python or even Python have a function for sanitizing user
> input?

Sanitizing is usually very application-specific, both in which
patterns of characters are acceptable and in what to do
if the rules aren't met (simply remove the offending characters,
replace them with something else, show an error page, etc.)

So it's usually best for each application to provide its own
functions to do this.
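
To make the "what to do if the rules aren't met" choice concrete,
here is a minimal sketch of two policies for the same rule.  The
rule itself (usernames of 1-32 ASCII letters, digits, '_' or '-')
and the function names are made up for illustration; your
application's rules will differ.

```python
import re

# Hypothetical rule: 1-32 ASCII letters, digits, '_' or '-'.
_ALLOWED = re.compile(r'[A-Za-z0-9_-]{1,32}')
_DISALLOWED = re.compile(r'[^A-Za-z0-9_-]')


def validate_username(value):
    """Reject policy: raise ValueError unless the whole value matches."""
    if _ALLOWED.fullmatch(value) is None:
        raise ValueError('invalid username: %r' % (value,))
    return value


def strip_username(value):
    """Strip policy: drop disallowed characters, then cap the length."""
    return _DISALLOWED.sub('', value)[:32]
```

The reject policy suits cases where you want to show an error page;
the strip policy quietly changes what the user typed, which may or
may not be acceptable.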

However, you should be aware of some common mechanisms
in the libraries you may import.  For instance,

In myghty:
  m.apply_escapes()

In cgi:
  cgi.escape()

In urllib (urllib.parse in Python 3):
  urllib.quote(), urllib.quote_plus()

In DB-API modules (most Python database interfaces):
  db.escape()   # name varies by driver; use for embedding
                # values in SQL statements

For Unix shell commands (if you're invoking external apps):
  shlex.quote()   # pipes.quote() before Python 3.3
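
As a rough illustration of how the escaping differs by context,
here is a sketch using Python 3 standard-library equivalents:
html.escape() for HTML, urllib.parse.quote() and quote_plus() for
URLs, and shlex.quote() for shell words.  For SQL it swaps in a
different technique, bound parameters, rather than a driver escape
function, and it assumes sqlite3 purely as a stand-in database.
The sample input and table name are invented.

```python
import html
import shlex
import sqlite3
from urllib.parse import quote, quote_plus

user_input = '<b>Tom & "Jerry"</b>; rm -rf /'

# HTML context: html.escape() is the Python 3 successor to cgi.escape().
as_html = html.escape(user_input)

# URL context: quote() for path segments ('/' is left alone by default),
# quote_plus() for query-string values (space becomes '+').
as_path = quote('a b/c')
as_query = quote_plus('a b&c=d')

# Shell context: shlex.quote() turns the value into one shell word.
as_shell_arg = shlex.quote(user_input)

# SQL context: a bound parameter instead of string escaping. The value
# is passed separately and is never spliced into the SQL text.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE notes (body TEXT)')
conn.execute('INSERT INTO notes (body) VALUES (?)', (user_input,))
stored = conn.execute('SELECT body FROM notes').fetchone()[0]
conn.close()
```

The point is that the same input needs a different treatment for
each destination, so there is no single "sanitize" call that covers
them all.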

Also, if you want to make sure filenames are "safe", you
may also want to apply this test:
  import os

  def is_filename_safe(f):
      # Reject empty names, dotfiles (including '.' and '..'),
      # and anything containing a path separator.
      if not f or f.startswith('.') or os.path.sep in f:
          return False
      return True

> Am I better off using regular expressions? It's my understanding that
> the above code is better performance and load wise versus regular
> expressions where there are many simultaneous user inputs.

Performance depends on what kind of filtering you are doing.
If you do use regular expressions, though, you should try to use
compiled regular expression objects.  That eliminates most of
the expensive part of using re's on the second and subsequent
uses.
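
A minimal sketch of the compiled-pattern approach, written for
Python 3.  The pattern (keep only ASCII letters, digits, and
spaces) and the names UNWANTED and clean() are just examples:

```python
import re

# Compiled once at import time and reused on every call.
# Hypothetical rule: anything other than ASCII letters, digits,
# and spaces is unwanted.
UNWANTED = re.compile(r'[^A-Za-z0-9 ]+')


def clean(text):
    """Remove runs of unwanted characters using the precompiled pattern."""
    return UNWANTED.sub('', text)
```

With this, clean('Hello, <World>!') returns 'Hello World'.  The
pattern is parsed once when the module loads rather than looked up
or rebuilt on each request.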

Also look at the standard function string.translate().  It may do
much of what you want.
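
For example, a translation table can replace some characters and
delete others in one pass.  This sketch uses the Python 3 spelling,
str.maketrans() and str.translate(); the set of characters to
delete is an arbitrary blacklist chosen for illustration, and
scrub() is a made-up name:

```python
# Characters to delete outright (hypothetical blacklist).
DELETE = '<>"\'`;'

# Build the table once: map tab and newline to a space, and delete
# everything listed in DELETE.
TABLE = str.maketrans('\t\n', '  ', DELETE)


def scrub(text):
    """Apply the translation table to text in a single pass."""
    return text.translate(TABLE)
```

Here scrub('a\tb<c>;d') returns 'a bcd'.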

--
Deron Meranda