[mod_python] Sanitizing user input... but not totally.

Jim Gallacher jpg at jgassociates.ca
Tue Nov 22 13:00:26 EST 2005

Anthony L. wrote:
> Does mod_python or even Python have a function for sanitizing user  
> input? I wrote the following code which does the trick, but it only  
> allows alphanumeric characters:
>     s = '#/A%(.n=t$h^&o@n*":;<.?/-_+=!y)'
>         return filter(lambda c:c.isalnum(), s)
> This does a great job of cleaning up the string, but I would like to  
> extend it to allow certain extra characters as I desire. For example,  
> in certain inputs I want to allow underscores, tildes, periods, and  
> spaces.
> Am I better off using regular expressions? It's my understanding that
> the above code performs better and puts less load on the server than
> regular expressions when there are many simultaneous user inputs.
> I'd like to remain positive by building a whitelist of acceptable  
> characters rather than to blacklist every possible bad character, but I
> would like to code Unicode support as well, so I am nervous about
> scanning each and every character in a user input to see if it is not  
> on the allowable list... seems inefficient.
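
On the whitelist idea itself, here is a minimal sketch of one way to extend
the isalnum() filter with a few extra characters. The function name
sanitize and the extra parameter are made up for illustration, and it is
written for Python 3, where str.isalnum() also accepts non-ASCII letters
and digits. Treat it as a starting point rather than a vetted sanitizer.

```python
def sanitize(text, extra="_~. "):
    """Keep alphanumeric characters plus anything listed in `extra`.

    `sanitize` and `extra` are hypothetical names for this sketch.
    str.isalnum() is Unicode-aware in Python 3, so accented letters
    and non-Latin scripts pass the check as well.
    """
    # One pass over the input; each character costs one method call
    # and one small membership test.
    return "".join(c for c in text if c.isalnum() or c in extra)


if __name__ == "__main__":
    print(sanitize('#/A%(.n=t$h^&o@n*":;<.?/-_+=!y)'))
    print(sanitize('#/A%(.n=t$h^&o@n*":;<.?/-_+=!y)', extra=""))
```

Passing extra="" gives the alphanumeric-only behaviour of the original
filter, so each input field can choose its own set of allowed extras. The
scan is linear in the length of the input, which should be fine for
ordinary form fields.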

Have you taken a look at escape() in the xml.sax.saxutils Python module?
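
A short sketch of what that looks like, assuming the goal is to make input
safe to echo back inside HTML or XML rather than to strip characters from
it. escape() replaces &, < and > by default; the optional second argument
is a dict of further replacements. The wrapper name escape_for_markup is
made up for this example.

```python
from xml.sax.saxutils import escape


def escape_for_markup(text):
    """Escape &, <, > and double quotes in `text`.

    `escape_for_markup` is a hypothetical helper name. The extra
    entity for the double quote matters when the value ends up
    inside a double-quoted attribute.
    """
    return escape(text, {'"': "&quot;"})


if __name__ == "__main__":
    raw = '<b>Tom & "Jerry"</b>'
    print(escape(raw))
    print(escape_for_markup(raw))
```

Note that this escapes the input instead of removing anything, so it
complements a whitelist filter rather than replacing it.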


