[mod_python] Sanitizing user input... but not totally.

Anthony L. anthony at ataribaby.org
Tue Nov 22 12:30:44 EST 2005


Does mod_python or even Python have a function for sanitizing user  
input? I wrote the following code which does the trick, but it only  
allows alphanumeric characters:

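    def sanitize(s):
        # Keep only alphanumeric characters; join() keeps the result a string on Python 3.
        return ''.join(filter(lambda c: c.isalnum(), s))
    s = sanitize('#/A%(.n=t$h^&o@n*":;<.?/-_+=!y)')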

This does a great job of cleaning up the string, but I would like to  
extend it to allow certain extra characters as I desire. For example,  
in certain inputs I want to allow underscores, tildes, periods, and  
spaces.
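
A minimal sketch of that extension, assuming a hypothetical helper named `sanitize_extra` (not part of mod_python or the standard library) that takes the extra allowed characters as a per-input argument:

```python
def sanitize_extra(s, extra=''):
    """Keep alphanumeric characters plus any characters listed in `extra`.

    `sanitize_extra` and `extra` are illustrative names, not an existing API.
    """
    allowed = frozenset(extra)  # membership test is a hash lookup per character
    return ''.join(c for c in s if c.isalnum() or c in allowed)


# Allow underscores, tildes, periods, and spaces for this particular input.
print(sanitize_extra('my_file~v1.0 <script>', extra='_~. '))
```

Passing the extra characters as an argument keeps the whitelist approach while letting each input field decide what it accepts.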

Am I better off using regular expressions? My understanding is that
the code above performs better and puts less load on the server than
regular expressions when there are many simultaneous user inputs.
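
For comparison, the regular-expression version might look like the sketch below. The names `_DISALLOWED` and `sanitize_re` are hypothetical, and the character class is ASCII-only by assumption. Which approach is faster depends on the input and the Python version, so it is worth measuring (for example with the standard `timeit` module) rather than assuming.

```python
import re

# Negated character class: matches anything that is NOT an ASCII letter,
# a digit, an underscore, a tilde, a period, or a space.
# Compiled once at import time so each request reuses the same pattern object.
_DISALLOWED = re.compile(r'[^A-Za-z0-9_~. ]')


def sanitize_re(s):
    """Remove every character outside the whitelist (illustrative helper)."""
    return _DISALLOWED.sub('', s)


print(sanitize_re('my_file~v1.0 <script>'))
```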

I'd like to stay positive and build a whitelist of acceptable
characters rather than blacklist every possible bad character. I would
also like to support Unicode, though, so I am nervous about checking
each and every character of a user input against the allowed list...
it seems inefficient.
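
A rough sketch of a Unicode-aware whitelist, assuming Python 3 (where `str.isalnum()` recognizes letters and digits from any script) and a hypothetical helper named `sanitize_unicode`. It makes a single pass over the input, so the cost grows linearly with the length of the string. The input is normalized to NFC first, as an added assumption, so that a letter followed by a combining accent is treated as one letter.

```python
import unicodedata

# Extra characters allowed besides letters and digits (an illustrative choice).
EXTRA_ALLOWED = frozenset('_~. ')


def sanitize_unicode(text, extra=EXTRA_ALLOWED):
    """Keep Unicode letters/digits plus `extra`; drop everything else."""
    # NFC recombines sequences such as 'e' + U+0301 into a single 'é',
    # which isalnum() then accepts as a letter.
    text = unicodedata.normalize('NFC', text)
    return ''.join(ch for ch in text if ch.isalnum() or ch in extra)


print(sanitize_unicode('Zoë_Ångström <b>'))
```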

Anthony

