Jim Gallacher
jpg at jgassociates.ca
Tue Nov 22 13:00:26 EST 2005
Anthony L. wrote:
> Does mod_python or even Python have a function for sanitizing user
> input? I wrote the following code which does the trick, but it only
> allows alphanumeric characters:
>
> s = '#/A%(.n=t$h^&o@n*":;<.?/-_+=!y)'
> return filter(lambda c:c.isalnum(), s)
>
> This does a great job of cleaning up the string, but I would like to
> extend it to allow certain extra characters as I desire. For example,
> in certain inputs I want to allow underscores, tildes, periods, and
> spaces.
>
> Am I better off using regular expressions? It's my understanding that
> the above code is better performance and load wise versus regular
> expressions where there are many simultaneous user inputs.
>
> I'd like to remain positive by building a whitelist of acceptable
> characters rather than to blacklist every possible bad character, but I
> would like to code unicode support as well, so I am nervous about
> scanning each and every character in a user input to see if it is not
> on the allowable list... seems inefficient.

Have you taken a look at escape() in the xml.sax.saxutils Python module?

Jim
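
For reference, here is a minimal sketch of both ideas from this thread: a whitelist filter extended with the extra characters mentioned in the question, and escape() from xml.sax.saxutils. The names ALLOWED, sanitize and sanitize_unicode are hypothetical and not part of mod_python or the standard library. The sketch assumes Python 3, where filter() returns an iterator rather than a string, so it joins the kept characters explicitly.

```python
import string
from xml.sax.saxutils import escape

# Hypothetical whitelist: ASCII letters and digits plus the extras
# named in the question (underscore, tilde, period, space).
ALLOWED = frozenset(string.ascii_letters + string.digits + "_~. ")


def sanitize(text, allowed=ALLOWED):
    """Keep only characters found in the whitelist (hypothetical helper)."""
    return "".join(c for c in text if c in allowed)


def sanitize_unicode(text, extra="_~. "):
    """Keep anything str.isalnum() accepts, plus the extra characters.

    str.isalnum() is Unicode-aware, so accented letters survive.
    """
    return "".join(c for c in text if c.isalnum() or c in extra)


print(sanitize("a_b~c.d e!@#"))      # a_b~c.d e
print(sanitize_unicode("héllo wörld!"))  # héllo wörld

# escape() does not remove anything; by default it replaces
# &, < and > with their XML entities.
print(escape("<a & b>"))             # &lt;a &amp; b&gt;
```

Membership tests against a frozenset are constant time per character, so scanning every character of an input is not as costly as it may sound. Note that the two approaches solve different problems: the whitelist discards characters, while escape() keeps them and encodes them for safe output in markup.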