Re: [mod_python] Sanitizing user input... but not totally.

Martin MOKREJŠ mmokrejs at ribosome.natur.cuni.cz
Wed Nov 23 17:26:15 EST 2005


I have read somewhere that translate() with a table from string.maketrans()
is the most efficient way to do this. With the following I:
1) convert 'U' or 'u' into 'T' or 't'
2) delete any digits and spaces at the same time

import string

_sequence = _sequence.translate(string.maketrans('Uu', 'Tt'), ' 0123456789')
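
For anyone on Python 3, where string.maketrans() no longer exists, roughly
the same thing can be sketched with str.maketrans(), whose optional third
argument lists the characters to delete. The name normalize_sequence and the
sample input are only illustrative, not part of any existing code.

```python
# Python 3 sketch of the same idea; normalize_sequence is a made-up name.
# str.maketrans(from, to, delete): map U->T and u->t, drop digits and spaces.
_TABLE = str.maketrans('Uu', 'Tt', ' 0123456789')


def normalize_sequence(sequence):
    """Return sequence with U/u replaced by T/t and digits/spaces removed."""
    return sequence.translate(_TABLE)


print(normalize_sequence('1 acgu ACGU 61'))  # prints acgtACGT
```

Building the table once at module level avoids recreating it on every call.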

Benchmark it yourself.
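
One way to run such a benchmark is the standard timeit module. The sketch
below (Python 3, with hypothetical helper names and an arbitrary test string)
compares translate() against a character-by-character join that does the
same job. No timings are claimed here, since they depend on the machine and
on the input.

```python
import timeit

# Translation table: U->T, u->t, delete digits and spaces.
TABLE = str.maketrans('Uu', 'Tt', ' 0123456789')
DROP = set(' 0123456789')
SWAP = {'U': 'T', 'u': 't'}


def with_translate(seq):
    """Single pass through str.translate()."""
    return seq.translate(TABLE)


def with_join(seq):
    """Same result, built one character at a time."""
    return ''.join([SWAP.get(c, c) for c in seq if c not in DROP])


# Arbitrary sample input, chosen only for illustration.
SAMPLE = '1 acgu ACGU 61 ' * 1000

if __name__ == '__main__':
    # Both variants must agree before their speed is compared.
    assert with_translate(SAMPLE) == with_join(SAMPLE)
    for func in (with_translate, with_join):
        seconds = timeit.timeit(lambda: func(SAMPLE), number=100)
        print(func.__name__, seconds)
```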
Martin


Anthony L. wrote:
> Thanks for the suggestions. I am looking into the xml.sax.saxutils 
> modules Jim mentioned as well as Python's HTML parsing module.
> 
> I recalled an earlier technique I used for blocking undesirable 
> characters using JavaScript, and came up with the following:
> 
>     x = 'U@s%er#_N$a^m!e%-<'
>     
>     set = '0123456789 _-abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.'
> 
>     print ''.join( [c for c in x if c in set] )
> 
> This is very simple, easily extensible by customizing the set variable. 
> It allows me to keep in all my allowable characters including 
> underscores and dashes. If I wanted to use this to sanitize a user input 
> for an email address, I could add the period and at sign (@) to the set 
> variable. This feels pretty fast, though I can't quantify speed, and I'm 
> tempted to use it everywhere I don't need to be conscious of markup code 
> and SQL injection. But is this a good Pythonic way of doing things?
> 
> Also, unicode. I'd like to allow input using characters from the Latin-1 
> set, but I can't figure out how. I did the following:
> 
>     x = u'Üsernåmë'
> 
>     set = u'åëüÜabcdefghijklmnopqrstuvwxyz'
>     
>     print ''.join( [c for c in x if c in set] )
> 
> But it's not enough and raises a UnicodeEncodeError. Of course, 
> it's probably not even proper to include the actual decoded characters 
> within the variable. Can someone point me to an exhaustive source for 
> Unicode in Python that has many examples?
> 
> Anthony

-- 
Martin Mokrejs
Email: 'bW9rcmVqc21Acmlib3NvbWUubmF0dXIuY3VuaS5jeg==\n'.decode('base64')
GPG key is at http://www.natur.cuni.cz/~mmokrejs

