Martin MOKREJŠ
mmokrejs at ribosome.natur.cuni.cz
Wed Nov 23 17:26:15 EST 2005
I have read somewhere that string.maketrans() is the most efficient
approach. With the following I:

  1) convert 'U' or 'u' into 'T' or 't'
  2) while also zapping any digits and spaces

    import string
    _sequence = _sequence.translate(string.maketrans('Uu', 'Tt'), ' 0123456789')

Benchmark it yourself.
Martin

Anthony L. wrote:
> Thanks for the suggestions. I am looking into the xml.sax.saxutils
> modules Jim mentioned as well as Python's HTML parsing module.
>
> I recalled an earlier technique I used for blocking undesirable
> characters using JavaScript, and came up with the following:
>
> x = 'U@s%er#_N$a^m!e%-<'
>
> set = '0123456789_-abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.'
>
> print ''.join( [c for c in x if c in set] )
>
> This is very simple and easily extensible by customizing the set variable.
> It allows me to keep all my allowable characters, including
> underscores and dashes. If I wanted to use this to sanitize user input
> for an email address, I could add the period and at sign (@) to the set
> variable. This feels pretty fast, though I can't quantify the speed, and I'm
> tempted to use it everywhere I don't need to be conscious of markup code
> and SQL injection. But is this a good Pythonic way of doing things?
>
> Also, Unicode. I'd like to allow input using characters from the Latin-1
> set, but I can't figure out how. I did the following:
>
> x = u'Üsernåmë'
>
> set = u'åëüÜabcdefghijklmnopqrstuvwxyz'
>
> print ''.join( [c for c in x if c in set] )
>
> But it's not enough and will raise a UnicodeEncodeError. Of course,
> it's probably not even proper to include the actual decoded characters
> within the variable. Can someone point me to an exhaustive source on
> Unicode in Python that has many examples?
>
> Anthony
>
>
> _______________________________________________
> Mod_python mailing list
> Mod_python at modpython.org
> http://mailman.modpython.org/mailman/listinfo/mod_python
>
>

--
Martin Mokrejs
Email: 'bW9rcmVqc21Acmlib3NvbWUubmF0dXIuY3VuaS5jeg==\n'.decode('base64')
GPG key is at http://www.natur.cuni.cz/~mmokrejs
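
For anyone reading this thread on Python 3, the two approaches discussed here (a translation table and a whitelist filter) could be sketched roughly as below. This is only an illustration, not code from the thread: the names rna_to_dna, keep_allowed and ALLOWED are made up, and it assumes Python 3, where str.maketrans() takes the characters to delete as its third argument and every str is already Unicode.

```python
import string

# Assumption: Python 3. str.maketrans() replaces Python 2's string.maketrans();
# its optional third argument lists the characters to delete.
_TABLE = str.maketrans('Uu', 'Tt', ' 0123456789')


def rna_to_dna(sequence):
    """Map U/u to T/t and drop spaces and digits in a single pass."""
    return sequence.translate(_TABLE)


# Hypothetical whitelist: extend it as needed, e.g. add '@' for e-mail input.
ALLOWED = frozenset(string.ascii_letters + string.digits + '_-.' + 'åëüÜ')


def keep_allowed(text, allowed=ALLOWED):
    """Return text with every character that is not in `allowed` removed."""
    return ''.join(c for c in text if c in allowed)


if __name__ == '__main__':
    print(rna_to_dna('AUG cuu 12'))
    print(keep_allowed('U@s%er#_N$a^m!e%-<'))
    print(keep_allowed('Üser nåmë!'))
```

Using a frozenset for the whitelist keeps each membership test constant-time. Which of the two approaches is faster for a given input is best checked with timeit rather than assumed.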