Martin MOKREJŠ
mmokrejs at ribosome.natur.cuni.cz
Wed Nov 23 17:26:15 EST 2005
I have read somewhere that string.maketrans() is the most efficient approach.
With the following I:
1) convert 'U' or 'u' into 'T' or 't'
2) delete any digits and spaces in the same pass
_sequence = _sequence.translate(string.maketrans('Uu', 'Tt'),' 0123456789')
Benchmark it yourself.
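
For readers on Python 3, where string.maketrans() no longer exists, here is a minimal sketch of the same idea. It assumes str.maketrans() with a third argument that lists the characters to delete. The names clean_sequence and clean_sequence_loop and the sample data are made up for illustration, and the timing loop is only a starting point for your own benchmark, not a measured result.

```python
import timeit

# Python 3 counterpart of string.maketrans('Uu', 'Tt') combined with the
# deletechars argument: map U->T and u->t, delete spaces and digits.
TABLE = str.maketrans("Uu", "Tt", " 0123456789")


def clean_sequence(sequence):
    """Translate and delete in a single pass using the prebuilt table."""
    return sequence.translate(TABLE)


def clean_sequence_loop(sequence):
    """Same result built one character at a time, for comparison only."""
    swap = {"U": "T", "u": "t"}
    drop = set(" 0123456789")
    return "".join(swap.get(c, c) for c in sequence if c not in drop)


if __name__ == "__main__":
    # Hypothetical sample input; replace it with real sequence data.
    sample = "1 augcuu ACGU 61 " * 1000
    for func in (clean_sequence, clean_sequence_loop):
        seconds = timeit.timeit(lambda: func(sample), number=50)
        print(func.__name__, round(seconds, 4))
```

Building the table once at module level keeps the per-call cost down to the translate() call itself.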
Martin
Anthony L. wrote:
> Thanks for the suggestions. I am looking into the xml.sax.saxutils
> modules Jim mentioned as well as Python's HTML parsing module.
>
> I recalled an earlier technique I used for blocking undesirable
> characters using JavaScript, and came up with the following:
>
> x = 'U@s%er#_N$a^m!e%-<'
>
> set = '0123456789 _-abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.'
>
> print ''.join( [c for c in x if c in set] )
>
> This is very simple, easily extensible by customizing the set variable.
> It allows me to keep in all my allowable characters including
> underscores and dashes. If I wanted to use this to sanitize a user input
> for an email address, I could add the period and amphora to the set
> variable. This feels pretty fast, though I can't quantify speed, and I'm
> tempted to use it everywhere I don't need to be conscious of markup code
> and SQL injection. But is this a good Pythonic way of doing things?
>
> Also, unicode. I'd like to allow input using characters from the Latin-1
> set, but I can't figure out how. I did the following:
>
> x = u'Üsernåmë'
>
> set = u'åëüÜabcdefghijklmnopqrstuvwxyz'
>
> print ''.join( [c for c in x if c in set] )
>
> But it's not enough and will return a UnicodeEncodeError. Of course,
> it's probably not even proper to include the actual decoded characters
> within the variable. Can someone point me to an exhaustive source for
> unicode on python that has many examples?
>
> Anthony
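
A minimal sketch of the whitelist filter described in the quoted message, written for Python 3, where every str is already Unicode and the Latin-1 characters need no special handling. The ALLOWED set and the keep_allowed name are illustrative assumptions, not Anthony's code. The result is encoded to UTF-8 explicitly at the end, on the assumption that the UnicodeEncodeError came from writing non-ASCII text to an ASCII-only output.

```python
# Illustrative whitelist: digits, underscore, dash, period, ASCII letters,
# plus a few Latin-1 letters. Adjust it to whatever the input should allow.
ALLOWED = frozenset(
    "0123456789_-."
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "åëüÜ"
)


def keep_allowed(text, allowed=ALLOWED):
    """Return text with every character outside the whitelist removed."""
    return "".join(c for c in text if c in allowed)


filtered = keep_allowed("Üsernåmë")

# Encode explicitly before handing the text to anything that expects bytes,
# instead of relying on the default encoding of the output stream.
encoded = filtered.encode("utf-8")
```

A frozenset gives constant-time membership tests, which matters more as the whitelist grows. Avoiding the name set for the variable also keeps the built-in set type usable.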
--
Martin Mokrejs
Email: 'bW9rcmVqc21Acmlib3NvbWUubmF0dXIuY3VuaS5jeg==\n'.decode('base64')
GPG key is at http://www.natur.cuni.cz/~mmokrejs