Gustavo Córdova Avila
gustavo.cordova at q-voz.com
Thu Feb 16 12:40:29 EST 2006
Dan Eloff wrote:
> Gustavo, here's an example. Suppose some code enforces a maximum
> length on a string. If it's counting on a default encoding of 1 byte
> per char, it might do something like len(s) <= 15. For ASCII or
> iso-8859-1 this would work. Or the code might use indices or slices
> (and a lot of code does!). If suddenly you have utf-8 encoded Chinese,
> your string is going to triple in length, and those functions will
> have unpredictable behaviour. You could think of any number of
> scenarios, even in the Python library. I just wouldn't feel
> comfortable about changing the default encoding; you never know where
> it will come back to haunt you. What's so hard about using unicode
> strings in your program and then encoding when you send output
> somewhere?

Yes, it seems like a valid problem case, but it can be sidestepped very
easily by simply working with unicode objects until you're ready to
output your data.

PSP (and 'str') are going to render your unicodes only when it's
absolutely necessary, and not before (at least that's what my template
classes do), so if you use:

    <%= myname[:32] %>

the unicode object being sliced (myname) is going to return another
unicode object representing the slice, which in turn is going to be
converted to 'str' using the default encoding.

There are a lot of ways to plan your apps so they don't have the
trouble you mention, and my motto has always been "Don't think 'why
not', think 'how to'."

Good luck :-)

-gus
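
A minimal sketch of the point above: slice while the data is still text, and encode only at output time. It is written for Python 3, which is an assumption on my part: there `str` plays the role of the `unicode` object discussed in this thread, and `bytes` plays the role of the encoded 'str'. The variable names are only illustrative.

```python
# Python 3 sketch: `str` here corresponds to the thread's `unicode` object,
# and `bytes` corresponds to the encoded 'str'.
name = "\u4e2d\u6587\u5b57\u7b26\u4e32"  # five Chinese characters

encoded = name.encode("utf-8")

char_count = len(name)     # counts characters
byte_count = len(encoded)  # counts bytes; three per character for this text

# Slice while the value is still text, then encode at output time.
safe = name[:3].encode("utf-8")

# Slicing the encoded bytes instead can cut a character in half.
cut = encoded[:4]
try:
    cut.decode("utf-8")
    cut_is_valid = True
except UnicodeDecodeError:
    cut_is_valid = False
```

The length check and the slice behave predictably on the text value; only the byte-level slice ends up with an incomplete character.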