Graham Dumpleton
graham.dumpleton at gmail.com
Sat Jun 23 07:38:36 EDT 2007
Making a guess here, but even though your form asks for the posted form
data to be UTF-8, it probably isn't, and the data is instead being passed
through as ISO-8859-1 or some other European character set. I am then
guessing that the character maps to the special UTF-8 marker (lead) byte
that indicates a multibyte character.

Try this in Python:

>>> s = '\xc3'
>>> s.decode("iso-8859-1").encode('utf-8')
'\xc3\x83'

See how it ends up as a multibyte character where the marker is itself
\xc3.

Also:

>>> print s.decode("iso-8859-1").encode('utf-8')
Ã

Is that the character you are expecting to see?

I'm not in any way an expert on Unicode though, so could be quite wrong.

Graham

On 21/06/07, Anastasios Hatzis <ah at hatzis.de> wrote:
> Hi,
>
> I'm struggling with UnicodeDecodeError when trying to append
> util.FieldStorage(req).Field.value with umlaut into a variable of type
> unicode.
>
> <SNIP>
> <%
> # file: test.psp
>
> from mod_python import Session, util
> req.assbackwards
>
> req.content_type = 'text/html; charset=UTF-8'
> %>
> <!--SOME HTML HEAD STUFF HERE-->
>
> <form action="/test.psp" method="post" enctype="text/plain"
> accept-charset="UTF-8">
> <!-- form calls this very same file -->
> <input name="name" type="text" size="30" maxlength="50" />
> <input type="submit" value="Submit" />
> </form>
>
> <%
> req.write('<p>Write values to HTML</p>') # Works fine!
> store = util.FieldStorage(req)
> for param in store.list:
>     req.write(param.value)
> %>
> <b><%=param.name%></b>:<%=param.value%>:<%=type(param.value)%><br />
> <%
>
> req.write('<p>Append all values to the variable msg</p>')
> msg = u''
> for param in store.list:
>     msg += param.name + ': ' + param.value + '\r\n' # UnicodeDecodeError!
> %>
> <!--SOME HTML FOOT STUFF HERE-->
>
> </SNIP>
>
>
> So, when calling this page the HTML output for first section is rendered with
> umlaut (ä, ü, ...). Value is <type 'str'> ... well, why not, as long as it is
> UTF-8...
>
> But as soon as the script tries to append param.value to the unicode
> variable "msg" I'm getting this error:
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 26:
> ordinal not in range(128)
>
>
> I do not understand why I'm getting this error. How does 'ascii' come into
> this? How do I know which encoding is really applied (in param.value and in
> msg)?
>
> I'm blind, hum?
>
> Anastasios
>
> _______________________________________________
> Mod_python mailing list
> Mod_python at modpython.org
> http://mailman.modpython.org/mailman/listinfo/mod_python
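
On the "How does 'ascii' come into this?" question in the quoted message: in Python 2, adding a byte string (str) to a unicode object makes the interpreter decode the byte string with the default codec, which is normally 'ascii', and that step fails on any byte above 0x7f such as 0xc3. The sketch below is only an illustration and is written so that it also runs on Python 3, where the implicit step has to be spelled out by hand. The sample bytes and the helper names implicit_py2_concat and explicit_concat are assumptions made for this example; they are not part of mod_python, and it is also an assumption that param.value really holds UTF-8.

```python
# Assumed sample input: the two UTF-8 bytes a browser would send for a-umlaut.
raw = b"\xc3\xa4"


def implicit_py2_concat(prefix, data):
    """Mimic Python 2's `unicode + str`: the bytes are first decoded as ASCII.

    Hypothetical helper; Python 2 does this decode implicitly.
    """
    return prefix + data.decode("ascii")


def explicit_concat(prefix, data, encoding="utf-8"):
    """Decode the bytes with a known encoding before concatenating."""
    return prefix + data.decode(encoding)


# The implicit ASCII decode fails on the first byte, 0xc3.
try:
    implicit_py2_concat(u"name: ", raw)
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding explicitly as UTF-8 (assumed to be the real encoding) works.
msg = explicit_concat(u"name: ", raw)

# Decoding the same bytes as ISO-8859-1 never fails, but yields two
# characters (U+00C3, U+00A4) instead of one, which is the kind of result
# shown in the reply above.
mojibake = raw.decode("iso-8859-1")

print(repr(error))
print(ascii(msg))
print(ascii(mojibake))
```

If this is what is happening in the PSP page, decoding each value explicitly, for example param.value.decode('utf-8'), before appending it to msg would be one way to avoid the error. That only holds if the browser really sent UTF-8; a UnicodeDecodeError from the explicit utf-8 decode would suggest some other encoding.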