[mod_python] req.write and unicode

Bart scarfboy at gmail.com
Wed Sep 13 20:01:12 EDT 2006


Hi all,

Apologies if this has been discussed before.
(the mod_python archives are a bit hard to search decisively)


I wonder whether req.write() really has to bork on unicode strings.
We have lovely no-worries pythonic unicode handling in python,
but in mod_python you have to either call .encode('utf8') on every single
req.write, or, less redundantly, use code perhaps like:

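from mod_python import apache

def handler(req):
    ret = []
    # code
    ret.append('I am text')
    ret.append(u'I am \u2222 other text')
    # more code
    # join everything first, then encode once so req.write() gets a byte string
    req.write(''.join(ret).encode('utf8'))
    return apache.OK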

(...which is of course sort of annoying if you do occasionally
 want the write-and-flush ability, e.g. when the process is slow
 and you want immediate feedback)

(I believe this currently fails because the implicit unicode-to-str
 conversion uses the default encoding from site.py, normally ASCII,
 which you can't always set)
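
To illustrate what I think is going on: the sketch below encodes the same
string explicitly with ASCII, which is my assumption about what the implicit
conversion amounts to, and then with UTF-8. The helper try_encode is just a
name made up for this example.

```python
# Illustration only: try_encode is a made-up helper, not part of mod_python.
text = u'I am \u2222 other text'

def try_encode(value, encoding):
    """Return the encoded byte string, or None if the codec cannot represent the text."""
    try:
        return value.encode(encoding)
    except UnicodeEncodeError:
        return None

# Assumed equivalent of the implicit conversion under an ASCII default encoding.
ascii_result = try_encode(text, 'ascii')
# The explicit encode that currently has to be done by hand.
utf8_result = try_encode(text, 'utf-8')
```

The ASCII attempt fails on the non-ASCII character, while the UTF-8 one
produces a plain byte string.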


Wouldn't it be relatively trivial to make the write function always
encode (unicode) strings according to a configured encoding?
(with a sane default like utf8, and settable via code, and
 also PythonOptions apache config so that there won't be a
 you-need-to-set-this-in-every-file requirement)
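
Something like the following is what I have in mind, written as a wrapper
rather than a change to mod_python itself. It is only a rough sketch: the
names write_encoded and DEFAULT_ENCODING and the option key "output_encoding"
are all invented here, and it assumes the table returned by
req.get_options() can be read like a dictionary.

```python
# Rough sketch, not mod_python API: write_encoded, DEFAULT_ENCODING and the
# "output_encoding" option key are hypothetical names for this example.
DEFAULT_ENCODING = 'utf-8'
_text_type = type(u'')  # unicode on Python 2, str on Python 3

def write_encoded(req, data, encoding=None, flush=1):
    """Encode text with a configured encoding before passing it to req.write().

    Byte strings are passed through untouched.
    """
    if isinstance(data, _text_type):
        if encoding is None:
            # Assumption: options set with PythonOption are readable like a dict.
            encoding = req.get_options().get('output_encoding', DEFAULT_ENCODING)
        data = data.encode(encoding)
    req.write(data, flush)
```

With that, a line such as "PythonOption output_encoding utf-8" in the apache
config (again, a made-up key) would set the encoding once for all handlers,
and each write could still flush immediately.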



Of course, this isn't a problem with one true solution(tm),
because you still need to set content_type (now perhaps
with a default like "text/html; charset=utf-8") or something
similar for the browser to decode the output correctly, but the
suggestion still seems like handier default behaviour.


This probably deserves some documentation attention anyway,
partly because the more intuitive-from-python "charset=utf8"
seems to be wrong (it should be "charset=utf-8").
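
For what it's worth, this is roughly how I keep the two spellings apart at
the moment. It is a minimal sketch: send_page and the two constants are
names I made up, and it assumes the request object accepts an assignment to
content_type before the first write.

```python
# Minimal sketch: send_page, CODEC and CHARSET are hypothetical names.
CODEC = 'utf8'       # Python codec alias; Python accepts 'utf8' and 'utf-8'
CHARSET = 'utf-8'    # spelling for the HTTP header, with the hyphen

def send_page(req, text):
    """Set a matching charset in the header, then write the encoded body."""
    req.content_type = 'text/html; charset=%s' % CHARSET
    req.write(text.encode(CODEC))
```

Keeping the header spelling separate from the codec name avoids sending
"charset=utf8" by accident.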


Poker of suggestions,
--Bart

