[mod_python] Encoding Question

Gustavo Córdova Avila gustavo.cordova at q-voz.com
Sat Apr 30 09:55:46 EDT 2005

Feghhi, Jalil wrote:

>1. I download a page in PSP using urllib and now want to convert and
>keep it as utf-8? What calls should I make to convert the encoding of
>the page to utf8?
Do you know the encoding of the page you received?  In that case, something like this:

 >>> u = urllib.urlopen(some_url)
 >>> document = u.read()
 >>> u.close()
 >>> unicode_document = unicode(document, "known-encoding")
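
The snippet above stops at a unicode object; to actually get utf-8 you
still need one encode step. A minimal sketch of the full round trip, written
in Python 3 spelling (where unicode(document, enc) becomes
document.decode(enc)). The helper name to_utf8 and the sample bytes are made
up for illustration, and it assumes you really do know the source encoding:

```python
def to_utf8(raw_bytes, source_encoding):
    """Decode bytes from a known source encoding and re-encode them as UTF-8."""
    # bytes -> text; in Python 2 this is unicode(raw_bytes, source_encoding)
    text = raw_bytes.decode(source_encoding)
    # text -> UTF-8 bytes, suitable for storage or for sending to a client
    return text.encode("utf-8")


# Hypothetical sample: a Latin-1 fragment where byte 0xF3 is the letter "ó".
latin1_page = b"C\xf3rdova"
utf8_page = to_utf8(latin1_page, "latin-1")
```

If the encoding you pass in is wrong, decode() will either raise
UnicodeDecodeError or quietly produce the wrong characters, so it is worth
checking the Content-Type header or the page's meta tag first.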

>2. Is this a good approach? Can I keep any pages in any languages in
>this way and return them when requested in utf-8?
I wouldn't know, unless you told us *why* you're decoding.  As for "Can 
I keep...", of course you can, no need to ask for permission ;-)

Seriously though, if you're keeping an archive of downloaded pages for
some reason, converting them to utf-8 before storage is a good idea.
That way all your stored pages have a uniform encoding, and there's no
need to redo the guesswork and conversions every time you access them.
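
As a rough illustration of "convert once, store as utf-8", here is a small
sketch in Python 3 spelling. The function names store_page and load_page, the
file name, and the temporary directory are all hypothetical; a real archive
would pick its own layout:

```python
import os
import tempfile


def store_page(archive_dir, name, raw_bytes, source_encoding):
    """Decode once at download time and write the page to disk as UTF-8."""
    text = raw_bytes.decode(source_encoding)
    path = os.path.join(archive_dir, name)
    with open(path, "wb") as f:
        f.write(text.encode("utf-8"))
    return path


def load_page(path):
    """Return the stored UTF-8 bytes unchanged, ready to send to the client."""
    with open(path, "rb") as f:
        return f.read()


# Hypothetical usage: a Latin-1 page (0xF1 is "ñ") goes in, UTF-8 comes out.
archive_dir = tempfile.mkdtemp()
path = store_page(archive_dir, "page1.html", b"ma\xf1ana", "latin-1")
served = load_page(path)
```

The point of the design is that the source encoding is only needed in
store_page; everything read back later can be served with
charset=utf-8 without looking at where it came from.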

>By the way, I have set my default encoding in Python to utf8.

You're welcome. :-)
