Gustavo Córdova Avila
gustavo.cordova at q-voz.com
Sat Apr 30 09:55:46 EDT 2005
Feghhi, Jalil wrote:

>1. I download a page in PSP using urllib and now want to convert and
>keep it as utf-8? What calls should I make to convert the encoding of
>the page to utf8?
>

Do you know the encoding of the page you received? In that case,
something like:

    >>> import urllib
    >>> u = urllib.urlopen(some_url)
    >>> document = u.read()
    >>> u.close()
    >>> unicode_document = unicode(document, "known-encoding")
    >>> utf8_document = unicode_document.encode("utf-8")

>2. Is this a good approach? Can I keep any pages in any languages in
>this way and return them when requested in utf-8?
>

I wouldn't know, unless you told us *why* you're decoding.

As for "Can I keep...", of course you can, no need to ask for
permission ;-)

Seriously though, if you're keeping an archive of downloaded pages for
some reason, converting them to utf-8 before storage is a good idea.
That way all your stored pages have a uniform encoding, and there is no
need to do the guesswork/conversions/etc. every time you access them.

>By the way, I have set my default encoding in Python to utf8.
>

*Excellent*

>Thanks,
>
>-Jalil
>

You're welcome. :-)

-gca
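
For readers on Python 3, where unicode() no longer exists, the same decode-then-store idea might look roughly like the sketch below. The function names to_utf8 and fetch_as_utf8, and the Latin-1 fallback used when the server declares no charset, are assumptions made for illustration; they are not from the original thread.

```python
import urllib.request


def to_utf8(raw_bytes, source_encoding):
    """Decode bytes from a known encoding and re-encode them as UTF-8."""
    text = raw_bytes.decode(source_encoding)  # bytes -> str (Unicode text)
    return text.encode("utf-8")               # str -> UTF-8 bytes for storage


def fetch_as_utf8(url, fallback_encoding="iso-8859-1"):
    """Download a page and return its body as UTF-8 bytes.

    Uses the charset from the Content-Type header when present; the
    fallback encoding is an assumption and may be wrong for a given page.
    """
    with urllib.request.urlopen(url) as response:
        raw_bytes = response.read()
        declared = response.headers.get_content_charset()
    return to_utf8(raw_bytes, declared or fallback_encoding)


# Example without network access: a small Latin-1 encoded "page".
latin1_page = "café".encode("iso-8859-1")
utf8_page = to_utf8(latin1_page, "iso-8859-1")
```

If the source encoding is guessed wrong, decode() may raise UnicodeDecodeError or silently produce wrong characters, so it is probably worth recording the original encoding alongside each stored page.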