Roger Binns
rogerb at rogerbinns.com
Mon May 14 17:45:51 EDT 2007
Colin Bean wrote:
>> Sort of... I'd consider encoding the entire path a different solution
>> to escaping specific problem characters (and you make this distinction
>> below). Base64 encoding would also handle more than just the /./ and
>> /../ problem cases, it would handle any other url-unfriendly
>> characters that appear in your book titles ('#' and foreign language
>> characters come to mind, although you could still escape / url encode
>> those).

All those characters are URL encoded(*). The foreign characters are
dealt with by making the URL UTF-8 encoded: the raw UTF-8 bytes are
% encoded, and it all works fine (I have extensive test suites).

My top tip for that kind of testing is to go to wikipedia.org and copy
all the text on the front page to use as your test string. You get a lot
of characters, right-to-left text and codepoints above 0xffff all at once.

(*) URL encoding / or . doesn't help, as Apache unescapes it before
normalization etc.

Roger
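
A rough sketch of the two points above, in Python. The helper names (encode_segment, simulate_server) and the sample title are made up for illustration, and simulate_server only imitates the "unescape first, then normalize" order described in the footnote; it is not Apache's actual code.

```python
import posixpath
from urllib.parse import quote, unquote


def encode_segment(title: str) -> str:
    """Percent-encode the UTF-8 bytes of one path segment (hypothetical helper).

    safe="" means "/" and "#" are encoded too, not just non-ASCII bytes.
    """
    return quote(title, safe="", encoding="utf-8")


def simulate_server(path: str) -> str:
    """Imitate a server that unescapes the path BEFORE normalizing it.

    This is an assumption-laden stand-in for the behaviour described in
    the footnote, not a model of any real server's internals.
    """
    return posixpath.normpath(unquote(path, encoding="utf-8"))


# Mixed test string: '#', '/', accented Latin, right-to-left text, and a
# codepoint above 0xffff (U+1D4B3), in the spirit of the Wikipedia tip.
sample = "C# f\u00fcr Anf\u00e4nger / \u0627\u0644\u0639\u0631\u0628\u064a\u0629 / \U0001D4B3"

encoded = encode_segment(sample)
decoded = unquote(encoded, encoding="utf-8")

# The round trip should give back exactly what went in.
assert decoded == sample

# Percent-encoding the dots or the slash does not survive the unescape step:
# the server sees "/books/../secret" and "/books/a/b" before it normalizes.
escaped_dots = simulate_server("/books/%2E%2E/secret")
escaped_slash = simulate_server("/books/a%2Fb")
```

With the unescape-then-normalize order, escaped_dots comes out as "/secret" and escaped_slash as "/books/a/b", which is why escaping just the problem characters is not enough for the /./ and /../ cases.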