- From: olivier Thereaux <ot@w3.org>
- Date: Fri, 18 Jan 2008 15:27:22 +0900
- To: Martin Duerst <duerst@it.aoyama.ac.jp>
- Cc: Tools dev list <public-qa-dev@w3.org>
Hi Martin,

I was doing some tests with IRIs in Perl and your name kept cropping up in the documentation, so I was wondering if you could answer some of my questions.

Do you know what the state of adoption of IRIs (and in particular IDNs) is in Perl? I have seen some IDN-related modules (e.g. [1]) being released, but it seems that the main obstacle to handling IRIs nicely in Perl is that the URI module [2] is not IRI-friendly. As my little test script (attached, but not worth much) showed, the URI constructor ignores and trashes all non-ASCII characters in the host [3].

[1] Net::IDN::Encode
[2] http://search.cpan.org/~gaas/URI-1.35/URI.pm
[3] http://search.cpan.org/~gaas/URI-1.35/URI.pm#CONSTRUCTORS

I was hoping I'd be able to 1) construct the URI object and THEN 2) nameprep the hostname and encode it to punycode with something like:

    $uri->host( domain_to_ascii($uri->host) );

but that won't work, because by that time all the non-ASCII characters in the hostname have already been trashed by URI::Escape.

The other solution would be to encode the hostname into punycode first and then construct the URI object, but that means reinventing the wheel and parsing the URI by hand (to get the host part) first. So that's not satisfying.

What surprises me is that apparently there is nothing in the tracker for this module mentioning IDNs or punycode. Maybe no one has yet suggested to the module maintainers that, instead of trashing all non-ASCII characters, they should attempt a punycode conversion?

Do you recall any such discussion? Have you already experimented in this area?

Thanks.
-- 
olivier
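
For illustration, the second approach described above (punycode first, then construct the URI object) might be sketched roughly as follows. The helper name uri_from_iri and the simplified authority pattern are assumptions made for this example and are not part of URI or Net::IDN::Encode. Only the host is converted; bracketed IP literals and anything without a "//" authority are passed to URI->new unchanged.

```perl
use strict;
use warnings;
use utf8;                                  # the sample IRI below is a UTF-8 literal
use URI;
use Net::IDN::Encode qw(domain_to_ascii);

# Hypothetical helper (an assumption for this sketch): pull the host out of
# an IRI by hand, convert it to its ASCII form, then build the URI object.
sub uri_from_iri {
    my ($iri) = @_;

    # Simplified split into scheme, authority, and everything after it.
    my ($scheme, $authority, $rest) =
        $iri =~ m{\A([A-Za-z][A-Za-z0-9+.-]*)://([^/?#]*)(.*)\z}s
        or return URI->new($iri);          # no authority part: nothing to convert

    # Separate optional userinfo and port from the host. Hosts containing
    # brackets (IP literals) do not match and are left untouched.
    my ($userinfo, $host, $port) =
        $authority =~ m{\A(?:(.*)\@)?([^:\[\]]*)(?::(\d*))?\z}s
        or return URI->new($iri);

    # Only call the IDN encoder when the host really has non-ASCII characters.
    $host = domain_to_ascii($host) if $host =~ /[^\x00-\x7F]/;

    my $ascii_authority = (defined $userinfo ? "$userinfo\@" : '')
                        . $host
                        . (defined $port ? ":$port" : '');

    return URI->new("$scheme://$ascii_authority$rest");
}

my $uri = uri_from_iri('http://bücher.example/path?q=1');
print $uri->host, "\n";                    # xn--bcher-kva.example
```

This still amounts to parsing the URI by hand, which is exactly the drawback noted above; it only shows how small that hand-parsing step can be kept if the path and query are left for URI to deal with.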
Received on Friday, 18 January 2008 06:27:41 UTC