- From: Smith, Kevin, VF-Group <Kevin.Smith@vodafone.com>
- Date: Fri, 15 Feb 2008 14:44:00 +0100
- To: "Jeremy Carroll" <jjc@hpl.hp.com>, <public-powderwg@w3.org>
- Cc: <public-i18n-core@w3.org>
Hi all,

As per Jeremy's suggestion, I believe that the POWDER WG should follow I18N guidelines on referencing IRIs within a POWDER document. The goal is that a description can be assigned to a URI/IRI scope without risking ambiguity. It appears that homographs in IRIs [1] may introduce such ambiguity.

I suggest we specify that the POWDER author uses the UTF-8 encoded IRI, and that it is the responsibility of the POWDER processor to normalise the resource IRI so that it can be determined whether it falls within the scope of the resource set.

Cheers,
Kevin

[1] http://en.wikipedia.org/wiki/IDN_homograph_attack#Homographs_in_internationalized_domain_names

-----Original Message-----
From: public-powderwg-request@w3.org [mailto:public-powderwg-request@w3.org] On Behalf Of Jeremy Carroll
Sent: 14 February 2008 17:25
To: public-powderwg@w3.org
Cc: public-i18n-core@w3.org
Subject: Re: IRI/URI Canonicalization does not address IRIs with IDNs

Eric said:
[[
I've never implemented Unicode normalization, but I expect it's not trivial.
]]

You can use a third-party library, e.g. IBM's ICU library. It is then quite straightforward. The icu4j library is fairly large, and I believe larger than needed for this problem (since it solves most other I18N problems too). But I don't think this is really a problem in practice.

I would urge the group to specify the right thing, whatever that is, and not be too concerned about the detail here. I believe that fairly soon most Web addresses will have IDNs.

Jeremy
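
For illustration only, the processor-side normalisation proposed above might look roughly like the following Python sketch. Nothing here is specified by the WG. The function names `normalise_host` and `in_scope` are hypothetical, scope is reduced to a plain set of host names, and Python's built-in `idna` codec (which implements IDNA 2003) stands in for whatever normalisation the group eventually chooses.

```python
import unicodedata
from urllib.parse import urlsplit


def normalise_host(iri: str) -> str:
    """Return the host of an IRI as a lower-case ASCII (Punycode) name.

    Hypothetical helper: an assumption for illustration, not POWDER-defined.
    """
    # Apply NFC so composed and decomposed spellings compare equal.
    nfc = unicodedata.normalize("NFC", iri)
    host = urlsplit(nfc).hostname or ""
    # The built-in "idna" codec converts each label to its ASCII form.
    return host.encode("idna").decode("ascii").lower()


def in_scope(iri: str, scope_hosts: set) -> bool:
    """True if the IRI's normalised host equals, or is a subdomain of,
    one of the already-normalised hosts in scope_hosts (assumed format)."""
    host = normalise_host(iri)
    return any(host == s or host.endswith("." + s) for s in scope_hosts)
```

With this sketch, a composed and a decomposed spelling of the same host normalise to the same ASCII name, while a look-alike host written with a Cyrillic letter normalises to a different name and so does not match a Latin-script scope. Note that this only keeps the comparison unambiguous; it does not by itself help a human author spot a homograph.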
Received on Friday, 15 February 2008 13:46:34 UTC