- From: Felix Sasaki <fsasaki@w3.org>
- Date: Tue, 12 Feb 2008 09:47:18 +0900
- To: Eric Prud'hommeaux <eric@w3.org>
- CC: public-powderwg@w3.org, public-i18n-core@w3.org
Hi Eric (putting i18n core into the loop),

Eric Prud'hommeaux wrote:
> http://www.w3.org/2007/powder/Group/powder-grouping/20080128.html#canon
> does not include IDN example or rules.
>

There is no need for an IDN example or rule. IRI vs. URI handling, including IRI > URI conversion (percent escaping), is a step that is independent of the preprocessing necessary for domain name resolution. See also the processing described at
http://www.w3.org/International/articles/idn-and-iri/#idn

> An example (working) IDN IRI:
> http://www.bravå.nu/
> The IDN is punycoded when the IRI is expressed as a URI:
> http://www.xn--brav-toa.nu/
>
> == homonyms ==
> å can be written either Ue5 or 'a' + U30a (COMBINING RING ABOVE).
> This results in a different punycoded IDN.

The punycode is only "seen" by the domain name server, which uses it for domain name resolution. There is no need to use it for *IRI/URI* canonicalization.

> Unicode gives *some*
> c14n (or folding) rules, but not all, and they are not cheap to
> implement.
>
> == fixing ==
> This should probably be addressed in an update of mnot's URISpace Note
> http://www.w3.org/TR/urispace
>
> I recommend inserting in 2.1.3.3 Punycode (or maybe IDN) Conversion:
>
> • Internationalized Domain Names (IDNs) are converted from their
> punycode form to Unicode code points.
>

Where does this happen? Note that in IDNA version 2003, roundtripping Unicode > punycode > Unicode is not possible, since during the step Unicode > punycode, non-reversible mappings (e.g. Eszett > ss) are made. But as said above, I think this is out of scope for IRI/URI canonicalization.

Felix
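P.S. The non-reversible mapping mentioned above can be sketched as follows. This is only an illustration, assuming Python's built-in "idna" codec, which follows IDNA 2003; the helper names to_ascii and to_unicode and the host name straße.de are made up for the example and do not come from the POWDER or URISpace documents.

```python
# Sketch only: uses Python's built-in "idna" codec (IDNA 2003 behaviour).
# Helper names and the "straße.de" host are illustrative assumptions.

def to_ascii(host: str) -> str:
    """Unicode host name -> ASCII form (nameprep mapping, then punycode)."""
    return host.encode("idna").decode("ascii")

def to_unicode(host: str) -> str:
    """ASCII host name -> Unicode form (punycode labels are decoded)."""
    return host.encode("ascii").decode("idna")

# The example from the quoted message converts in both directions.
brava_ascii = to_ascii("brav\u00e5.nu")       # "xn--brav-toa.nu"
brava_back = to_unicode(brava_ascii)          # "bravå.nu"

# Eszett is mapped to "ss" before punycode is applied, so the
# original spelling cannot be recovered from the ASCII form.
strasse_ascii = to_ascii("stra\u00dfe.de")    # "strasse.de"
strasse_back = to_unicode(strasse_ascii)      # "strasse.de", not "straße.de"

print(brava_ascii, brava_back)
print(strasse_ascii, strasse_back)
```

The first pair roundtrips; the second does not, which is why a rule of the form "convert punycode back to Unicode" cannot restore every original IRI under IDNA 2003.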
Received on Tuesday, 12 February 2008 00:48:43 UTC