§2.1.3 IRI/URI Canonicalization does not address IRIs with IDNs

From: Eric Prud'hommeaux <eric@w3.org>
Date: Mon, 11 Feb 2008 11:43:58 -0500
To: public-powderwg@w3.org
Cc: Felix Sasaki <fsasaki@w3.org>
Message-ID: <20080211164358.GD16996@w3.org>
§2.1.3 IRI/URI Canonicalization does not include an IDN example or rules.

An example (working) IDN IRI:
The IDN is punycoded when the IRI is expressed as a URI:

== homonyms ==
å can be written either as U+00E5 or as 'a' + U+030A (COMBINING RING ABOVE).
This results in a different punycoded IDN. Unicode gives *some*
c14n (or folding) rules, but not all, and they are not cheap to
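
A minimal sketch of the homonym problem, assuming Python's built-in
"punycode" codec applied to a bare label with no preparation step. The
helper name raw_punycode is illustrative only and is not part of POWDER.

```python
def raw_punycode(label: str) -> str:
    """Punycode-encode a label as-is, with no Unicode normalization."""
    return label.encode("punycode").decode("ascii")

precomposed = "\u00e5"   # å as a single code point
decomposed = "a\u030a"   # 'a' followed by COMBINING RING ABOVE

puny_precomposed = raw_punycode(precomposed)
puny_decomposed = raw_punycode(decomposed)

# Both strings render as å, but their encoded forms differ.
print(puny_precomposed, puny_decomposed)
print(puny_precomposed != puny_decomposed)
```

The decomposed form keeps its ASCII 'a' as a literal prefix of the encoded
label, so the two outputs cannot match.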

== fixing ==
This should probably be addressed in an update of mnot's URISpace Note

I recommend inserting in Punycode (or maybe IDN) Conversion:

  • Internationalized Domain Names (IDNs) are converted from their
    punycode form to Unicode code points.

  Note: None of the normalization procedures described in the Unicode
  specification are performed during POWDER IRI canonicalization.
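
A rough sketch of what that step could look like, again assuming Python's
built-in "punycode" codec. The function name idn_labels_to_unicode is
hypothetical, and the sketch deliberately applies no normalization, per the
note above.

```python
def idn_labels_to_unicode(host: str) -> str:
    """Decode each punycoded ("xn--") label of host to Unicode code points.

    No Unicode normalization (NFC, NFKC, ...) is applied.
    """
    decoded = []
    for label in host.split("."):
        if label.lower().startswith("xn--"):
            # Strip the ACE prefix, then undo the Punycode encoding.
            label = label[4:].encode("ascii").decode("punycode")
        decoded.append(label)
    return ".".join(decoded)

print(idn_labels_to_unicode("xn--bcher-kva.example"))  # bücher.example
```

Because nothing is normalized, a label that was encoded from the decomposed
form of å decodes back to the decomposed form, and so still compares unequal
to the precomposed one.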

== dissenting opinion ==
http://unicode.org/faq/normalization.html#2 suggests NFKC normalization,
so perhaps you want POWDER apps to do that. It would help the folks who
hand-enter a URL and happen to write it in a different form. I've never
implemented Unicode normalization, but I expect it's not trivial.
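
For comparison, a small sketch of the NFKC option, assuming an application
can rely on a library implementation such as Python's unicodedata module
rather than implementing normalization itself. The wrapper name nfkc is
illustrative only.

```python
import unicodedata

def nfkc(text: str) -> str:
    """Apply Unicode Normalization Form KC to text."""
    return unicodedata.normalize("NFKC", text)

precomposed = "\u00e5"   # å as a single code point
decomposed = "a\u030a"   # 'a' followed by COMBINING RING ABOVE

print(precomposed == decomposed)              # False
print(nfkc(precomposed) == nfkc(decomposed))  # True

# NFKC also folds compatibility characters, which changes the string:
print(nfkc("\ufb01"))                         # fi
```

The last line is part of the trade-off: NFKC unifies the two spellings of å,
but it also rewrites characters such as the fi ligature into plain letters.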

Happy trade-offs

Received on Monday, 11 February 2008 16:44:16 UTC
