- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Mon, 11 Feb 2008 11:43:58 -0500
- To: public-powderwg@w3.org
- Cc: Felix Sasaki <fsasaki@w3.org>
- Message-ID: <20080211164358.GD16996@w3.org>
http://www.w3.org/2007/powder/Group/powder-grouping/20080128.html#canon
does not include an IDN example or rules.
An example (working) IDN IRI:
http://www.bravå.nu/
The IDN is punycoded when the IRI is expressed as a URI:
http://www.xn--brav-toa.nu/
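
A minimal Python sketch of that conversion, for illustration only. The
helper name host_to_ascii is hypothetical; it uses the standard library's
"punycode" codec label by label and deliberately applies no case folding
or Unicode normalization, so it is not a full IDNA implementation.

```python
def host_to_ascii(host: str) -> str:
    """Punycode each non-ASCII label and prefix it with 'xn--'.

    Hypothetical helper: no case folding or Unicode normalization is done.
    """
    labels = []
    for label in host.split("."):
        if label.isascii():
            # Plain ASCII labels pass through unchanged.
            labels.append(label)
        else:
            # The "punycode" codec encodes a single label to ASCII bytes.
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


# U+00E5 is the precomposed form of the letter used in the example IRI.
print(host_to_ascii("www.brav\u00e5.nu"))  # www.xn--brav-toa.nu
```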
== homonyms ==
å can be written as either U+00E5 or 'a' + U+030A (COMBINING RING ABOVE).
The two spellings produce different punycoded IDNs. Unicode gives *some*
c14n (or folding) rules, but not all, and they are not cheap to
implement.
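
The difference is easy to see in a short Python sketch (illustrative
only; the variable names are my own). It encodes both spellings of the
label with the standard "punycode" codec and applies no normalization.

```python
# Two spellings that render identically but differ in code points.
precomposed = "brav\u00e5"    # U+00E5 LATIN SMALL LETTER A WITH RING ABOVE
decomposed = "brava\u030a"    # 'a' followed by U+030A COMBINING RING ABOVE

# The strings are not equal code point for code point.
print(precomposed == decomposed)

# So their punycode forms differ as well.
puny_precomposed = precomposed.encode("punycode")
puny_decomposed = decomposed.encode("punycode")
print(puny_precomposed)
print(puny_decomposed)
```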
== fixing ==
This should probably be addressed in an update of mnot's URISpace Note
http://www.w3.org/TR/urispace
I recommend inserting in 2.1.3.3 Punycode (or maybe IDN) Conversion:
• Internationalized Domain Names (IDNs) are converted from their
punycode form to Unicode code points.
Note: None of the normalization procedures described in the Unicode
specification are performed during POWDER IRI canonicalization.
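
As a rough sketch of what that rule would mean in practice (assuming
Python and a hypothetical helper named host_to_unicode), each "xn--"
label is decoded back to Unicode code points and nothing else is done to
it, so differently composed spellings stay distinct:

```python
def host_to_unicode(host: str) -> str:
    """Decode 'xn--' labels to Unicode code points, without normalization.

    Hypothetical helper illustrating the proposed rule, not a full
    IDNA implementation.
    """
    labels = []
    for label in host.split("."):
        if label.lower().startswith("xn--"):
            # Strip the ACE prefix and decode the remainder as punycode.
            labels.append(label[4:].encode("ascii").decode("punycode"))
        else:
            labels.append(label)
    return ".".join(labels)


print(host_to_unicode("www.xn--brav-toa.nu"))
```

A label that was punycoded from the decomposed spelling decodes back to
the decomposed spelling, so the two hosts would still compare as
different after this step.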
== dissenting opinion ==
http://unicode.org/faq/normalization.html#2 suggests NFKC normalization,
so perhaps you want POWDER apps to do that. It would help the folks who
hand-enter a URL and happen to write it in a different form. I've never
implemented Unicode normalization, but I expect it's not trivial.
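
For comparison, where a library already provides the normalization
forms, the application-side cost can be small. A sketch in Python using
the standard unicodedata module (the function name nfkc_equal is
hypothetical, and this says nothing about the cost of implementing the
algorithm from scratch):

```python
import unicodedata


def nfkc_equal(a: str, b: str) -> bool:
    """Compare two strings after NFKC normalization (illustrative only)."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


precomposed = "brav\u00e5"    # U+00E5
decomposed = "brava\u030a"    # 'a' + U+030A COMBINING RING ABOVE

print(precomposed == decomposed)            # False
print(nfkc_equal(precomposed, decomposed))  # True
```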
Happy trade-offs
--
-eric
office: +1.617.258.5741 32-G528, MIT, Cambridge, MA 02144 USA
mobile: +1.617.599.3509
(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Monday, 11 February 2008 16:44:16 UTC