
IDN handling, please help

From: Larry Masinter <masinter@adobe.com>
Date: Sat, 29 Aug 2009 18:08:50 -0700
To: "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>
Message-ID: <8B62A039C620904E92F1233570534C9B0118DB9ABC3A@nambx04.corp.adobe.com>
I've been reading this text over and over again, and I really don't get it. Can someone explain the distinction between "scheme definition does not allow percent-encoding for ireg-name" and "scheme definition DOES allow percent-encoding for ireg-name"? Which schemes allow percent-encoding for ireg-name, for example?

I'm not sure what problem this is solving, why the two algorithms are different, or whether one is just a shortcut for a special case.

=================================================



Systems accepting IRIs MAY convert the ireg-name component of an IRI as follows (before step 2 above) for schemes known to use domain names in ireg-name, if the scheme definition does not allow percent-encoding for ireg-name: Replace the ireg-name part of the IRI by the part converted using the ToASCII operation specified in Section 4.1 of [RFC3490] on each dot-separated label, and by using U+002E (FULL STOP) as a label separator, with the flag UseSTD3ASCIIRules set to TRUE, and with the flag AllowUnassigned set to FALSE for creating IRIs and set to TRUE otherwise. The ToASCII operation may fail, but this would mean that the IRI cannot be resolved. This conversion SHOULD be used when the goal is to maximize interoperability with legacy URI resolvers. For example, the IRI
"http://r&#xE9;sum&#xE9;.example.org"
may be converted to
"http://xn--rsum-bpad.example.org"
instead of
"http://r%C3%A9sum%C3%A9.example.org".
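
To make the two outcomes in the quoted example concrete, here is a minimal Python sketch that produces both forms of the host. The function names are made up for illustration. It assumes Python's built-in "idna" codec as a stand-in for the RFC 3490 ToASCII operation; that codec does not expose the UseSTD3ASCIIRules or AllowUnassigned flags, so it only approximates what the draft describes. The "&#xE9;" in the draft is XML notation for U+00E9.

```python
from urllib.parse import quote


def percent_encode_host(host: str) -> str:
    """Straightforward conversion: UTF-8 encode, then percent-encode non-ASCII octets."""
    # "." and "-" are left alone so the label structure stays visible.
    return quote(host, safe=".-")


def to_ascii_host(host: str) -> str:
    """IDNA conversion: apply ToASCII to each dot-separated label.

    Assumption: Python's "idna" codec (IDNA 2003) stands in for RFC 3490
    ToASCII; the flags named in the draft cannot be set through it.
    """
    return host.encode("idna").decode("ascii")


host = "r\u00e9sum\u00e9.example.org"
print("http://" + to_ascii_host(host))        # the xn-- form from the draft
print("http://" + percent_encode_host(host))  # the percent-encoded form
```

The first form is what a legacy DNS-based resolver can look up directly; the second is what the generic IRI-to-URI mapping yields when ireg-name is treated like any other component.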

An IRI with a scheme that is known to use domain names in ireg-name, but where the scheme definition does not allow percent-encoding for ireg-name, meets scheme-specific restrictions if either the straightforward conversion or the conversion using the ToASCII operation on ireg-name result in an URI that meets the scheme-specific restrictions.
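
One possible reading of the "either ... or" rule in the paragraph above, sketched in Python. Everything here is an assumption made for illustration: the scheme-specific restriction is taken to be "every label consists of ASCII letters, digits, and hyphens", the helper names are hypothetical, and Python's built-in "idna" codec stands in for ToASCII.

```python
import re
from urllib.parse import quote

# Hypothetical scheme restriction: labels of 1-63 letters, digits, or hyphens.
LDH_LABEL = re.compile(r"[A-Za-z0-9-]{1,63}")


def host_meets_restrictions(host: str) -> bool:
    """Check the assumed restriction on every dot-separated label."""
    return all(LDH_LABEL.fullmatch(label) for label in host.split("."))


def iri_host_is_acceptable(host: str) -> bool:
    """Accept the host if either conversion yields a form that passes the check."""
    # Candidate 1: the straightforward (percent-encoding) conversion.
    candidates = [quote(host, safe=".-")]
    # Candidate 2: the ToASCII conversion, if it succeeds.
    try:
        candidates.append(host.encode("idna").decode("ascii"))
    except UnicodeError:
        pass  # ToASCII failed; only the straightforward conversion remains
    return any(host_meets_restrictions(c) for c in candidates)
```

Under these assumptions, the host from the example is accepted only because of the ToASCII branch: its percent-encoded form contains "%", which the assumed restriction rejects.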


--
http://larry.masinter.net
Received on Sunday, 30 August 2009 01:09:39 GMT
