- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Wed, 2 Sep 2009 14:51:30 -0700
- To: Larry Masinter <masinter@adobe.com>
- Cc: "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>
On Sep 2, 2009, at 10:11 AM, Larry Masinter wrote:

> Systems accepting IRIs MAY convert the ireg-name component of an IRI
> as follows (before step 2 above) for schemes known to use domain
> names in ireg-name, if the scheme definition does not allow
> percent-encoding for ireg-name

I don't think that is relevant now. Schemes do not have the ability to prevent a user from using pct-encoded triplets -- either they don't occur in that part of the reference (and the requirement does not apply) or they do occur in the reference and the application has to find some reasonable thing to do in that situation. Near as I can tell, the only reasonable thing to do is to treat the triplet as a pct-encoded octet even if the scheme does not allow it, since almost all schemes were defined before IDNA existed. Authors started typing/pasting non-ASCII hostnames after that, regardless of the scheme specs.

I think we should specify that pct-encoding is always decoded before use of a component in resolution, and further that registered names might be Unicode and that the processor is responsible for conversion to IDNA punycode, if necessary, for the first lookup, and then resort to sending the raw Unicode string (if their name resolver supports that in the API) in the next lookup if the first one failed. This allows IDNA to have precedence (to avoid some localized masking of domains) and yet still works for non-Internet hostname lookup.

....Roy
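The lookup order proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not text from any spec: the function names `candidate_hostnames` and `resolve` and the `lookup` callback are hypothetical, pct-encoded octets are assumed to be UTF-8, and Python's built-in "idna" codec (which implements the IDNA 2003 rules) stands in for whatever IDNA conversion a real processor would use.

```python
from urllib.parse import unquote


def candidate_hostnames(reg_name):
    """Return the names to try, in order, for a registered name (hypothetical helper)."""
    # Pct-encoded triplets are always decoded before the component is used.
    # Assumption: the decoded octets are UTF-8.
    decoded = unquote(reg_name)

    candidates = []
    # First choice: the IDNA (Punycode) form, so that IDNA has precedence.
    try:
        candidates.append(decoded.encode("idna").decode("ascii"))
    except UnicodeError:
        pass  # Not convertible to IDNA; only the raw form remains.

    # Second choice: the raw Unicode string, for non-Internet name lookup.
    if decoded not in candidates:
        candidates.append(decoded)
    return candidates


def resolve(reg_name, lookup):
    """Try each candidate with `lookup`, a caller-supplied resolver callable.

    Assumption: `lookup` raises OSError when a name cannot be resolved.
    """
    last_error = None
    for name in candidate_hostnames(reg_name):
        try:
            return lookup(name)
        except OSError as exc:
            last_error = exc  # Fall through to the next candidate.
    raise OSError(f"no candidate resolved for {reg_name!r}") from last_error
```

With this sketch, a pure-ASCII name produces a single candidate, while a name such as `b%C3%BCcher.example` produces the Punycode form first and the decoded Unicode form second. Whether the second lookup is possible at all depends on the resolver API accepting raw Unicode.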
Received on Wednesday, 2 September 2009 21:51:48 UTC