- From: Thomas Roessler <tlr@w3.org>
- Date: Wed, 2 Sep 2009 20:08:52 +0200
- To: Erik van der Poel <erikv@google.com>
- Cc: Larry Masinter <masinter@adobe.com>, "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>
On 2 Sep 2009, at 20:02, Erik van der Poel wrote:

> It is indeed strange to have "creating IRIs" in the middle of the
> section on mapping IRIs to URIs, but IDNA2003 implementations should
> never call ToASCII with AllowUnassigned set to TRUE because they have
> no way of knowing whether the unassigned character might become an
> upper-case character in the future (which would have to be mapped to
> lower-case then).

What would the guidance be for IDNA2009 implementations?

> Erik
>
> On Wed, Sep 2, 2009 at 10:32 AM, Thomas Roessler<tlr@w3.org> wrote:
>> On 2 Sep 2009, at 19:11, Larry Masinter wrote:
>>
>>> I'm still working on a draft that turns the MAY into a MUST for
>>> ireg-name processing; it winds up rewriting a lot of the
>>> document because it puts parsing before percent-encoding.
>>>
>>> I'd rather wait to discuss this until I have a draft ready
>>> (had hoped to finish yesterday).
>>>
>>> One section I've stumbled on is:
>>>
>>>   Systems accepting IRIs MAY convert the ireg-name component of an IRI
>>>   as follows (before step 2 above) for schemes known to use domain
>>>   names in ireg-name, if the scheme definition does not allow
>>>   percent-encoding for ireg-name: Replace the ireg-name part of the
>>>   IRI by the part converted using the ToASCII operation specified in
>>>   Section 4.1 of [RFC3490] on each dot-separated label, and by using
>>>   U+002E (FULL STOP) as a label separator, with the flag
>>>   UseSTD3ASCIIRules set to TRUE,
>>
>> Another point related to yours: UseSTD3ASCIIRules should be FALSE here.
>> Those are rules on the *registration* of domain names, and I don't see
>> what they have to do in a specification that effectively deals with
>> resolution.
>>
>> From a quick check using "_test0_α.does-not-exist.org" as a test case,
>> it seems like at least the latest Safari and Firefox don't set that
>> flag when trying to resolve an IRI reference.
>>
>> I did some archeology on the topic in March; the genesis of
>> UseSTD3ASCIIRules being TRUE goes back to this note from Martin:
>> http://www.imc.org/idn/mail-archive/msg07277.html
>>
>> ... which seems to be mistaken about the intent of some of the text in
>> the original URI spec.
>>
>>>   and with the flag AllowUnassigned set to FALSE for creating
>>>   IRIs and set to TRUE otherwise. The ToASCII operation may fail, but
>>>   this would mean that the IRI cannot be resolved. This conversion
>>>   SHOULD be used when the goal is to maximize interoperability with
>>>   legacy URI resolvers. For example, the IRI
>>>   "http://résumé.example.org"
>>>   may be converted to
>>>   "http://xn--rsum-bpad.example.org"
>>>   instead of
>>>   "http://r%C3%A9sum%C3%A9.example.org".
>>>
>>> Can someone explain the AllowedUnassigned set to FALSE for "creating
>>> IRIs"? This is in the middle of the algorithm for converting IRIs
>>> (which is turning into converting 'parsed IRI components' into
>>> 'parsed URI components'), but what is the applicability of
>>> 'creating IRIs' when doing this mapping anyway?
>>
>> I'd think none, i.e., AllowUnassigned should be TRUE in this spot, for
>> the very reason that you mention.
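
For reference, the two mappings of the ireg-name contrasted in the quoted draft text can be sketched roughly as follows. This is only an illustration, assuming Python's standard-library IDNA2003 codec (encodings.idna); the helper names are made up for this sketch, and that codec does not expose the UseSTD3ASCIIRules or AllowUnassigned flags, so it says nothing about which flag settings are right.

```python
from encodings import idna          # stdlib IDNA2003 (RFC 3490) implementation
from urllib.parse import quote


def ireg_name_to_ascii(ireg_name: str) -> str:
    """Apply ToASCII to each dot-separated label, rejoining with U+002E.

    Hypothetical helper name; raises UnicodeError if ToASCII fails on a label.
    """
    labels = ireg_name.split(".")
    return ".".join(idna.ToASCII(label).decode("ascii") for label in labels)


def ireg_name_percent_encoded(ireg_name: str) -> str:
    """Alternative mapping: percent-encode the UTF-8 bytes of the name.

    Hypothetical helper name; "." is left untouched as the label separator.
    """
    return quote(ireg_name, safe=".")


host = "r\u00e9sum\u00e9.example.org"
print("http://" + ireg_name_to_ascii(host))         # http://xn--rsum-bpad.example.org
print("http://" + ireg_name_percent_encoded(host))  # http://r%C3%A9sum%C3%A9.example.org
```

The two printed forms correspond to the example pair given in the draft text quoted above.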
Received on Wednesday, 2 September 2009 18:09:05 UTC