Re: query on iregname conversion

It is indeed strange to have "creating IRIs" in the middle of the
section on mapping IRIs to URIs. Still, IDNA2003 implementations should
never call ToASCII with AllowUnassigned set to TRUE, because they have
no way of knowing whether an unassigned character might become an
upper-case character in a future Unicode version (in which case it
would have to be mapped to lower case).
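
To make the quoted conversion step concrete, here is a rough sketch in
Python, not a reference implementation. It assumes the standard
library's encodings.idna.ToASCII (an IDNA2003 implementation that, as
far as I can tell, does not apply the STD3 ASCII rules) and uses
stringprep.in_table_a1 (RFC 3454 Table A.1, code points unassigned in
Unicode 3.2) to emulate AllowUnassigned = FALSE. The names
has_unassigned and ireg_name_to_ascii are made up for illustration.

```python
import stringprep
from encodings import idna


def has_unassigned(label: str) -> bool:
    """True if any character is unassigned in Unicode 3.2 (RFC 3454, Table A.1)."""
    return any(stringprep.in_table_a1(ch) for ch in label)


def ireg_name_to_ascii(ireg_name: str, allow_unassigned: bool = False) -> str:
    """Apply ToASCII to each dot-separated label and rejoin with U+002E.

    allow_unassigned is a hypothetical stand-in for the AllowUnassigned
    flag: when False, labels containing unassigned code points are
    rejected before ToASCII is called.
    """
    converted = []
    for label in ireg_name.split("."):
        if not allow_unassigned and has_unassigned(label):
            raise ValueError(f"unassigned code point in label {label!r}")
        # ToASCII returns bytes; all-ASCII labels pass through unchanged.
        converted.append(idna.ToASCII(label).decode("ascii"))
    return ".".join(converted)


if __name__ == "__main__":
    # The example from the quoted text.
    print(ireg_name_to_ascii("r\u00e9sum\u00e9.example.org"))
    # U+1E9E was unassigned in Unicode 3.2 and later became LATIN CAPITAL
    # LETTER SHARP S, i.e. an upper-case character.
    print(has_unassigned("\u1e9e"))
```

U+1E9E is the kind of case I mean: an implementation that let it
through under AllowUnassigned = TRUE could not have known it would
later need case mapping.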

Erik

On Wed, Sep 2, 2009 at 10:32 AM, Thomas Roessler <tlr@w3.org> wrote:
> On 2 Sep 2009, at 19:11, Larry Masinter wrote:
>
>> I'm still working on a draft that turns the MAY into a MUST for
>> ireg-name processing; it winds up rewriting a lot of the
>> document because it puts parsing before percent-encoding.
>>
>> I'd rather wait to discuss this until I have a draft ready
>> (had hoped to finish yesterday).
>>
>> One section I've stumbled on is:
>>
>>
>>  Systems accepting IRIs MAY convert the ireg-name component of an IRI
>>  as follows (before step 2 above) for schemes known to use domain
>>  names in ireg-name, if the scheme definition does not allow percent-
>>  encoding for ireg-name: Replace the ireg-name part of the IRI by the
>>  part converted using the ToASCII operation specified in Section 4.1
>>  of [RFC3490] on each dot-separated label, and by using U+002E (FULL
>>  STOP) as a label separator, with the flag UseSTD3ASCIIRules set to
>>  TRUE,
>
>
> Another point related to yours: UseSTD3ASCIIRules should be FALSE here.
> Those are rules on the *registration* of domain names, and I don't see what
> they are doing in a specification that effectively deals with resolution.
>
> From a quick check using "_test0_α.does-not-exist.org" as a test case, it
> seems like at least the latest Safari and Firefox don't set that flag when
> trying to resolve an IRI reference.
>
> I did some archeology on the topic in March; the genesis of
> UseSTD3ASCIIRules being TRUE goes back to this note from Martin:
>   http://www.imc.org/idn/mail-archive/msg07277.html
>
> ... which seems to be mistaken about the intent of some of the text in the
> original URI spec.
>
>>
>>  and with the flag AllowUnassigned set to FALSE for creating
>>  IRIs and set to TRUE otherwise.  The ToASCII operation may fail, but
>>  this would mean that the IRI cannot be resolved.  This conversion
>>  SHOULD be used when the goal is to maximize interoperability with
>>  legacy URI resolvers.  For example, the IRI
>>  "http://r&#xE9;sum&#xE9;.example.org"
>>  may be converted to
>>  "http://xn--rsum-bpad.example.org"
>>  instead of
>>  "http://r%C3%A9sum%C3%A9.example.org".
>>
>>
>> Can someone explain why AllowUnassigned is set to FALSE for "creating
>> IRIs"?  This is in the middle of the algorithm for converting IRIs
>> (which is turning into converting 'parsed IRI components' into
>> 'parsed URI components'), but what is the applicability of
>> 'creating IRIs' when doing this mapping anyway?
>
> I'd think none, i.e., AllowUnassigned should be TRUE in this spot, for the
> very reason that you mention.
>
>
>

Received on Wednesday, 2 September 2009 18:03:17 UTC