Re: query on iregname conversion

On 2 Sep 2009, at 20:02, Erik van der Poel wrote:

> It is indeed strange to have "creating IRIs" in the middle of the
> section on mapping IRIs to URIs, but IDNA2003 implementations should
> never call ToASCII with AllowUnassigned set to TRUE because they have
> no way of knowing whether the unassigned character might become an
> upper-case character in the future (which would have to be mapped to
> lower-case then).

What would the guidance be for IDNA2009 implementations?
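
For reference, the IDNA2003 conversion discussed in the quoted text below can be sketched roughly as follows. This is only an illustration, assuming Python and its standard-library encodings.idna module, whose documentation says ToASCII assumes UseSTD3ASCIIRules is false and nameprep assumes AllowUnassigned is true. The helper name ireg_name_to_ascii is made up for this sketch, and it only splits on U+002E rather than on every dot-like separator.

```python
from encodings import idna  # standard-library IDNA2003 (RFC 3490) helpers


def ireg_name_to_ascii(ireg_name: str) -> str:
    """Apply ToASCII to each dot-separated label and rejoin with U+002E.

    Hypothetical helper for illustration; not taken from any spec or browser.
    """
    labels = ireg_name.split(".")
    # idna.ToASCII returns bytes; it raises UnicodeError if a label cannot
    # be converted (for example an empty or over-long label).
    ascii_labels = [idna.ToASCII(label).decode("ascii") for label in labels]
    return "\u002e".join(ascii_labels)


if __name__ == "__main__":
    # The example from the quoted spec text.
    print(ireg_name_to_ascii("r\u00e9sum\u00e9.example.org"))
    # Underscores pass because this ToASCII does not apply STD3 rules.
    print(ireg_name_to_ascii("_test0_\u03b1.does-not-exist.org"))
```

If ToASCII raises, the ireg-name cannot be converted this way, which matches the quoted statement that the IRI then cannot be resolved.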

> Erik
>
> On Wed, Sep 2, 2009 at 10:32 AM, Thomas Roessler <tlr@w3.org> wrote:
>> On 2 Sep 2009, at 19:11, Larry Masinter wrote:
>>
>>> I'm still working on a draft that turns the MAY into a MUST for
>>> ireg-name processing; it winds up rewriting a lot of the
>>> document because it puts parsing before percent-encoding.
>>>
>>> I'd rather wait to discuss this until I have a draft ready
>>> (had hoped to finish yesterday).
>>>
>>> One section I've stumbled on is:
>>>
>>>
>>>  Systems accepting IRIs MAY convert the ireg-name component of an IRI
>>>  as follows (before step 2 above) for schemes known to use domain
>>>  names in ireg-name, if the scheme definition does not allow
>>>  percent-encoding for ireg-name: Replace the ireg-name part of the IRI
>>>  by the part converted using the ToASCII operation specified in
>>>  Section 4.1 of [RFC3490] on each dot-separated label, and by using
>>>  U+002E (FULL STOP) as a label separator, with the flag
>>>  UseSTD3ASCIIRules set to TRUE,
>>
>>
>> Another point related to yours: UseSTD3ASCIIRules should be FALSE here.
>> Those are rules on the *registration* of domain names, and I don't see what
>> they have to do in a specification that effectively deals with resolution.
>>
>> From a quick check using "_test0_α.does-not-exist.org" as a test case, it
>> seems like at least the latest Safari and Firefox don't set that flag when
>> trying to resolve an IRI reference.
>>
>> I did some archeology on the topic in March; the genesis of
>> UseSTD3ASCIIRules being TRUE goes back to this note from Martin:
>>   http://www.imc.org/idn/mail-archive/msg07277.html
>>
>> ... which seems to be mistaken about the intent of some of the text in the
>> original URI spec.
>>
>>>
>>>  and with the flag AllowUnassigned set to FALSE for creating
>>>  IRIs and set to TRUE otherwise.  The ToASCII operation may fail, but
>>>  this would mean that the IRI cannot be resolved.  This conversion
>>>  SHOULD be used when the goal is to maximize interoperability with
>>>  legacy URI resolvers.  For example, the IRI
>>>  "http://r&#xE9;sum&#xE9;.example.org"
>>>  may be converted to
>>>  "http://xn--rsum-bpad.example.org"
>>>  instead of
>>>  "http://r%C3%A9sum%C3%A9.example.org".
>>>
>>>
>>> Can someone explain the AllowUnassigned set to FALSE for "creating
>>> IRIs"?  This is in the middle of the algorithm for converting IRIs
>>> (which is turning into converting 'parsed IRI components' into
>>> 'parsed URI components'), but what is the applicability of
>>> 'creating IRIs' when doing this mapping anyway?
>>
>> I'd think none, i.e., AllowUnassigned should be TRUE in this spot, for the
>> very reason that you mention.
>>
>>
>>
>

Received on Wednesday, 2 September 2009 18:09:05 UTC