Re: query on iregname conversion

I assume you are referring to IDNA2008:

http://tools.ietf.org/html/draft-ietf-idnabis-defs-10
http://tools.ietf.org/html/draft-ietf-idnabis-protocol-15
http://tools.ietf.org/html/draft-ietf-idnabis-tables-06
http://tools.ietf.org/html/draft-ietf-idnabis-bidi-05
http://tools.ietf.org/html/draft-ietf-idnabis-mappings-03
http://tools.ietf.org/html/draft-ietf-idnabis-rationale-11

The above documents leave many of the mapping questions open. There
are some discussions about a tighter mapping spec.

My opinion: IDNA implementations should never convert to Punycode if
the string contains any unassigned characters. If the incoming label
is already in Punycode, the implementation may look it up in DNS, even
if the de-Punycoded string would contain an unassigned character.
(Stricter implementations might refuse to look such labels up.)
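
A minimal sketch of that policy in Python, for illustration only. The
function names are hypothetical. It assumes that general category "Cn"
in the running interpreter's Unicode database is an acceptable stand-in
for "unassigned" (Cn also covers noncharacters, and the answer depends
on the Unicode version shipped with Python). It also uses Python's
built-in "idna" codec, which implements the IDNA2003 ToASCII operation
rather than IDNA2008, purely as a placeholder for the conversion step.

```python
import unicodedata

ACE_PREFIX = "xn--"  # prefix of labels that are already in Punycode form


def has_unassigned(label):
    """True if any code point is unassigned (category Cn) in this Unicode DB."""
    return any(unicodedata.category(ch) == "Cn" for ch in label)


def label_for_lookup(label):
    """Return the ASCII label to look up in DNS, or None to refuse.

    Hypothetical helper: a stricter implementation might also decode an
    incoming xn-- label and refuse it if the result has unassigned characters.
    """
    if label.lower().startswith(ACE_PREFIX):
        # Already Punycode: pass it through without decoding.
        return label
    if has_unassigned(label):
        # Never convert a label with unassigned characters to Punycode.
        return None
    # Stand-in conversion (IDNA2003 ToASCII); may raise UnicodeError.
    return label.encode("idna").decode("ascii")
```

Here a label such as "a" followed by U+0378 (unassigned at the time of
writing) would be refused, while an incoming "xn--" label is passed
through untouched.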

Erik

On Wed, Sep 2, 2009 at 11:08 AM, Thomas Roessler<tlr@w3.org> wrote:
> On 2 Sep 2009, at 20:02, Erik van der Poel wrote:
>
>> It is indeed strange to have "creating IRIs" in the middle of the
>> section on mapping IRIs to URIs, but IDNA2003 implementations should
>> never call ToASCII with AllowUnassigned set to TRUE because they have
>> no way of knowing whether the unassigned character might become an
>> upper-case character in the future (which would have to be mapped to
>> lower-case then).
>
> What would the guidance be for IDNA2009 implementations?
>
>> Erik
>>
>> On Wed, Sep 2, 2009 at 10:32 AM, Thomas Roessler<tlr@w3.org> wrote:
>>>
>>> On 2 Sep 2009, at 19:11, Larry Masinter wrote:
>>>
>>>> I'm still working on a draft that turns the MAY into a MUST for
>>>> ireg-name processing; it winds up rewriting a lot of the
>>>> document because it puts parsing before percent-encoding.
>>>>
>>>> I'd rather wait to discuss this until I have a draft ready
>>>> (had hoped to finish yesterday).
>>>>
>>>> One section I've stumbled on is:
>>>>
>>>>
>>>>  Systems accepting IRIs MAY convert the ireg-name component of an IRI
>>>>  as follows (before step 2 above) for schemes known to use domain
>>>>  names in ireg-name, if the scheme definition does not allow percent-
>>>>  encoding for ireg-name: Replace the ireg-name part of the IRI by the
>>>>  part converted using the ToASCII operation specified in Section 4.1
>>>>  of [RFC3490] on each dot-separated label, and by using U+002E (FULL
>>>>  STOP) as a label separator, with the flag UseSTD3ASCIIRules set to
>>>>  TRUE,
>>>
>>>
>>> Another point related to yours: UseSTD3ASCIIRules should be FALSE here.
>>> Those are rules on the *registration* of domain names, and I don't see
>>> what they have to do in a specification that effectively deals with
>>> resolution.
>>>
>>> From a quick check using "_test0_α.does-not-exist.org" as a test case, it
>>> seems like at least the latest Safari and Firefox don't set that flag
>>> when trying to resolve an IRI reference.
>>>
>>> I did some archeology on the topic in March; the genesis of
>>> UseSTD3ASCIIRules being TRUE goes back to this note from Martin:
>>>  http://www.imc.org/idn/mail-archive/msg07277.html
>>>
>>> ... which seems to be mistaken about the intent of some of the text in
>>> the original URI spec.
>>>
>>>>
>>>> and with the flag AllowUnassigned set to FALSE for creating
>>>>  IRIs and set to TRUE otherwise.  The ToASCII operation may fail, but
>>>>  this would mean that the IRI cannot be resolved.  This conversion
>>>>  SHOULD be used when the goal is to maximize interoperability with
>>>>  legacy URI resolvers.  For example, the IRI
>>>>  "http://r&#xE9;sum&#xE9;.example.org"
>>>>  may be converted to
>>>>  "http://xn--rsum-bpad.example.org"
>>>>  instead of
>>>>  "http://r%C3%A9sum%C3%A9.example.org".
>>>>
>>>>
>>>> Can someone explain the AllowUnassigned set to FALSE for "creating
>>>> IRIs"?  This is in the middle of the algorithm for converting IRIs
>>>> (which is turning into converting 'parsed IRI components' into
>>>> 'parsed URI components'), but what is the applicability of
>>>> 'creating IRIs' when doing this mapping anyway?
>>>
>>> I'd think none, i.e., AllowUnassigned should be TRUE in this spot, for
>>> the very reason that you mention.

Received on Wednesday, 2 September 2009 19:04:43 UTC