Re: Standardizing on IDNA 2003 in the URL Standard

From: Mark Davis ☕ <mark@macchiato.com>
Date: Fri, 23 Aug 2013 13:27:05 +0200
Message-ID: <CAJ2xs_EmpsKbqU++VGH7FO+D5m-1pGRfikabzYq8Tmc1=4pjwA@mail.gmail.com>
To: Vint Cerf <vint@google.com>
Cc: Jungshik SHIN (신정식) <jshin1987+w3@gmail.com>, Anne van Kesteren <annevk@annevk.nl>, Gervase Markham <gerv@mozilla.org>, Shawn Steele <Shawn.Steele@microsoft.com>, IDNA update work <idna-update@alvestrand.no>, "PUBLIC-IRI@W3.ORG" <public-iri@w3.org>, "uri@w3.org" <uri@w3.org>, John C Klensin <klensin@jck.com>, Peter Saint-Andre <stpeter@stpeter.im>, Marcos Sanz <sanz@denic.de>, "www-tag.w3.org" <www-tag@w3.org>
All 4 of the deviation characters are important and deserve support, and I
agree that they are a priority.

For the transition, TR46 supports the desired display for users, which is
far more important than distinguishing two different sites. That is, if you
type "größer.at" you'll see "größer.at" as the display in your address bar,
and if you type "grösser.at" you'll see "grösser.at" as the display. Both
will go to the same IDNA2003 address (http://xn--grsser-xxa.at) until we can
flip off the transitional bit.
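
A minimal sketch of the IDNA2003 side of this, assuming Python's built-in
"idna" codec, which implements IDNA2003 (RFC 3490) with Nameprep and so maps
ß to ss before Punycode encoding. The helper name to_idna2003 is made up for
this illustration. It shows only the address that gets looked up, not the
TR46 display behaviour in the address bar.

```python
def to_idna2003(host: str) -> str:
    """Return the ASCII form of a host name under IDNA2003 rules."""
    # The stdlib "idna" codec applies Nameprep to each label, then Punycode.
    return host.encode("idna").decode("ascii")


for typed in ("größer.at", "grösser.at"):
    print(typed, "->", to_idna2003(typed))
```

Both inputs come out as xn--grsser-xxa.at, which is the shared address
mentioned above.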

(Note that we might be able to get general agreement among major clients to
support these on a per-TLD basis. So if .AT bundled or blocked ß in all of
its subdomains, then ß could be allowed in any .at domain name.)
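
One way a client might express such a per-TLD switch, as a rough sketch
only. The set BUNDLED_TLDS and the function map_deviations are hypothetical
names, and listing "at" is purely illustrative, not a statement about any
registry's actual policy. The mapping table follows TR46 transitional
processing of the 4 deviation characters (ß to ss, final sigma to sigma,
ZWJ and ZWNJ removed); the other TR46 steps, such as case folding and
normalization, are left out.

```python
# TR46 transitional mappings for the 4 deviation characters.
DEVIATION_MAP = {
    "\u00df": "ss",      # sharp s
    "\u03c2": "\u03c3",  # final sigma -> sigma
    "\u200c": "",        # zero width non-joiner, removed
    "\u200d": "",        # zero width joiner, removed
}

# Hypothetical list of TLDs whose registries bundle or block these characters.
BUNDLED_TLDS = frozenset({"at"})


def map_deviations(host: str, bundled_tlds=BUNDLED_TLDS) -> str:
    """Apply the transitional deviation mapping unless the TLD is listed."""
    tld = host.rstrip(".").rsplit(".", 1)[-1].lower()
    if tld in bundled_tlds:
        # Transitional bit off for this TLD: keep the characters as typed.
        return host
    return "".join(DEVIATION_MAP.get(ch, ch) for ch in host)


print(map_deviations("größer.at"))  # TLD listed, left unchanged
print(map_deviations("größer.de"))  # TLD not listed, ß mapped to ss
```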


Mark <https://plus.google.com/114199149796022210033>
*— Il meglio è l’inimico del bene —*


On Fri, Aug 23, 2013 at 1:01 PM, Vint Cerf <vint@google.com> wrote:

> Mark,
>
> thanks for the refinement. Is it possible that the browser makers would
> agree to discuss a plan and schedule to achieve these various objectives?
> My impression regarding the 4 deviation characters is that the Arabic users
> would benefit from IDNA2008 treatment of the zero width characters so that
> seems to have some priority. The sharp-S continues to foster debate but it
> is a legitimate character and is used in normal cases and seems to deserve
> support. The trailing sigma question continues to produce controversy
> although the IDNA2008 committee finally concluded it deserved support.
>
> vint
>
>
> On Fri, Aug 23, 2013 at 6:19 AM, Mark Davis ☕ <mark@macchiato.com> wrote:
>
>> There are two different issues.
>>
>> A. The mapping is purely a client-side issue, and is allowed by IDNA2008.
>> So that is not a problem for compatibility.
>>
>> The most important feature of 'no mapping' IMO is on the registry side:
>> to make certain that registries either disallow mapping during the
>> registration process, or that they very clearly show that the resulting
>> domain name is different than what the user typed. While an orthogonal
>> issue to the client-side we're discussing here, it is worth a separate
>> initiative.
>>
>>
>> B. The transitional incompatibilities are:
>>
>>    1. Non-letter support
>>    2. 4 deviation characters
>>
>> Both of these are just dependent on registry adoption. The faster that
>> happens, the shorter the transition period can be. Note the transition for
>> each of these is independent, and can proceed on a different timescale.
>> Moreover, terminating the transition period doesn't need all registries to
>> buy in.
>>
>>    1. The TR46 non-letter support can be dropped in clients once the
>>    major registries disallow non-IDNA2008 URLs. I say URLs, because the
>>    registries need to not only disallow them in SLDs (e.g. http://☃.com),
>>    they *also* need to forbid their subregistries from having them in
>>    Nth-level domains (that is, disallow http://☃.blogspot.ch/ =
>>    xn--n3h.blogspot.ch).
>>    2. The TR46 deviation character support can be dropped in clients
>>    once the major registries that allow them provide a bundle or block
>>    approach to labels that include them, so that new clients can be guaranteed
>>    that URLs won't go to a different location than they would under
>>    IDNA2003. The bundle/block needs to last while there are a significant
>>    number of IDNA2003 clients out in the world. Because newer browsers have
>>    automatic updates, this can be far faster than it would have been a few
>>    years ago.
>>
>>
>>
>> Mark <https://plus.google.com/114199149796022210033>
>> *— Il meglio è l’inimico del bene —*
>>
>>
>> On Fri, Aug 23, 2013 at 11:49 AM, Vint Cerf <vint@google.com> wrote:
>>
>>>
>>> If we go down the IDNA2008 + TR46 path, I think we ought to be very
>>> explicit about a date certain to drop TR46 treatment so as to eliminate
>>> mapping and to instantiate the uniqueness properties of IDNA2008. Is that
>>> possible and what timeframe makes sense? The longer we wait, the harder it
>>> will be to get there.
>>>
>>> v
>>>
>>>
>>>
>>>
>>
>
Received on Friday, 23 August 2013 11:27:37 UTC