
RE: Standardizing on IDNA 2003 in the URL Standard

From: Shawn Steele <Shawn.Steele@microsoft.com>
Date: Thu, 30 Jan 2014 18:36:03 +0000
To: John C Klensin <klensin@jck.com>, Anne van Kesteren <annevk@annevk.nl>, Mark Davis ☕ <mark@macchiato.com>
CC: Bjoern Hoehrmann <derhoermi@gmx.net>, Andrew Sullivan <ajs@anvilwalrusden.com>, "PUBLIC-IRI@W3.ORG" <public-iri@w3.org>, "uri@w3.org" <uri@w3.org>, IDNA update work <idna-update@alvestrand.no>, www-tag.w3.org <www-tag@w3.org>
Message-ID: <b9ffddc047af4d1a8d73cd6c46512296@BY2PR03MB491.namprd03.prod.outlook.com>
I'd agree with John that next steps are to review & incorporate feedback.

IMO, IDNA2008 didn't give enough consideration to the serious security and compatibility concerns that were raised, which is a large part of why UTS46 exists, and why it exists with the transitional form.

DNS provides a human mnemonic for connecting to machines.  There's nothing, even in IDNA2003, preventing the perfectly linguistically correct strings in dispute from resolving to an appropriate IP address or whatever for those machine(s).  To get to those machines, some of those strings are mapped, but you can still get there.  What is missing is the ability to differentiate between some names with close spelling; however, DNS never guaranteed that, nor does it provide it, even in ASCII.  DNS also doesn't guarantee you'll "get" your name, because someone else may have it.  (I imagine there are quite a few aaalocksmiths out there.)

What seems to bother some people is that the canonical form doesn't match the input form.  If I type "Microsoft.com", I land at "microsoft.com"; however, Microsoft is a proper noun and should be capitalized (as should Google or Apple).  Clearly there's no guarantee that the canonical DNS label is linguistically accurate.  Personally, I'd like to see a DNS record that said "the pretty form for this label is xxxxxx", but that is unlikely for many reasons.

Indeed, one of the original arguments was that the two forms needed to be distinct so that two names could be sold to two different customers, since they were linguistically different words.  However, the recent discussion around resolving this issue assumes that "bundling" or "blocking" would be acceptable to resolve the security concerns.  Yet that bundling conflicts with the original ask for two distinct names.  If we don't want the names to be distinct (e.g., they have to be bundled), we already have that for these code points with IDNA2003.
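
To make the mapping point above concrete, here is a minimal sketch using Python's built-in "idna" codec, which is documented as implementing IDNA2003 (RFC 3490 with Nameprep).  It assumes the code points in dispute are the four that UTS46 treats as deviations (sharp s, final sigma, ZWJ and ZWNJ); the helper name and the domain names are hypothetical and only for illustration.

```python
# Sketch only: Python's built-in "idna" codec follows IDNA2003 (RFC 3490 + Nameprep).
# The helper name and all domain names are hypothetical, not taken from the thread.

def idna2003_to_ascii(name: str) -> bytes:
    """Return the IDNA2003 ASCII form of a domain name."""
    return name.encode("idna")

# Sharp s is mapped to "ss", so the "pretty" spelling still reaches the same name.
print(idna2003_to_ascii("straße.example"))    # b'strasse.example'

# Final sigma is folded to ordinary sigma before Punycode encoding,
# so both spellings produce the same ASCII label.
print(idna2003_to_ascii("βόλος.example") == idna2003_to_ascii("βόλοσ.example"))  # True

# ZWNJ (U+200C) and ZWJ (U+200D) are mapped to nothing.
print(idna2003_to_ascii("a\u200cb.example"))  # b'ab.example'
```

In each case the two spellings collapse to a single name, which is in effect the "bundled" outcome.  Under IDNA2008 these characters are permitted in labels, so the paired spellings would encode as different names.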

-Shawn

-----Original Message-----
From: idna-update-bounces@alvestrand.no [mailto:idna-update-bounces@alvestrand.no] On Behalf Of John C Klensin
Sent: Thursday, January 30, 2014 6:57 AM
To: Anne van Kesteren
Cc: Bjoern Hoehrmann; Andrew Sullivan; PUBLIC-IRI@W3.ORG; uri@w3.org; IDNA update work; www-tag.w3.org
Subject: Re: Standardizing on IDNA 2003 in the URL Standard



--On Wednesday, January 29, 2014 17:39 -0800 Anne van Kesteren <annevk@annevk.nl> wrote:

> ...
> However, I think I have been convinced by this thread that UTS
> #46 might be good enough as replacement for IDNA2003. Once it has
> been clarified per the feedback I submitted I will incorporate it in
> the URL Standard. It's unfortunate that even
> #46 is implemented in different ways. :-(

This seems to me to be good progress; thanks.

The next task is probably to do what we discussed in August and then didn't follow up on:

(1) Review and, as appropriate, incorporate your feedback

(2) Recast UTR46 where necessary as an IDNA2008-based document with transition features _to_ it, rather than as an IDNA2003-based one with transition or preservation features _from_ it.

(3) It seems to me that that recasting includes making recommendations about transition conditions, even if only to more clearly state realistic considerations.  That, in turn, requires avoiding conditions like "when most of the registries have adopted recommended policies", if only because "most" is impossible to measure.  It also requires recognizing that the decision to change the handling of some previously-mapped-out characters (both the joiner subset of the formerly mapped-to-nothing group and some case-folding issues) was as much a conscious decision of some major registries in consultation with important language and writing system communities as of the IETF.  As a result, that decision deserves to be treated with more respect than proposed policies that would prevent those decisions from ever being useful would imply.

best,
    john



Received on Thursday, 30 January 2014 18:36:54 UTC
