Re: Standardizing on IDNA 2003 in the URL Standard from John C Klensin on 2013-08-23 (www-tag@w3.org from August 2013)

From: John C Klensin <klensin@jck.com>
Date: Fri, 23 Aug 2013 11:46:41 -0400
To: Gervase Markham <gerv@mozilla.org>, John Cowan <cowan@mercury.ccil.org>
cc: Vint Cerf <vint@google.com>, "Jungshik SHIN (신정식)" <jshin1987+w3@gmail.com>, Anne van Kesteren <annevk@annevk.nl>, IDNA update work <idna-update@alvestrand.no>, "PUBLIC-IRI@W3.ORG" <public-iri@w3.org>, uri@w3.org, "www-tag.w3.org" <www-tag@w3.org>
Message-ID: <606647274ECE5AD80E8508E1@JcK-HP8200.jck.com>

--On Friday, August 23, 2013 14:15 +0100 Gervase Markham
<gerv@mozilla.org> wrote:

> On 23/08/13 11:19, Mark Davis ☕ wrote:
>>  1. The TR46 non-letter support can be dropped in clients
>>  once the major registries disallow non-IDNA2008 URLs. I say
>>     URLs, because the registries need to not only disallow
>>     them in SLDs (eg http://☃.com), they /also/ need to
>>     forbid their subregistries from having them in Nth-level
>>     domains (that is, disallow http://☃.blogspot.ch/
>>     <http://blogspot.ch/> = xn--n3h.blogspot.ch
>>     <http://xn--n3h.blogspot.ch>).
> 
> This is not my area of expertise, but I am not aware of a
> registry which attempts to define by contract what their
> customers may or may not put into the DNS "below" the domain
> they have purchased.

Gerv,

At least historically, I am aware of such registries.  In the
pre-ICANN period, Section 3 of RFC 1591 contained the statement

 "Most of these same concerns are relevant when a
 sub-domain is delegated and in general the principles
 described here apply recursively to all delegations of
 the Internet DNS name space."

which was intended to make the sort of relationship we need here
just about mandatory.  My vague recollection is that ICANN, in
its early days, attempted to impose similar "recursive
application" requirements in its contracts with registries.
That effort largely foundered for the delegation-only
registries that are probably a superset of what Mark considers
"major" because of the difficulties with imposed requirements
and enforcement (especially with ccTLDs but, in practice, with
many gTLDs as well).  I note in particular that, as far as I can
tell, the Applicant Guidebook does not impose any such
requirement on current-round new gTLD applicants, implying that
it is already too late to effectively "forbid" much of anything.

On the other hand, within enterprise-level domains (those whose
subdomains make up the FQDN case), my experience has been that
naming conventions and restrictions of various sorts are both
common and enforced.  Certainly not in all cases, but in enough
to be significant.

> The way to make such domains not exist is for them to first
> not work in browsers; I'm not sure we can do it the other way
> around.

That is precisely the chicken-and-egg problem I referred to in
my earlier note.  If nothing else, a browser-first approach has
the advantage of having to convince under a dozen implementer
communities while getting most registries (including zone
administrators deep in the tree) to behave in a particular way
requires convincing perhaps hundreds of millions of entities,
most of whom are not following these lists (and most of whom
don't care about issues that extend beyond their local languages
and scripts).

--On Friday, August 23, 2013 10:19 -0400 John Cowan
<cowan@mercury.ccil.org> wrote:

> Mark Davis ☕ scripsit:
> 
>> *also* need to forbid their subregistries from having them in
>> Nth-level domains
>>    (that is, disallow http://☃.blogspot.ch/ =
>>    xn--n3h.blogspot.ch).
> 
> Through what technical or social means would that be arranged?
> TLD registries have never had any control over their
> subregistries' use of names that I know of; I should think it
> would have to be implemented by contract between the registry
> and the subregistry, and many existing subregistries might
> well balk.

As noted above, "never" is too strong.  But, yes, in today's
world, contracts would be required and we already have empirical
experience with the "balking" part.

As Gerv suggests, the most effective mechanism involves
developers of browsers (and other applications that use the DNS)
making the transition in some way and thereby causing bottom-up
pressure on registries to avoid doing things that can cause name
conflicts or reference ambiguity.  

That could take many forms.  While I hate the idea of requiring
dual lookups, a browser that wished to be extra-careful about
conflicting names could look up both interpretations and, if it
found more than one (or two with different RR Sets), could
reasonably come back to the user with a "this may be a problem
or an attack, which one did you really want?" message, perhaps
even with an "if you don't like this story, complain to the
registry".  Especially for FQDNs, that would be far more
effective and more reliable than waiting on the registries and
hoping that they are all doing what we would like them to do.
(Yes, I understand the implementation problems with this,
especially when there is no guarantee that all zones in a
particular tree will consistently use one interpretation.  But
it would still be easier than convincing millions (or hundreds
of millions) of zone administrators and tabulating their status.)
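
As a rough illustration of the dual-lookup idea above, here is a
minimal Python sketch.  It is not a description of any browser's
actual behavior.  It assumes the caller has already computed the two
ASCII forms of the name (one per interpretation) and supplies its own
resolver; the names check_dual_lookup, Resolver, FAKE_ZONE and
fake_resolve, and the example hostnames and addresses, are all
hypothetical.  It follows the stricter reading of the parenthetical:
warn only when both names resolve and their record sets differ.

```python
from typing import Callable, FrozenSet

# Hypothetical resolver type: maps an ASCII hostname to the set of address
# records found for it, or to an empty set if the name does not resolve.
Resolver = Callable[[str], FrozenSet[str]]


def check_dual_lookup(name_2003: str, name_2008: str, resolve: Resolver) -> str:
    """Classify the result of looking up both interpretations of a name.

    Returns one of:
      "same-name"   both interpretations give the same ASCII name
      "conflict"    both names resolve, but to different record sets
      "consistent"  both names resolve to the same record set
      "single"      only one of the two names resolves
      "unresolved"  neither name resolves
    """
    if name_2003 == name_2008:
        return "same-name"  # nothing to disambiguate, one lookup suffices

    rr_2003 = resolve(name_2003)
    rr_2008 = resolve(name_2008)

    if rr_2003 and rr_2008:
        # Both exist: a mismatch is the case where the user should be asked
        # "which one did you really want?"
        return "consistent" if rr_2003 == rr_2008 else "conflict"
    if rr_2003 or rr_2008:
        return "single"
    return "unresolved"


# Stand-in for real DNS so the sketch runs without network access.
# The addresses come from the documentation-only ranges.
FAKE_ZONE = {
    "fass.example": frozenset({"192.0.2.1"}),
    "xn--fa-hia.example": frozenset({"198.51.100.7"}),
}


def fake_resolve(name: str) -> FrozenSet[str]:
    return FAKE_ZONE.get(name, frozenset())


# Two interpretations of the same user input resolving to different hosts.
print(check_dual_lookup("fass.example", "xn--fa-hia.example", fake_resolve))
# prints "conflict"
```

A real client would substitute an actual DNS query for fake_resolve
and would still face the problem noted above: zones along the tree may
not consistently use one interpretation, so a "single" or "consistent"
result is weaker evidence than it looks.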

best,
    john

Received on Friday, 23 August 2013 15:47:30 UTC