Re: IDNA and IRI document way forward from Vint Cerf on 2009-07-29 (uri@w3.org from July 2009)

From: Vint Cerf <vint@google.com>
Date: Wed, 29 Jul 2009 05:09:04 -0400
To: Larry Masinter <masinter@adobe.com>
Cc: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>, URI <uri@w3.org>, "John Klensin (klensin@jck.com)" <klensin@jck.com>
Message-Id: <4769BE2D-DBCF-4F02-9E7E-39859399CBCE@google.com>

I do not believe it is possible under IDNA2008 (nor under IDNA2003?)
to register percent-escaped strings in the domain name universe.
Double-checking.

we did not deserve that last comment - but it was funny anyway.

v

On Jul 29, 2009, at 3:07 AM, Larry Masinter wrote:

> I confess that I'm just coming back up to speed on the
> issues, and hope you'll forgive me for missing some of
> the history.
>
> It seems there are at least two communities (IDN/IDNA and
> IRI/WEB) which should have been working together for
> the past many years, haven't been, and we're now facing
> some difficulties in bringing their perspectives together,
> especially when those perspectives have been built
> into long-standing and finely argued documents.
>
> I'm not entirely sure of the use case and difficulties,
> which I will try to track down in more detail.
>
> Just as personal speculation, however,
> I could easily imagine some problems if it were
> possible to register domain names which actually
> contained percent-hex-hex sequences.
>
> www.%77%33.org vs www.w3.org?
>
> Perhaps that would be a problem not just for IRIs
> but for other kinds of processing too.  Can this
> be disallowed at the URI parsing level? Only at
> the IRI level?
>
> I see the difficulties of creating a provision for
> scheme-specific parsing and restrictions on host names
> containing %xx hex-encoded bytes in URIs are even
> greater than what I imagined.
>
>
>> That would be
>> http://validator.w3.org/check?uri=http://恵比寿駅.jp/
>
> I'm sure there are difficulties even in circumstances that
> don't use "?", but this is especially difficult since the
> HTML-URL/HREF/WebAddress handling of non-ASCII query parameters
> adds some ambiguity to the translation of this into URI space.
>
>> It's very clearly impossible to rule this out.
>
> Difficult, but not impossible.
>
>> But even before that, doing scheme-wise processing
>> kills the U in URIs.
>
> And the I in Internationalized and several other things. Let's
> stick to identifying issues and alternatives.
>
>> I think this is unfortunate and a pretty drastic change
>> to the IRI document, but I don't think we're going to make
>> progress if we don't take the bull by the horns.
>
>> Before taking anything by the horns (or the tail, or whatever) I'd like
>> to know in great detail what exactly the actual (or pretended) bull is.
>
> And if there are two bulls there would be
> four horns, a Tetralemma, and quite a bit of BS.
>
> Larry
> -- 
> http://larry.masinter.net
>
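
To make the www.%77%33.org example quoted above concrete, here is a minimal
sketch of why percent-hex-hex sequences in a host are awkward: once decoded,
the two spellings name the same host. The helper names has_percent_escape and
decode_host are hypothetical and used only for illustration; they are not
taken from any IDNA or IRI document, and the sketch does not implement IDNA
processing of any kind.

```python
import re
from urllib.parse import unquote

# Matches one percent-hex-hex triplet such as %77 or %3a.
PERCENT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def has_percent_escape(host: str) -> bool:
    """Return True if the host string contains any %xx sequence."""
    return PERCENT_ESCAPE.search(host) is not None


def decode_host(host: str) -> str:
    """Percent-decode a host and lowercase it (illustrative assumption,
    not a full URI or IRI normalization)."""
    return unquote(host).lower()


escaped = "www.%77%33.org"
plain = "www.w3.org"

print(has_percent_escape(escaped))          # the escaped form is detectable
print(decode_host(escaped) == plain)        # both spellings decode to one host
```

A registry or parser that wanted to rule such names out could reject any host
for which has_percent_escape is true before doing anything else; whether that
check belongs at the URI level, the IRI level, or in registration policy is
the open question in this thread.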

Received on Wednesday, 29 July 2009 11:20:28 UTC