- From: Larry Masinter <masinter@adobe.com>
- Date: Thu, 3 Sep 2009 13:56:49 -0700
- To: Erik van der Poel <erikv@google.com>, "Roy T. Fielding" <fielding@gbiv.com>
- CC: "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>
> The problem is that non-Internet domains are not limited to
> ASCII and cannot use IDNA. For example, IRIs that are minted
> inside a WINS-based network within a Russian corporation to
> access its own intranet web site. We use the same software
> to access those sites as we do the global Internet.

> It is a shame that URLs/URIs/IRIs were not designed with multiple name
> resolution protocols in mind.

There's a tradeoff between a compact syntax and expressive power. There's a tradeoff between being ambiguous and thus extensible vs. being specific and locked down. The design goal of "cool URIs don't change" comes up against the design goal of "URIs should be able to identify action unambiguously". If you want a more expressive description of exactly what is supposed to happen when, then perhaps URIs aren't the right protocol element to express that.

> For example:
> http://example.com:12345/
> The "http" tells us to use HTTP. But what is it that tells us to use
> WINS instead of DNS? Trying DNS first and then WINS seems like a hack.
> How long should the implementation wait for the DNS response?

I think if "example.com" means something other than "example.com as looked up in DNS" then you need to communicate that information out of band. "Within a Russian corporation" seems like a domain within which there is a configuration decision to use out-of-band information about using WINS instead of, on top of, before, or after using DNS.

> I don't know what to suggest here...

I would suggest stating that WINS resolution is not mandated by the URI standard, and that services that rely on it are relying on out-of-band information. IPv6 transition seems to be in the same boat. Also SRV-based resolution and several other proposals.

>> Given this situation, I wonder if we could consider the following
>> alternative plans.
>>
>> (1) If the domain name contains pct-encoded non-ASCII, reject the
>> entire URI/IRI. (Do something reasonable with pct-encoded ASCII.)
>

There's no reason to do this, and the likelihood that previous IRI -> URI processors might be deployed makes this poor advice.

>> (2) If the domain name contains pct-encoded non-ASCII, pct-decode it
>> and check for well-formed UTF-8. If it is UTF-8, convert to Punycode.
>> If not, reject the URI/IRI. (Do something reasonable with pct-encoded
>> ASCII.)
>

I like doing something reasonable with pct-encoded ASCII. I wish there were a "NaDN" like NaN, though, so that rather than 'rejecting' it at the URI/IRI processing level, the rejection could happen outside.

> What about domain names in raw non-ASCII?
> I believe the browsers are quite aligned here already. MSIE, Firefox,
> Safari, Chrome and Opera all convert the entire HTML file to Unicode,
> and then convert the domain names to Punycode.
> I have no idea about non-Web apps (such as email).

Worth looking at.
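
For illustration only, plan (2) together with the "NaDN" idea might be sketched roughly as follows. This is a minimal sketch, not anything proposed in the thread: the function name host_to_ascii is hypothetical, None stands in for "NaDN", and the conversion relies on Python's built-in "idna" codec, which implements the IDNA 2003 rules (an assumption about which IDNA version is wanted).

```python
from typing import Optional
from urllib.parse import unquote_to_bytes


def host_to_ascii(host: str) -> Optional[str]:
    """Hypothetical helper sketching plan (2).

    Returns the ASCII (Punycode) form of the host, or None as a
    "NaDN"-style marker so the caller decides how to reject it.
    """
    # Pct-decode to raw bytes; raw non-ASCII characters in the input
    # are encoded as UTF-8 by unquote_to_bytes before decoding.
    raw = unquote_to_bytes(host)
    try:
        # Strict decoding checks that the bytes are well-formed UTF-8.
        decoded = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None  # not UTF-8: "not a domain name"
    try:
        # Assumption: the stdlib "idna" codec (IDNA 2003) is acceptable.
        return decoded.encode("idna").decode("ascii")
    except UnicodeError:
        return None  # e.g. an empty or over-long label
```

With this sketch, pct-encoded ASCII such as "ex%61mple.com" is simply decoded, "b%C3%BCcher.example" is converted to Punycode, and "b%FCcher.example" (not UTF-8) yields None rather than raising at the URI/IRI processing level.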
Received on Thursday, 3 September 2009 22:19:41 UTC