RE: query on iregname conversion

> The problem is that non-Internet domains are not limited to
> ASCII and cannot use IDNA.  For example, IRIs that are minted
> inside a WINS-based network within a Russian corporation to
> access its own intranet web site.  We use the same software
> to access those sites as we do the global Internet.

> It is a shame that URLs/URIs/IRIs were not designed with multiple name
> resolution protocols in mind. 

There's a tradeoff between a compact syntax and expressive
power. There's a tradeoff between being ambiguous and thus
extensible vs. being specific and locked down. The design goal
of "cool URIs don't change" comes up against the design goal of
"URIs should be able to identify an action unambiguously".
If you want a more expressive description of exactly what
is supposed to happen when, then perhaps URIs aren't the
right protocol element to express that.


> For example:

> http://example.com:12345/


> The "http" tells us to use HTTP. But what is it that tells us to use
> WINS instead of DNS? Trying DNS first and then WINS seems like a hack.
> How long should the implementation wait for the DNS response?

I think if "example.com" means something other than "example.com as
looked up in DNS" then you need to communicate that information
out of band.

"Within a Russian corporation" sounds like an administrative
domain in which someone has made a configuration decision,
communicated out of band, to use WINS instead of, on top of,
before, or after DNS.
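
To make that concrete, here is a minimal sketch in Python of resolver
order as site configuration rather than something the URI carries.
Everything in it is an assumption for illustration: the lookup tables
stand in for real DNS and WINS clients, and the names resolve_host,
internet_order and corporate_order are hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence

# A resolver maps a host name to an address string, or None if it has no answer.
Resolver = Callable[[str], Optional[str]]


def make_table_resolver(table: Dict[str, str]) -> Resolver:
    """Stand-in for a real name service (DNS, WINS, ...): a plain lookup table."""
    def resolve(host: str) -> Optional[str]:
        return table.get(host.lower())
    return resolve


def resolve_host(host: str, order: Sequence[str],
                 resolvers: Dict[str, Resolver]) -> Optional[str]:
    """Try each configured name service in turn.

    The order comes from local configuration; nothing in the URI selects it.
    """
    for name in order:
        address = resolvers[name](host)
        if address is not None:
            return address
    return None


# Hypothetical data: the same name can answer differently per service.
resolvers = {
    "dns": make_table_resolver({"example.com": "192.0.2.10"}),
    "wins": make_table_resolver({"example.com": "10.0.0.5",
                                 "intranet": "10.0.0.7"}),
}

# Out-of-band site configuration, not part of the URI.
internet_order = ["dns"]
corporate_order = ["wins", "dns"]
```

The same "http://example.com:12345/" then resolves differently under
internet_order and corporate_order, which is the sense in which the
meaning of the host part depends on information outside the URI.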

> I don't know what to suggest here...

I'd suggest saying that WINS resolution is not mandated by the
URI standard, and that services that rely on it are relying on
out-of-band information.

IPv6 transition seems to be in the same boat, as do SRV-based
resolution and several other proposals.


>> Given this situation, I wonder if we could consider the following
>> alternative plans.
>>
>> (1) If the domain name contains pct-encoded non-ASCII, reject the
>> entire URI/IRI. (Do something reasonable with pct-encoded ASCII.)
>

There's no reason to do this, and the likelihood that
earlier IRI -> URI processors are already deployed makes
this poor advice.


>> (2) If the domain name contains pct-encoded non-ASCII, pct-decode it
>> and check for well-formed UTF-8. If it is UTF-8, convert to Punycode.
>> If not, reject the URI/IRI. (Do something reasonable with pct-encoded
>> ASCII.)
>
I like doing something reasonable with pct-encoded ASCII.
I wish there were a "NaDN" like NaN, though, so that rather than
'rejecting' it at the URI/IRI processing level, the rejection
could happen outside.
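
A rough sketch of plan (2) with that "NaDN" idea, in Python. The
assumptions are mine: NADN and host_to_ascii are hypothetical names,
and Python's built-in "idna" codec (which implements IDNA2003) stands
in for whatever ToASCII step a real processor would use.

```python
from typing import Optional
from urllib.parse import unquote_to_bytes

# Hypothetical "not a domain name" marker, by analogy with NaN.
NADN = None


def host_to_ascii(host: str) -> Optional[str]:
    """Pct-decode a host, require well-formed UTF-8, convert to Punycode.

    Returns NADN instead of raising, so the caller decides where
    (and whether) the rejection actually happens.
    """
    # Raw non-ASCII characters in the input are encoded as UTF-8 bytes.
    raw = unquote_to_bytes(host)
    try:
        text = raw.decode("utf-8")  # strict: ill-formed UTF-8 raises
    except UnicodeDecodeError:
        return NADN
    try:
        # Built-in IDNA2003 codec; pure-ASCII names pass through unchanged.
        return text.encode("idna").decode("ascii")
    except UnicodeError:
        return NADN
```

With this, "b%C3%BCcher.example" comes out as "xn--bcher-kva.example",
"%65xample.com" as "example.com", and "b%FCcher.example" (not UTF-8)
as NADN, with no exception raised at the URI/IRI processing level.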

> What about domain names in raw non-ASCII?

> I believe the browsers are quite aligned here already. MSIE, Firefox,
> Safari, Chrome and Opera all convert the entire HTML file to Unicode,
> and then convert the domain names to Punycode.

> I have no idea about non-Web apps (such as email).

Worth looking at.

Received on Thursday, 3 September 2009 22:19:41 UTC