Re: iDNR, an alternative name resolution protocol

Sam X. Sun (ssun@CNRI.Reston.VA.US)
Wed, 2 Sep 1998 01:44:35 -0400

Message-ID: <003e01bdd634$c7255d00$>
From: "Sam X. Sun" <ssun@CNRI.Reston.VA.US>
To: "Larry Masinter" <>
Cc: "URI distribution list" <uri@Bunyip.Com>
Date: Wed, 2 Sep 1998 01:44:35 -0400
Subject: Re: iDNR, an alternative name resolution protocol

I might have misunderstood what URI syntax governs. So here is the question:

Web browsers (e.g. Netscape or IE) have an edit box where users enter
URLs. In Netscape, it's called "Location:". In IE, it's called "Address:".
Now the question is: does the URL syntax govern how users should enter
their URLs into the edit box, including the encoding used?

>> However, it seems that the URI defined for network protocol may have
>> different set of requirements from URI targeted for human communication.
>Different requirements are placed on URIs by each context, but there is an
>overriding requirement for a single kind of identifier which is useful in
>both contexts.

>Creating a uniform way of encoding typed characters into URIs as they are
>entered has an enormous advantage, in that it is likely to work and to
>allow users who read and write languages that are not ASCII to do so
>independent of 'scheme' of the URI. This seems to be very powerful and
>useful in bringing coherence to the web.

I understand the advantage of having a uniform encoding. The question is
whether it is practical. I'm worried that the "overriding requirement" is
so strict that real-world practice could not follow it. For example, most
native platforms (e.g. Chinese or Japanese) don't have UTF-8 input method
support. So it does not seem practical (at least at the current time) to
require a single encoding for entering URIs.
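
To make the concern concrete, here is a minimal sketch (my own
illustration, not part of any spec) showing that the same typed
characters produce different escaped octets depending on the platform's
native encoding. The helper name percent_encode is hypothetical.

```python
from urllib.parse import quote

def percent_encode(text: str, encoding: str) -> str:
    """Encode text to bytes in the given charset, then %-escape every octet
    that is not an unreserved ASCII character."""
    return quote(text.encode(encoding), safe="")

typed = "中文"  # what the user sees in the edit box
as_utf8 = percent_encode(typed, "utf-8")
as_big5 = percent_encode(typed, "big5")

# The two escaped forms differ, so a receiver cannot compare or decode
# them without knowing which encoding the input method produced.
print(as_utf8)
print(as_big5)
```

A BIG5 input method hands the browser the second form's octets, not the
first, which is why requiring UTF-8 at entry time is hard today.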

A more practical way seems to be to default to UTF-8 encoding, but allow
other encodings to be used to enter the URI (into the web browser's URL
edit box) as long as the encoding is specified. For example, a Chinese
BIG5-encoded URI may thus be entered as "hdl:enc=big5@{big5 string}". It's
not pretty, but it may be more acceptable than totally forbidding any
native encoding from being used to enter the URI?
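
As a rough sketch of how a browser might handle the "enc=" idea above:
the function below takes the raw octets from the edit box, looks for an
optional "enc=NAME@" label after the scheme, and falls back to UTF-8
when there is none. The function name, the error handling, and the
choice to normalize the result to %-escaped UTF-8 are all my
assumptions for illustration; only the "hdl:enc=big5@..." form comes
from the proposal itself.

```python
from urllib.parse import quote

DEFAULT_ENCODING = "utf-8"  # assumed default when no "enc=" label is given

def normalize_typed_uri(raw: bytes) -> str:
    """Turn octets typed as 'scheme:enc=NAME@payload' (or 'scheme:payload')
    into 'scheme:' followed by the payload as %-escaped UTF-8."""
    scheme, colon, rest = raw.partition(b":")
    if not colon:
        raise ValueError("missing scheme")

    encoding = DEFAULT_ENCODING
    if rest.startswith(b"enc="):
        # The label is plain ASCII, so the first '@' ends it even if the
        # native-encoded payload happens to contain a 0x40 octet later on.
        label, at, payload = rest[len(b"enc="):].partition(b"@")
        if not at:
            raise ValueError("'enc=' label is not terminated by '@'")
        encoding, rest = label.decode("ascii"), payload

    text = rest.decode(encoding)  # raises if the octets do not match the label
    # quote() encodes str input as UTF-8 before escaping.
    return scheme.decode("ascii") + ":" + quote(text, safe="/")

# Hypothetical input as a BIG5 input method would deliver it.
typed = b"hdl:enc=big5@" + "中文".encode("big5")
print(normalize_typed_uri(typed))
```

With this reading, a BIG5-labelled entry and the same name typed
directly in UTF-8 end up as the same string, so the label only matters
at the point of entry.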