Re: iDNR, an alternative name resolution protocol from Sam Sun on 1998-09-04 (uri@w3.org from September 1998)

From: Sam Sun <ssun@CNRI.Reston.VA.US>
Date: Fri, 4 Sep 1998 12:37:16 -0400
To: "Larry Masinter" <masinter@parc.xerox.com>, "Martin J. Duerst" <duerst@w3.org>
Cc: "URI distribution list" <uri@Bunyip.Com>
Message-ID: <05d401bdd822$473ed630$1c1e1b0a@ssun.CNRI.Reston.Va.US>

>No, if you're going to update your software, update it to generate UTF-8,
>don't update it to add some encoding-declaration. That is, we _don't_
>want to recommend some new practice that will further the current situation
>where there is no interoperability.
>
>> This allows URI parsers to convert to UTF-8 (or any
>> other encoding used by the protocol) correctly without checking the
>> document context.
>
>A 'URI interpreter' isn't a 'URI parser'. The parsing itself is simple.
>


According to the W3C implementation (http://www.w3.org/Library/), all that the
'URI interpreter' seems to do is 'parse' out the URI reference and hand it to
the protocol-specific 'filters' (see
http://ssun.cnri.reston.va.us/ietf/w3c-libwww-5.1e/Library/User/Using/Filters.html)
to 'interpret'.

For example, any "ftp URL" is handed to HTFTPParseURL() in HTFTP.c, and
HTFTPParseURL() then 'interprets' the "ftp URL" and extracts "uid",
"passwd", etc.
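
To illustrate the division of labor I mean, here is a rough sketch in Python
(not the libwww C code). The names interpret(), parse_ftp_url() and FILTERS are
made up for this example; only the idea of handing the URI to a
protocol-specific filter comes from the library.

```python
from urllib.parse import urlsplit


def parse_ftp_url(url):
    """Hypothetical stand-in for a protocol-specific filter such as HTFTPParseURL()."""
    parts = urlsplit(url)
    return {
        "uid": parts.username,
        "passwd": parts.password,
        "host": parts.hostname,
        "path": parts.path,
    }


# Hypothetical registry: one filter per URI scheme.
FILTERS = {"ftp": parse_ftp_url}


def interpret(uri):
    """Split off the scheme and hand the whole URI to the matching filter."""
    scheme, sep, _ = uri.partition(":")
    if not sep:
        raise ValueError("not an absolute URI: %r" % uri)
    handler = FILTERS.get(scheme.lower())
    if handler is None:
        raise ValueError("no filter registered for scheme %r" % scheme)
    return handler(uri)
```

The generic step only looks at the scheme; everything else, including any
decision about encoding, is left to the filter.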

Because each network protocol does things (including use of encoding)
differently, I don't quite understand why it's necessary for the 'URI
interpreter' to care about the exact encoding.


>> Otherwise, it could be hard for URI parsers to figure out the
>> encoding of any particular URI, especially in multilingual document or on
>> platforms with multiple input methods installed.
>
>The point is that it doesn't need to 'figure it out'.
>
>> For example, the URI in HTML document may be defined as:
>>
>> <uri scheme> ":" [ <encoding> "@" ] <uri scheme specific string>
>>
>> The <encoding> is optional, and is not needed if the <uri scheme specific
>> string> uses UTF-8.
>


>This suggestion would continue to propagate non-interoperability and
>has no migration path.
>

I'm not sure I understand your points here. Could you elaborate? (I assume
we are talking about the "URI" encoding in the HTML document, not what gets
transmitted over the wire.) I thought the suggestion would ENCOURAGE the use
of UTF-8 (because all other encodings require extra typing). In the meantime,
for platforms where UTF-8 is not practical, it defines a mechanism
that will help protocol-specific 'filters' (e.g. HTFTPParseURL() ) to
correctly convert to the encoding used by the protocol.
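
As a rough sketch of how a filter might use the optional <encoding> label,
consider the Python below. It is only an illustration under several
assumptions of mine: the function name to_utf8_uri() is made up, non-ASCII
bytes are assumed to appear as %HH escapes in the declared encoding, and the
label is only recognized when the rest of the URI starts with "//" (so that
something like "mailto:user@host" is not misread as carrying a label).

```python
import re
from urllib.parse import quote, unquote_to_bytes

# <uri scheme> ":" [ <encoding> "@" ] <uri scheme specific string>
# Assumption: the scheme-specific string starts with "//".
_URI = re.compile(
    r"^(?P<scheme>[A-Za-z][A-Za-z0-9+.-]*):"
    r"(?:(?P<enc>[A-Za-z0-9._-]+)@)?"
    r"(?P<rest>//.*)$"
)


def to_utf8_uri(uri):
    """Drop the optional encoding label and re-escape the URI as UTF-8."""
    m = _URI.match(uri)
    if m is None:
        return uri  # not in the form this sketch handles; leave untouched
    encoding = m.group("enc") or "utf-8"  # no label means UTF-8
    raw = unquote_to_bytes(m.group("rest"))  # undo %HH escapes
    text = raw.decode(encoding)  # interpret bytes in the declared encoding
    # quote() escapes non-ASCII characters as UTF-8 by default.
    return m.group("scheme") + ":" + quote(text, safe="/:@?&=+$,;")
```

For example, to_utf8_uri("ftp:iso-8859-1@//host/caf%E9") gives
"ftp://host/caf%C3%A9", while a URI without a label is treated as UTF-8
already. Note that this sketch also unescapes reserved characters, which a
real filter would have to preserve.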


Regards,
Sam

Received on Friday, 4 September 1998 12:45:59 UTC