- From: Sam Sun <ssun@CNRI.Reston.VA.US>
- Date: Mon, 2 Mar 1998 11:48:38 -0500
- To: "Al Gilman" <asgilman@access.digex.net>, <uri@Bunyip.Com>, <urn-ietf@Bunyip.Com>
Al Gilman said:
>......
>In the schemes that the URN community is contemplating, this is
>probably not true. Once one enters a namespace discipline, one
>may not expect interior namespaces to be randomly declared by
>the values found for exterior names.
>

My observation is that a relative URI defines a client-side process for compounding names. Based on the libwww.lib implementation, a relative URI never leaves the client side by itself; it has to be bound to the URI scheme of its base URI before it can be of any use. So, if URI is considered a machine-to-machine protocol syntax, is a relative URI a URI?

The URN working group defined the syntax for identifiers to be transferred over the wire. If I understand correctly, URN syntax is designed mainly as a machine-to-machine protocol syntax. If any relative URN were to be defined, it would mean that the URN service could not be stateless and would have to keep a history of previous transactions in order to construct compound names, which doesn't seem very practical.

This leads to the question of what a URI is. First, an observation: some URI schemes, like “http:” or “urn:”, have a client-side syntax that follows the machine-to-machine protocol syntax. Some other URI schemes don’t. For example, the ftp server will not know to convert %23 to ‘#’, so when you send “ftp:user%23&pass%23word@foo.com”, the ftp server at “foo.com” will not recognize that you are user “user#” entering password “pass#word”. Another example is LDAP, whose protocol uses UTF-8 encoding while its URL syntax seems to follow that of the http URL.

It seems more natural to consider URI as a client-side referral syntax. For any URI “foo:foo-specific-name”, the URI is responsible only for referring “foo-specific-name” to the “foo:” module, and nothing more. Each individual scheme should be allowed to decide how to parse its scheme-specific data and how to process the “#fragment”.
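
The three client-side steps discussed above (resolving a relative reference against its base, handing the scheme-specific part to the scheme's own module, and decoding %XX escapes before anything reaches the server) can be sketched roughly as follows. This is only a minimal illustration, assuming Python's standard urllib.parse module; the host example.org and the helpers split_scheme and decode_ftp_login are hypothetical names, and the '&' separator simply follows the ftp example above rather than any ftp URL specification.

```python
from urllib.parse import urljoin, unquote

# 1. A relative reference is resolved on the client against its base URI;
#    only the resulting absolute URI is ever handed to a protocol module.
base = "http://example.org/docs/guide/intro.html"  # hypothetical base URI
absolute = urljoin(base, "../images/logo.png")


def split_scheme(uri):
    """Split a URI at the first ':' into (scheme, scheme-specific part).

    This is all the generic layer does in the referral view; the rest
    is left to the scheme's own module. (Hypothetical helper.)
    """
    scheme, sep, rest = uri.partition(":")
    if not sep or not scheme:
        raise ValueError("no scheme found in %r" % uri)
    return scheme.lower(), rest


def decode_ftp_login(rest):
    """Decode the login part of the ftp example on the client side.

    Assumes the 'user&password@host' layout used in the example above,
    which is an illustration and not the standard ftp URL layout.
    """
    userinfo, _, host = rest.rpartition("@")
    user, _, password = userinfo.partition("&")
    # %XX escapes are undone here, because the ftp server would not do it.
    return unquote(user), unquote(password), host


scheme, rest = split_scheme("ftp:user%23&pass%23word@foo.com")
login = decode_ftp_login(rest)

print(absolute)
print(scheme, rest)
print(login)
```

The point of the sketch is that every step runs before any bytes go over the wire: the server at foo.com only ever sees the decoded user name and password, never the escaped form.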
Each scheme should be allowed to decide its own set of reserved/excluded characters, its character set encoding, and whether the client-side syntax follows the protocol syntax.

If this is the case, it seems that for URI the only reserved character needed is byte ‘%25’, which is the character % in ASCII encoding, and the only excluded character needed is byte ‘%22’, which is the character " in ASCII encoding. The ‘%25’ is needed to allow non-printable characters to be entered and understood. The ‘%22’ is necessary for separating a URI from its surrounding context.

Also, URI doesn’t have to be constrained to a subset of ASCII characters; each individual URI scheme should be allowed to decide how to support international character sets. Based on what I have seen, the only strong argument for URI being ASCII-only is that it is then printable and can be entered from almost any (not all!) keyboard. These might be nice user interface features for the “http:” URL, but not necessarily for all other URIs. In short, not every document is written to be readable by anyone around the world, nor is it necessary to require _every_ NAME to be printable and enterable by anyone around the world. That should be a decision of the name issuer, not of the underlying technology.

Essentially, I’m suggesting that the uniformity of URI should lie only in its scheme binding syntax, as is commonly accepted in the web context, and should not extend into the scheme-specific content.

Regards,
Sam
ssun@cnri.reston.va.us
Received on Monday, 2 March 1998 12:04:03 UTC