[Fwd: "U stands for Uniform"]

Larry Masinter (masinter@parc.xerox.com)
Wed, 21 Jan 1998 11:37:09 PST

Message-ID: <34C64E65.6C72FA4C@parc.xerox.com>
Date: Wed, 21 Jan 1998 11:37:09 PST
From: Larry Masinter <masinter@parc.xerox.com>
To: uri@Bunyip.Com, urn-ietf@Bunyip.Com
Subject: [Fwd: "U stands for Uniform"]

I couldn't understand why I didn't have any responses to this, but
perhaps mail isn't getting through?

Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Message-ID: <34BC3D46.C636742A@parc.xerox.com>
Date: Tue, 13 Jan 1998 20:21:26 -0800
From: Larry Masinter <masinter@parc.xerox.com>
Organization: Xerox PARC
X-Mailer: Mozilla 4.04 [en] (Win95; U)
MIME-Version: 1.0
To: uri@bunyip.com, urn-ietf@bunyip.com
Subject: "U stands for Uniform"
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

I sent this privately, but I suppose it should go onto the working
group(s) mailing lists. After reconsideration, I am very strongly
in favor of moving forward with draft-fielding-uri-syntax (b), because:

U stands for Uniform.
   Two documents are Not Uniform: they're different!
   Uniform implies one document, one syntax.

If you want multiple documents, you want non-Uniform Resource
Identifiers, because you want different syntax definitions
for different kinds of things.

I believe that (b), complete with "/", "#", and "?", is the best explanation
of scheme-independent URI behavior.

Patrik wrote:

> Leslie, I and some others want to cut "higher"
> up in the inheritance tree of syntax structure, so the URL specific things
> which are not (so far) part of URNs are out of the URI syntax document.

The generic URI document discusses some common syntactic elements that
are (or should be) processed by URI-handling systems independent of whether
those URIs are URLs or URNs or URZs. Those generic elements include "/", "#",
and "?". The generic elements may or may not be appropriate with some schemes,
and may or may not be appropriate for URNs, which are designated by introducing
them with the "urn" scheme. For "mailto", "#" is inappropriate, but "?" is
useful. For "mid", they're all inappropriate. For "data:", "#" might be
appropriate but not "/". And for "urn", the appropriateness of "/", "#",
and "?" is yet to be determined by the URN committee. Just because they're
yet to be determined doesn't mean they're out of scope.
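
As a rough illustration of the paragraph above (not part of either draft),
the per-scheme status of each generic element can be pictured as a lookup
table. The names APPROPRIATE, status, and may_use are hypothetical, and the
three-valued encoding is an assumption: True means the scheme defines the
element, False means it does not, and None means the question is open or
not stated in this message.

```python
# Hypothetical table built only from the examples given in this message.
# True  = the element has a defined meaning for the scheme
# False = the element is not defined for the scheme
# None  = open, undecided, or not stated here
APPROPRIATE = {
    "mailto": {"/": None, "#": False, "?": True},
    "mid": {"/": False, "#": False, "?": False},
    # "#" only "might be appropriate" for data, so it is left open.
    "data": {"/": False, "#": None, "?": None},
    # Yet to be determined by the URN committee.
    "urn": {"/": None, "#": None, "?": None},
}


def status(scheme, element):
    """Return True, False, or None for an element under a scheme."""
    return APPROPRIATE.get(scheme, {}).get(element)


def may_use(scheme, element):
    """An element may be used only when its meaning is defined."""
    return status(scheme, element) is True
```

Here may_use("mailto", "?") is True, while may_use("urn", "/") is False only
because nothing has been defined yet; status("urn", "/") stays None to show
that the question is open rather than closed.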

Something is "appropriate" if it has a defined meaning. If it's not defined,
then you shouldn't use it. If it is defined, then you can. Whether or not it
is defined is not an issue for the syntax, it's an issue for the semantics.
(We should take the word "semantics" out of the title of (b), since
the body of (b) talks entirely about syntax. I am not proposing any
other change to (b) than to change the title.)

If we need to add some wording to (b) to make it completely clear, OK.
It must be absolutely the case that the URN document gets to say whether
or not "#", "/", and "?" are appropriate for URNs, even though those
elements are defined in the generic URI document.
In just the same way, the "mailto" document should say whether or not and
how those elements work with the "mailto" scheme; the "data" document should
define whether or not "#", "?", and "/" work for the "data" scheme.

I don't believe that (b) interferes with the URN committee's ability
to define URNs within the space of URIs, or the ability of the URN committee
to define a new kind of syntactic element which doesn't have the restrictions
of the current URI syntax (as long as we don't call that new thing a URI;
let's call it an EURI or XURI or whatever.) We're not constraining or restricting
development of new kinds of identifiers, we're just letting software developers
have standard specifications that they can be assured won't change out from
under them, and basing that standard on current interoperable implementations.

It's *important* that all URI processing software be assured that it
doesn't have to first look up the scheme before it does syntactic
processing of "#", "?", and "/". We have to make it CLEAR that those
syntactic elements are completely scheme independent, and that the
processing of them can be independent of whether the scheme is really
"urn", which has different rules of semantics.
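
To make the point concrete, here is a minimal sketch of splitting a URI on
the generic delimiters without ever consulting a table of schemes. It is an
illustration only, not the grammar from draft-fielding-uri-syntax: the
function name split_generic is hypothetical, and the sketch assumes an
absolute URI (relative references and escaping are ignored).

```python
def split_generic(uri):
    """Split a URI on "#", "?", ":" and "/" with no scheme lookup."""
    # Fragment: everything after the first "#".
    rest, has_fragment, fragment = uri.partition("#")
    # Query: everything after the first "?" in what remains.
    rest, has_query, query = rest.partition("?")
    # Scheme: everything before the first ":"; the name is only recorded,
    # never used to decide how the rest is split.
    scheme, has_colon, path = rest.partition(":")
    if not has_colon:
        scheme, path = None, rest
    return {
        "scheme": scheme,
        "path": path,
        # Hierarchical segments, separated by "/".
        "segments": path.split("/"),
        "query": query if has_query else None,
        "fragment": fragment if has_fragment else None,
    }
```

The same function handles "http://example.com/a/b?x=1#top",
"mailto:user@example.com?subject=hi", and "urn:example:thing" alike. Whether
the query or fragment it finds means anything is then a question for the
scheme's own document, not for the splitter.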

Hiding the distinction by having two documents, one of which doesn't even
mention those elements, would be WRONG.