Re: comments on draft-eastlake-cturi-01.txt from Roy T. Fielding on 2001-01-22 (uri@w3.org from January 2001)

From: Roy T. Fielding <fielding@ebuilt.com>
Date: Mon, 22 Jan 2001 17:18:24 -0500
To: Graham Klyne <GK@NineByNine.org>
Cc: Dan Connolly <connolly@w3.org>, Larry Masinter <masinter@Adobe.COM>, Aaron Swartz <aswartz@swartzfam.com>, uri@w3.org, Donald Eastlake 3rd <dee3@torque.pothole.com>, Michael Mealling <michaelm@netsol.com>, Ted Hardie <hardie@equinix.com>
Message-ID: <20010122171824.A906@waka.ebuilt.net>
On Mon, Jan 22, 2001 at 10:16:30PM +0000, Graham Klyne wrote:
> At 09:25 PM 1/18/01 -0600, Dan Connolly wrote:
> >Let's not pretend that a different URI scheme
> >will somehow magically provide more stability
> >than http/dns; stability depends on
> >social practices, not just technology.
> 
> I agree.  In which case, it seems reasonable to use a form of URI whose 
> very form carries a clear signal of stable intent.  To my mind, that is the 
> big advantage of using a URN form.

But that is a myth.  The only reason that the URN form appears
to be stable is because almost nobody is using it, and therefore it
hasn't been subjected to the social pressures of useful names.

URIs are names assigned by naming authorities that are identified at
multiple hierarchical spots within the URI syntax, for both URL and URN.
URL schemes typically use the well-established mechanism of DNS for
managing the space of naming authorities, but there is no guarantee that
the name can be resolved into something retrievable (there never has been
such a guarantee, as discussed in my MOMspider paper long ago, nor is one
possible regardless of the syntax or number of URN redirections).
Namespace change is only one of many reasons why resources disappear.

URN schemes have a separate, less-established mechanism for doing the
same thing independent of DNS.  They don't have anything which is more
stable than DNS or less subject to environmental change and corporate
IP issues.  The only significant difference between the two is the
"urn:" prefix and the lack of any standard for (or, rather, the refusal
to allow) short-hand forms like relative URI.  That and the fact that
nobody is willing to pony up the cash to build and manage the
infrastructure needed to support a non-DNS naming-authority allocation
mechanism capable of handling billions of names and guaranteed to stay
in existence longer than my own apps.  Thus, the only useful
URN mechanism is that of GUIDs, which simply waste a few more bytes
in order to prepend "urn:" and lay claim to some special status that
only throw-away names can achieve.

This whole debate is irrelevant.  Protocol tokens are names that
are assigned by some naming authority or left to chaos (x-prefix).
The fact that there exists a naming authority, combined with a
name, implies that there exists a URL which can be constructed
(at invocation-time, if necessary) to form a unique name that can
be imbued with the semantics of any given protocol.  The only
thing that is required is that all parties of the communication
share the same context with which to interpret the tokens such
that they can be resolved into an absolute URI that is the name.

For example, there is a protocol called HTTP/1.1 that has semantics
defined by the IETF and an associated set of tokens.  I therefore
claim that there exists a URI

    p://ietf.org/http/1/1/request/method/GET

with which I can establish unique naming within an application
and interoperate consistently with other applications provided
that I trim off the prefix implied by the context whenever I use
it within a protocol exchange.  The exact absolute URI can even
differ between applications, provided that they all mirror the
same semantic namespace.  HTTP applications get that context from
the Protocol-Version included with every message.

Think about it.  We already do the above internally when an
application is developed.  We look to a naming authority (IETF)
to define a standard protocol (HTTP) with a standard namespace
for use in each semantic context of interoperable communication.
All this does is rephrase that shared context into an imaginary URI.

Now, imagine that rather than assuming a protocol token is always
relative to the context, we change the protocol parser to allow each
value to be a URI reference.  Every protocol element that calls for
a token must then be parsed as a URI reference relative to a base URI
defined by the protocol context (which is, of course, defined by the
protocol-version field given at the beginning of each message and may
itself be an absolute URI).  In other words, the only change to a
standard ascii-based protocol exchange is that certain characters
(":" and "/" in particular) need to have reserved meanings within
the token name syntax.

The effective difference is that when someone wants to define an
extension token value that is not part of the base standard,
they have to include the whole URI or convince the protocol's
naming authority to add it within the base namespace.  The claim
is that this method of extension is more meaningful and less
susceptible to collisions than both the "wait til the working
group gets around to it" and the chaotic (x-prefix) method
that do not work well today.

And all of this is without discussing the possible benefits
of being able to retrieve a description of semantics by performing
a GET on that currently-imaginary URI.

But this is all just a philosophical argument.  There just aren't
many protocols for which the benefit of long token names (let alone
absolute URI) is worth the expense of bits-on-the-wire.

....Roy
Received on Monday, 22 January 2001 20:22:07 UTC