Re: What is at the end of the namespace? from Roy T. Fielding on 2001-11-19 (uri@w3.org from November 2001)

From: Roy T. Fielding <fielding@ebuilt.com>
Date: Mon, 19 Nov 2001 12:49:24 -0800
To: "David G. Durand" <David_Durand@brown.edu>
Cc: uri@w3.org
Message-ID: <20011119124924.F1384@waka.ebuilt.net>
Hi David,

> This has my views as to why URNs were not a waste of time. I've included 
> stuff that you surely know better than I do, but explained more 
> sympathetically. I'm trying to talk to the larger group, not down to you.

Whoa, that's not what I meant at all.  Studying, exploring, discussing and
implementing the wide space of persistent names in general, including URN,
was not a waste of time at all -- it was (and is) great work that will
continue to be built upon over time.  I was talking about the years of
argument over whether it was necessary for the scheme name to indicate
that the identifier was a persistent name in order for something to be
used as a persistent name.

I also feel that departure from the URL hierarchical syntax in the
definition of the "urn" scheme was a waste of time, but we weren't
talking about that.

> The nature of the resource identified is a red herring. The question is 
> what method, if any, is suitable for obtaining the representation of a 
> resource.
> 
> This is the place that URNs, http: URLs, and other URL formats _may_ be 
> seen to differ.

Perhaps, but not in the way you are thinking.  I spent a lot of time trying
to craft an explanation of why the scheme does not represent the actions
to be taken in retrieving a representation, but RFC 2396 was as much as
I could get everyone to agree to.

> The http: scheme is different from the ftp: scheme, although both can serve 
> as a name infrastructure (given social/technical support). The difference 
> between them is that each has a formal, standard definition of how to 
> request a digital representation (message body/file contents) for a given 
> resource. The protocol for FTP is very limited, supporting binary transfer 
> of data, and character conversion. The protocol for http: is very rich, 
> supporting independent data format, and character encoding conversions as 
> well as caching, etc. These schemes differ in their technical 
> infrastructure, but they both provide a mapping from identifiers to data, 
> based on a standard protocol.

That last part is misleading.  HTTP provides a mechanism for communicating
with a server which, in the specific case of an origin server, is capable of
acting as agent on behalf of the naming authority.  HTTP's definition
of "mapping from identifiers to data" is left entirely to the implementation
or desires of that origin server.  The "ftp" URL, on the other hand, defines
a sequence of client-server actions that is fairly specific to directories
and files.  The difference is that an HTTP name resolution process has the
whole URI to play with in one action, and thus can encompass a larger scope
of identifiers.  Also, HTTP proxying was specifically designed with this
purpose in mind, unlike any other protocol at the time.

In any case, these distinctions are only made during the process of name
resolution, which is a process that only gets invoked in certain contexts.

> A user-agent is free in principle to resolve http: URLs in any convenient 
> way. However, if that user agent resolves a URL in a way which returns 
> different results than would be obtained by using HTTP, then that agent can 
> be plausibly said to be broken. A great deal of HTTP 1.1 is devoted to 
> enabling "correct" caching of data by arbitrary programs, within parameters 
> of correctness as set by the server and conveyed by HTTP headers.
> 
> In other words one is free to resolve http: URLs by any means one wants to 
> use, but use of any other method than HTTP is not standardized, and thus is 
> not interoperable between applications. At some future date, there may some 
> "redirection registry" that will resolve old URLs in a canonical way 
> (perhaps by date?).

Yes, and it seems to be a very popular service:

   http://www.archive.org/index.html
   http://web.archive.org/web/*/http://www.ics.uci.edu/~fielding/

There are also a dozen different ways to resolve a URI via caching
mechanisms that don't use HTTP to communicate.

> A lot of the issues raised in the URN debate were raised by people from a 
> library background, and librarians have been devising reference systems for 
> a long time.

A lot of librarians were involved in defining URLs as well.  I have
consistently said that the only way to achieve true persistent names is
to supply the same infrastructure as that provided by libraries.
There is no substitute for their dedication and effort.

> I could counter your ad-hominem argument by saying "I'll listen to the web 
> folks when they've created a URL that was usable after 50 years." I'm not 
> saying that, and the IETF groups didn't either, because it's not 
> productive. Both perspectives have good ideas and techniques to offer.

Sorry, this was completely misdirected.  Librarians were not advocating
any of the things that I was referring to in that message -- the only
people who argued for ONE name system and a new syntax were technologists.
Librarians, for the most part, simply wanted a standard.  Librarians already
use multiple name systems in their catalogues.

> URNs are just names with an agreement in advance that any resolution method 
> that "works" is acceptable, where "works," like the notion of "resource" is 
> a fuzzy one, ultimately defined by human beings.

The process of resolution and the process of naming are two different things.

....Roy
Received on Monday, 19 November 2001 15:54:13 UTC