Re: What is at the end of the namespace? from David G. Durand on 2001-11-17 (uri@w3.org from November 2001)

From: David G. Durand <David_Durand@brown.edu>
Date: Sat, 17 Nov 2001 21:05:27 +0900
To: uri@w3.org
Message-Id: <4.2.0.58.J.20011117210522.061ed310@localhost>
Hi Roy!

This has my views as to why URNs were not a waste of time. I've included 
stuff that you surely know better than I do, but explained more 
sympathetically. I'm trying to talk to the larger group, not down to you.

At 12:40 PM -0800 11/16/01, Roy T. Fielding wrote:
>  > Are you saying that HTTP URLs are also URNs?
>
>No, URNs are only those URI that start with a "urn" scheme.  What I said is
>that HTTP URLs are identifiers, and hence names, and therefore capable of
>being a symbolic replacement for any other identifier, including URNs.

I was involved in the URN stuff for a long time, though peripherally. I 
stopped eventually because the other folks were more than competent to move 
things along without me, and because, like you, I got tired of the endless 
discussions like this. Discussions in which I've seen almost no-one change 
their minds, even about the problem's definition, because of the very wide 
divergences in perspective the people bring to the problem.

However, it's been some years, so here's my take on the issues and positions.

The nature of the resource identified is a red herring. The question is 
what method, if any, is suitable for obtaining the representation of a 
resource.

This is the place that URNs, http: URLs, and other URL formats _may_ be 
seen to differ.

>  > Does that mean that
>  > all of the work being done by the URN WG is for nothing?  Are they
>>  just wasting their time, since we already have HTTP URLs and can
>>  just use those?
>
>I have been saying that for the past eight years.  That doesn't mean it is
>a waste of their time, only that the solution to persistent naming isn't
>obtained merely by changing the scheme name.

It is indeed possible to use any string as a name, and any anme must be 
supported by a social/technical infrastructure that defines its properties 
and utilities.

The http: scheme is different from the ftp: scheme, although both can serve 
as a name infrastructure (given social/technical support). The difference 
between them is that each has a formal, standard definition of how to 
request a digital representation (message body/file contents) for a given 
resource. The protocol for FTP is very limited, supporting binary transfer 
of data, and character conversion. The protocol for http: is very rich, 
supporting independent data format, and character encoding conversions as 
well as caching, etc. These schemes differ in their technical 
infrastructure, but they both provide a mapping from identifiers to data, 
based on a standard protocol.

A user-agent is free in principle to resolve http: URLs in any convenient 
way. However, if that user agent resolves a URL in a way which returns 
different results than would be obtained by using HTTP, then that agent can 
be plausibly said to be broken. A great deal of HTTP 1.1 is devoted to 
enabling "correct" caching of data by arbitrary programs, within parameters 
of correctness as set by the server and conveyed by HTTP headers.

In other words one is free to resolve http: URLs by any means one wants to 
use, but use of any other method than HTTP is not standardized, and thus is 
not interoperable between applications. At some future date, there may some 
"redirection registry" that will resolve old URLs in a canonical way 
(perhaps by date?).

URNs were created to satisfy a different set of needs, and, in consequence, 
make a radically different tradeoff between social and technical 
infrastructure. URNs are specifically intended for names that are intended 
to be _persistent_ and _location independent_.

"Persistent" means that there is no upper bound on the lifetime of a URN -- 
librarians like to think in terms of decades to hundreds of years. To 
guarantee that a name won't be re-used over that kind of timespan is not 
basically a technical issue, but a social one, because our software seems 
almost certain to undergo radical change over that timescale.

"Location independent" means that when you assign a URN, you are naming 
something, but _not_ picking a preferred protocol for fetching it. You can 
commit to persistence with http:, and the w3c advocates this, but currently 
this is rarely done. There are social obstacles, as well, as there's no 
guarantee that you can keep a domain name forever; nor is there a standard 
way to indicate to software that HTTP is _not_ a suitable protocol for 
fetching the resource at the end of an http: URL. And of course, it's 
possible that maintaining a web server will no longer be the preferred way 
to provide a resource, because of software changes.

Of course, resolving names is nice, and there is a network protocol (NAPTR) 
that can be used to turn a URN into a URL -- using any scheme. That's 
great, because it provides a technical infrastructure for making the 
retrieval of URNs easy, _if the owners choose to use it_. The point of a 
URN is to have a scheme that exploicitly warrants the use of arbitrary 
retrieval mechanisms.

There was another comment that I wanted to respond to:

At 1:05 PM -0800 11/16/01, Roy T. Fielding wrote:
>The only chaos I have seen is in the writings of more recent specifications
>that ignore the research and experience of the Web developers in favor of
>their own personal view of an ideal world.  When they implement something
>that works and has the same expressive power as the Web itself, then I will
>take their writings seriously.

A lot of the issues raised in the URN debate were raised by people from a 
library background, and librarians have been devising reference systems for 
a long time.

I could counter your ad-hominem argument by saying "I'll listen to the web 
folks when they've created a URL that was usable after 50 years." I'm not 
saying that, and the IETF groups didn't either, because it's not 
productive. Both perspectives have good ideas and techniques to offer.

URNs are just names with an agreement in advance that any resolution method 
that "works" is acceptable, where "works," like the notion of "resource" is 
a fuzzy one, ultimately defined by human beings.

   -- David
--
-------------------------------
David Durand
Chief Scientist, Scholarly Technology Group
Adjunct Associate Professor, Computer Science.
Brown University
Cell: 401-935-5317
email: David_Durand@brown.edu

commercial .sig:
VP, Software Architecture
ingenta plc
Received on Sunday, 18 November 2001 03:39:33 UTC