Re: Clarifying what a URL identifies (Four Uses of a URL) from Tim Bray on 2003-01-21 (www-tag@w3.org from January 2003)

From: Tim Bray <tbray@textuality.com>
Date: Tue, 21 Jan 2003 15:38:40 -0800
To: David Booth <dbooth@w3.org>
Cc: Michael Mealling <michael@neonym.net>, www-tag@w3.org, "Roy T. Fielding" <fielding@apache.org>, Dan Connolly <connolly@w3.org>, Sandro Hawke <sandro@w3.org>
Message-ID: <3E2DDA00.2080906@textuality.com>

David Booth wrote:

>> . . . .  The Web Architecture has a formalism called a "Resource" 
>> which is the one thing that corresponds to each URI.  . . . .
> 
> I find the word "resource" to be ambiguous.  I understand what I mean by 
> saying that a URI denotes a "name" or a "concept" or a "Web location" or 
> a "document instance", but I don't understand what it means to say that 
> a URI denotes a "resource".

I suggest you read RFC2396 and the Webarch draft.  When I say a 
formalism I mean formalism.   A resource is per RFC2396 "anything that 
has identity" and a URI is that which identifies a resource.

A resource, thus defined, has access mechanisms whereby you can retrieve 
and update representations.  This formalism is complete, consistent, and 
highly robust in practice, underlying the construction of the most 
successful information system in history.

I admire your chutzpah in charging in here and making claims about the 
undefinedness of the term "Resource", but that doesn't mean you're 
anything but hopelessly wrong.

You go on to observe correctly that once you step outside the formalism, 
a resource can in fact be all sorts of different things, and that it 
would help if we had a way to talk about what kind of thing it is.  I 
agree with all of that.  However, the web architecture as it stands 
works just fine without being able to talk about what any particular 
resource "is" aside from "that which is identified by its particular URI".

> If a URI denotes only one thing (called a "resource"),  then which of 
> those four things (name, concept, Web location, or document instance) 
> does "http://x.org/love" denote?  If your answer is "it depends", then 
> it seems to me that the meaning of a URI is determined by context.  In 
> that case, the statement that "a URI corresponds to one resource" seems 
> no more helpful than saying "a URI corresponds to one URI".  I.e., it 
> doesn't give me any greater understanding of the situation.

In the Web Architecture formalism, http://x.org/love identifies only one 
resource.  In the real world, I can learn about that resource by 
retrieving representations of it (if any are available), and more by 
processing RDF assertions about it (if any are available).  The Web 
architecture doesn't talk about meanings, it talks about resources and 
representations.  There's nothing wrong with talking about meaning, and 
I look forward to the day when I can reliably retrieve some RDF 
assertions and learn that this particular URI identifies nothing but a 
JPG of a cute cat, and this other one identifies the inner thought of a 
drug-addled conceptual artist.  This would be good and useful.

I find your proposed taxonomy of the kinds of things that a resource 
might be (name, concept, Web location, or document instance) to be 
incomplete: the universe of resources already includes physical robots 
and other devices that you can control, and then there are streaming 
resources.  Also, you may be comfortable with sweeping resources as 
varied as Dan's car, the W3C, and an XML namespace under a rug labeled 
"concept", but I'm not.  I don't think we're nearly ready to cook the 
general semantics of resources into Web Architecture; among other 
things, I haven't seen working, scalable software to give an existence 
proof that any particular approach to this is sound.

I agree with your observation that a URI can serve more than one of the 
functions in your taxonomy, and that it might be useful to have a way to 
say "when http://www.w3.org is used in context XX, it is being used only 
as a name".

> On the other hand, if you say that your notion of "resource" always 
> corresponds to my notion of "concept" (for example), then I think I 
> understand.

I don't have a notion of resource.  A resource is what RFC2396 says it 
is, and as a programmer I'm working with that definition and the other 
useful specifications that grow out of it.

> URLs *are* used in conjunction with denoting (at least) four kinds of 
> things.  That's reality.  I say "in conjunction with" because the 
> question of whether URLs can *directly* denote more than one of these 
> four kinds of things depends on your viewpoint.  If you take the 
> "different names for different uses" approach that I described, then a 
> URL denotes only one of these four things, and the TAG had better 
> clarify which one it is!  On the other hand, if you take the "different 
> context for different uses" approach, then the context indicates which 
> of the four things is denoted, and the TAG does not necessarily have to 
> say how that context should be indicated.  (Though it would be helpful 
> to have standard conventions.)

At the moment, speaking for myself, my impression is that the TAG has no 
intention of saying anything beyond what's in 2396 and the Webarch draft.

The reason I'm willing to put so much energy into this is that I 
agonized for a long time over the fact that in reality URIs identify 
lots of different kinds of things and everybody was ignoring this 
elephant in the room.  Weirdly enough, this angst never got in the way 
of my building spiders and search engines and visual maps of webspace 
and all sorts of other useful things.  It is quite possible that the Web 
Architecture works *because* it sidesteps the problems of meaning, which 
historically have been intractable, and only deals with comparing 
identifiers and shuffling representations around.
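
A minimal sketch of that point in Python, offered as an illustration 
rather than as any real spider's design: a toy index that only compares 
identifiers and stores representations, and never asks what a resource 
"is".  The names normalize and RepresentationStore are hypothetical, and 
the normalization is deliberately limited to lowercasing the scheme and 
host (it assumes the URI carries no userinfo).

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(uri: str) -> str:
    """Lowercase the scheme and host, the case-insensitive parts of a URI.

    Assumption: the authority has no userinfo, so lowercasing it is safe.
    """
    parts = urlsplit(uri)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path, parts.query, parts.fragment))


class RepresentationStore:
    """Hypothetical index keyed by URI; it holds representations only."""

    def __init__(self):
        self._by_uri = {}

    def put(self, uri: str, media_type: str, body: bytes) -> None:
        # Store whatever bytes came back; no model of what the resource is.
        self._by_uri[normalize(uri)] = (media_type, body)

    def get(self, uri: str):
        # Returns (media_type, body), or None if nothing was stored.
        return self._by_uri.get(normalize(uri))

    def same_identifier(self, a: str, b: str) -> bool:
        # Identifier comparison is all the store ever does with meaning.
        return normalize(a) == normalize(b)
```

Nothing in this sketch records whether http://x.org/love is a name, a 
concept, a Web location, or a document instance, yet storing, looking up, 
and comparing all still work.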

> As far as I can tell, either approach can work for the Semantic Web 
> (provided every Semantic Web language clearly indicates how such context 
> should be indicated). 

I think that if the Semantic Web is going to get anywhere, it's going to 
have to deal with the Web as it is, and the Web as it is just doesn't 
know or care what a resource "is".  Furthermore, I'm not convinced that 
the semantics of resources are anything like a usably-simple function of 
the context in which a URI is used.  Furthermore, I am convinced that 
any attempt to use technology or W3C Recommendations to banish the demon 
of ambiguity from the use of identifiers is a chimera and a waste of 
time.  -Tim
Received on Tuesday, 21 January 2003 18:38:45 UTC