Re: Clarifying what a URL identifies (What Part of 'Resource' Don't I Understand?) from David Booth on 2003-01-23 (www-tag@w3.org from January 2003)

From: David Booth <dbooth@w3.org>
Date: Thu, 23 Jan 2003 04:05:10 -0500
To: Tim Bray <tbray@textuality.com>
Cc: www-tag@w3.org
Message-Id: <5.1.0.14.2.20030123004733.0293b200@localhost>

Tim,

Thank you for your detailed responses.  I certainly sympathize with the 
experts' frustration in dealing with "newbie" questions, and I wish I could 
help by offering other "newbies" clearer explanations of things like 
"resource" than currently exist in RFC2396 (so that the experts wouldn't 
have to be pestered with such questions), but until I can understand them 
better, I can't do that.

I agree with *almost* all of what you've said.   Comments are interspersed 
below.

At 03:38 PM 1/21/2003 -0800, Tim Bray wrote:
>David Booth wrote:
>>  . . . I don't understand what it means to say that a URI denotes a 
>> "resource".
>
>I suggest you read RFC2396 and the Webarch draft.

You can bet I have.  I've carefully analyzed the definition of "resource" 
in RFC2396, and I can declare with complete confidence: that definition is 
hopelessly confusing.  For my (painfully) detailed analysis, see:

         "What Part of 'Resource' Don't I Understand?" at
         http://www.w3.org/2002/11/dbooth-names/dbooth-rfc2396-analysis_clean.htm

>When I say a formalism I mean formalism.   A resource is per RFC2396 
>"anything that has identity" and a URI is that which identifies a resource.
>
>A resource, thus defined, has access mechanisms whereby you can retrieve 
>and update representations.  This formalism is complete, consistent, and 
>highly robust in practice, underlying the construction of the most 
>successful information system in history.
>
>I admire your chutzpah in charging in here and making claims about the 
>undefinedness of the term "Resource" but that doesn't mean you're anything 
>but hopelessly wrong.

Well, the concept of a "resource" may be a perfectly well-defined notion 
(in a mathematical sense) in the formalism that you mention.  However, the 
term "resource" definitely is NOT very clearly defined (in an expositional 
sense) in the RFC2396 document.  This is not necessarily the TAG's problem, 
but it clearly is a fact of life that contributes to this festering issue.

>You go on to observe correctly that once you step outside the formalism, a 
>resource can in fact be all sorts of different things, and that it would 
>help if we had a way to talk about what kind of thing it is.  I agree with 
>all of that.  However, the web architecture as it stands works just fine 
>without being able to talk about what any particular resource "is" aside 
>from "that which is identified by its particular URI".

Mostly agreed.  For the human-oriented Web, it isn't a problem.  For the 
Semantic Web, it *will* be a problem if people try to take the 
name-oriented approach[1] to identifying things.  But it *won't* be a 
problem if people take the context-oriented approach[2] to identifying 
things.  I think the context-oriented approach is more sensible, so we're 
in (violent?) agreement about this.

>In the Web Architecture formalism, http://x.org/love identifies only one 
>resource.  In the real world, I can learn about that resource by 
>retrieving representations of it (if any are available), and more by 
>processing RDF assertions about it (if any are available).  The Web 
>architecture doesn't talk about meanings, it talks about resources and 
>representations.  There's nothing wrong with talking about meaning, and I 
>look forward to the day when I can reliably retrieve some RDF assertions 
>and learn that this particular URI identifies nothing but a JPG of a cute 
>cat, and this other one identifies the inner thought of a drug-addled 
>conceptual artist.  This would be good and useful.

Good.

>I find your proposed taxonomy of the kinds of things that a resource 
>might be (name, concept, Web location, or document instance) to be incomplete

Hmm, two responses here:
1. Yes, that taxonomy only applies when you're trying to use a URL to 
identify an abstract concept, such as using http://x.org/love to identify a 
particular concept of love.  The term "concept" should be broadened if it 
is also going to include physical objects that do not exist on the 
Web.  And I haven't yet figured out whether it's sensible to broaden it to 
also cover things that *are* on the Web.

2. But there may be a more subtle miscommunication here.  My taxonomy is 
not trying to be a taxonomy of the universe.  Assuming I'm using a URL to 
identify something in the universe such as an abstract concept, my 
"name/concept/Web location/document instance" taxonomy is only for 
classifying the related things that one might commonly wish to identify 
using that same URL.

>  - the universe of resources already includes physical robots and other 
> devices that you can control, then there are streaming resources; also 
> you may be comfortable with sweeping resources as varied as Dan's car, the 
> W3C, and an XML namespace under a rug labeled "concept" but I'm not.

I agree, a different term would be appropriate.

>I don't think we're nearly ready to cook the general semantics of 
>resources into Web Architecture; among other things, I haven't seen 
>working, scalable software to give an existence proof that any particular 
>approach to this is sound.

I agree.

>I agree with your observation that a URI can serve more than one of the 
>functions in your taxonomy, and that it might be useful to have a way to 
>say "when http://www.w3.org is used in content XX, it is being used only 
>as a name".

Yes.

>I don't have a notion of resource.  A resource is what RFC2396 says it is, 
>and as a programmer I'm working with that definition and the other useful 
>specifications that grow out of it.

Yes, I meant the notion of resource that is defined in RFC2396.

>At the moment, speaking for myself, my impression is that the TAG has no 
>intention of saying anything beyond what's in 2396 and the Webarch draft.

Fine.  It would help me (and others) to have a clearer definition of 
"resource" than what's in RFC2396, but that may not be the TAG's 
responsibility to provide.  Heck, I'll be glad to provide one myself if I 
can figure it out!

I do think the TAG should resolve the httpRange-14 issue, but it isn't yet 
clear to me how it should be resolved.

>The reason I'm willing to put so much energy into this is that I agonized 
>for a long time over the fact that in reality URIs identify lots of 
>different kinds of things and everybody was ignoring this elephant in the 
>room.  Weirdly enough, this angst never got in the way of my building 
>spiders and search engines and visual maps of webspace and all sorts of 
>other useful things.  It is quite possible that the Web Architecture works 
>*because* it works around the intractable problems of meaning and only 
>deals with comparing identifiers and shuffling representations around, 
>avoiding a lot of problems that historically have been intractable.

I agree!

>I think that if the Semantic Web is going to get anywhere, it's going to 
>have to deal with the Web as it is, and the Web as it is just doesn't know 
>or care what a resource "is".

I agree.

>Furthermore, I'm not convinced that the semantics of resources are 
>anything like a usably-simple function of the context in which a URI is used.

That's an interesting thought.  I'm curious to understand what you mean.

>Furthermore, I am convinced that any attempt to use technology or W3C 
>Recommendations to banish the demon of ambiguity from the use of 
>identifiers is a chimera and a waste of time.  -Tim

I agree.

Again, thanks for your very thoughtful comments.

1. http://www.w3.org/2002/11/dbooth-names/dbooth-names_clean.htm#DifferentNames
2. http://www.w3.org/2002/11/dbooth-names/dbooth-names_clean.htm#DifferentContext


-- 
David Booth
W3C Fellow / Hewlett-Packard
Telephone: +1.617.253.1273
Received on Thursday, 23 January 2003 04:06:07 UTC