Re: [httpRange-14] What do HTTP URIs Identify? from Tim Berners-Lee on 2002-07-31 (www-tag@w3.org from July 2002)

From: Tim Berners-Lee <timbl@w3.org>
Date: Wed, 31 Jul 2002 16:35:28 -0400
To: "Aaron Swartz" <me@aaronsw.com>
Cc: <www-tag@w3.org>
Message-ID: <003c01c238d3$92593550$0301a8c0@w3.org>
----- Original Message -----
From: "Aaron Swartz" <me@aaronsw.com>
To: "Tim Berners-Lee" <timbl@w3.org>
Cc: <www-tag@w3.org>
Sent: Monday, July 29, 2002 2:12 PM
Subject: [httpRange-14] What do HTTP URIs Identify?


> I got some bits upon dereferencing
> http://www.w3.org/DesignIssues/HTTP-URI that claim to have been written
> by TimBL. Some of them looked like:
> > The document itself is an important part of society - to dismiss its
> > existence is to prevent us being aware of human and aspects of
> > information without which we are impoverished.
>
> This seems to be the main problem with your argument. By claiming that
> HTTP URIs can represent abstract things, we are not dismissing the
> document! HTTP URIs can identify documents perfectly well, if you please.

Well, that leaves you with several choices, which I categorize,
and you have to say which path you take, or you are just arguing by
assertion.


> > If we stick with the principle that a URI (or URIref) must
> > unambiguously identify the same thing in any context, then we come to
> > the conclusion that URIs can not identify the web page. If a web page
> > is about a car, then the URI can't be used to refer to the web page.
>
> Ummm... huh? Just because a URI can identify a car doesn't mean it can't
> identify a web page.

Are you talking about the same URI? If so, I am surprised, as I think of
web pages and cars as being different. But if so, say, and we'll go on
from there. (Or see HTTP-URI.html section 2.1.1)

> If you want to know what the URI identifies you
> need to ask the publisher. If it identifies a car, then use RDF to say
> something like "the HTML representation I got from <URI> on <DATE>" --
> [ is html:representation of <URI>; dc:date "[DATE]" ].

Ok, you have taken the path 2.2, it would seem.

> In your view, URIs can't identify donkeys.

On the contrary, URIs can.  It is only some URI schemes which are defined
on particular classes of object.  Notably,

mailto: is defined on rfc822 email endpoints
ftp: is defined on files (octet or line streams) and directories
http: is defined on documents which may or may not also
   be POST endpoints and/or webdav documents.
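
As a rough illustration of the list above, the scheme-to-class mapping
can be sketched as a lookup table. The names SCHEME_DOMAIN and domain_of
are hypothetical, and the descriptions are copied from the list rather
than from any specification. A scheme missing from the table is treated
here as placing no restriction on what it identifies, which is an
assumption of this sketch.

```python
from urllib.parse import urlsplit

# Hypothetical table: the class of object each scheme is defined on,
# as listed in the message above. Not taken from any specification.
SCHEME_DOMAIN = {
    "mailto": "rfc822 email endpoint",
    "ftp": "file or directory",
    "http": "document (possibly also a POST endpoint or WebDAV document)",
}

def domain_of(uri):
    """Return the class of object the URI's scheme is defined on,
    or None if this sketch knows of no restriction for the scheme."""
    scheme = urlsplit(uri).scheme.lower()
    return SCHEME_DOMAIN.get(scheme)
```

Under this sketch an http: URI always maps to a document, while a
uuid: URI is left unrestricted and so could name a donkey.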


> I think we all agree donkeys
> are very important and I think it's awful you're trying to write them
> out of the Web Architecture. I'm going to call up the People for the
> Ethical Identification of Animals.

The uuid: schemes are pretty good for donkeys.

> > The problem with this is that there are a large number of systems which
> > already do use URIs to identify the document. This is the whole
> > metadata world.
>
> Well, most of this metadata is written by the creators of the page. If
> they say the page identifies a document and that document has a dc:title
> or an rss:title of such then that's fine with me. I'm not saying HTTP
> URIs *have* to represent donkeys, merely that they can. I don't see a
> problem here.

I have described the problems in the document, but in summary two of
them are: if the HTTP URI identifies the donkey, then what identifies
the web page? And, if URIs can identify web pages or donkeys, how do we
tell which for a given URI?

> > * The HTTP headers
>
> Most HTTP headers, as I believe I pointed out in the Expires: example,
> apply to the entity or representation, not the Resource. I believe those
> that apply to the Resource work just fine with it being a donkey.

See section 2.4.  This is the two-level approach to
dividing properties into those which refer to the web page
and those which refer to the donkey.

> > You can argue that a web page indirectly identifies something, of
> > course, and I am quite happy with that.
>
> And, as I show above, you can indirectly talk about the web page using
> the URI to identify the thing. This works fine and requires less
> constraints on web architecture than your proposal.


Tricky, though, because many web pages can be representations
of the same thing. How many bytes, for example, are there in "the"
page which identifies Roy Fielding? There are many such pages.
It would require a constraint on the web architecture that
every concept had only one URI, which breaks the principle that
anyone can say anything about anything.  Apart from that it would work,
I think. (I deal with the opposite way of turning things around in a new
section 2.8.)

But you don't have to refer to a web page as "that page which
represents the concept <x>".  You can introduce the relationship
between a URI string and its web page as an RDF property
directly, just separate from the identity property.
That would work too.

In fact, we could resolve the whole thing and say that
neither mapping is fundamental, remove the axiom that URIs
"identify" things, and introduce two new relations
"pointsToOnTheWeb" and "identifiesInTheSemanticWeb".
This is an architectural schism between WWW and semantic
web which I had been hoping not to have. The WWW lookup
function becomes a foreign function in semantic web space
(and vice-versa).
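
A minimal sketch of that two-relation reading, assuming made-up data:
the same URI string is a key in two independent mappings, and neither
lookup can be derived from the other. The dictionary and function names
mirror the two relations named above; the example.org URI and both
values are hypothetical.

```python
# Two independent relations over the same URI strings.
# The sample entries are invented for illustration only.
points_to_on_the_web = {
    "http://example.org/donkey": "web page: 'All about my donkey'",
}
identifies_in_the_semantic_web = {
    "http://example.org/donkey": "the donkey itself",
}

def web_lookup(uri):
    """WWW lookup: the document the string dereferences to, if any."""
    return points_to_on_the_web.get(uri)

def semantic_lookup(uri):
    """Semantic Web lookup: the thing the string denotes, if any."""
    return identifies_in_the_semantic_web.get(uri)
```

From inside either space, the other lookup is just an external
function on strings, which is the schism described above.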

Cwm's log:semantics would then be a level-breaker, in that
it would mean "the logical meaning found by parsing the
content of the document whose web address is the
same string as the identifier of x".


> > Conclusion so far: the idea that a URI identifies the thing the
> > document is about doesn't work because we can only use a URI to
> > identify one thing and we have and already do use it to identify
> > documents on the web.
>
> Again, you are confusing the ability for a web page to identify a donkey
> with the requirement that it does. I would argue that the web page only
> identifies the donkey if one was careful to state that it did. An
> example of such a page is: http://logicerror.com/myWeavingTheWeb

So what, sir, is the algorithm for determining that no one had carefully
stated that a web page was a donkey?

> I think adding the new Repr-Type and Resource-Type headers that Sean B.
> Palmer (I think) proposed would be helpful for this sort of thing. That
> way I could make clear that the page was a physical book in an HTTP
> sense.
>
> Hm, maybe you cover this in view 2.2... Let's try that.
>
> > I read a web page, I like it and I am going to annotate it as being a
> > great one -- but first I have to find out whether the URI my browser is
> > used, conceptually by the author of the page, to represent some
> > abstract idea? Before I recommend the Vietnam War page, I have to be
> > careful I am not recommending the Vietnam War.
>
> This is easy to solve. Simply use the abstraction layer I talked about
> above. It's good to put in a date too because on the practical Web,
> pages go bad and change. That Vietnam War page may be bought out by
> domain squatters and turn into John's Porn Casino Fun Search Engine
> Start Page With New Improved Pop-up Windows, which you probably didn't
> mean to recommend.

When you say
[ is html:representation of <URI>; dc:date "[DATE]" ].
you have a problem because I can replace <URI> with any
other symbol which is equivalent.  As all the URIs representing
Roy Fielding are equivalent, I can use any one.
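
The substitution problem can be sketched with made-up data. Assume two
hypothetical URIs that both identify the same person, each served by a
page of a different size. If the URIs are interchangeable names for the
person, then "the HTML representation of <URI>" is really a description
keyed on the person, and it has more than one answer.

```python
# Invented data: two URIs identifying the same person, each with its
# own page. The URIs and byte counts are illustrative only.
identifies = {
    "http://example.org/a/roy": "Roy Fielding",
    "http://example.net/b/roy": "Roy Fielding",
}
page_bytes = {
    "http://example.org/a/roy": 1200,
    "http://example.net/b/roy": 5400,
}

def representation_sizes(thing):
    """Sizes of every page whose URI identifies `thing`, sorted."""
    return sorted(
        size for uri, size in page_bytes.items() if identifies[uri] == thing
    )

sizes = representation_sizes("Roy Fielding")
```

Here sizes holds two values, so asking how many bytes are in "the"
page for Roy Fielding has no single answer.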


> I don't believe in 2.3, 2.4, 2.5, 2.6 or 2.7

Good

>  and the last few sound sort
> of like straw man arguments.
>
> > Secondly, the HTTP protocol actually does have methods of retrieving
> > parts of a large document.
>
> Only if you know the byte locations of the parts you want, which is
> extremely unlikely, especially with a changing document.
>
> Here are a few FAQs for you to answer:

Thanks for your input. I have put them in the document in the FAQ section.

> Q: Can you point to something in the spec that says HTTP URIs must
> identify a document? Isn't it a little weird to start making
> pronouncements about the entire HTTP Web when neither the spec nor the
> other TAG members agree?
>
> Q: Why do we need to use URI-refs to identify abstract concepts in a
> protocol where we can get more information about them? I thought URIs
> were doing just fine. If we have to resort to UUIDs to identify things,
> I'll get annoyed because I won't be able to put them in my browser.
>
> Q: How can you say that the Semantic Web can use the hash mark to make a
> URI-ref identify anything when the URI RFC is very clear that hash marks
> only work when you dereference the document. Are all Semantic Web agents
> going to start dereferencing every document they hear about? Isn't the
> Semantic Web broken if we have to start disagreeing with major
> specifications like this?
>

Tim BL

> --
> Aaron [http://www.aaronsw.com] 4FAC4838B7D8D13FA6D92EDB4145521E79F0DF4B
>
Received on Wednesday, 31 July 2002 16:48:16 UTC