[httpRange-14] What do HTTP URIs Identify? from Aaron Swartz on 2002-07-29 (www-tag@w3.org from July 2002)

From: Aaron Swartz <me@aaronsw.com>
Date: Mon, 29 Jul 2002 11:12:26 -0700
To: Tim Berners-Lee <timbl@w3.org>
Cc: www-tag@w3.org
Message-Id: <BCFE356A-A31E-11D6-AF33-0003936780B2@aaronsw.com>

I got some bits upon dereferencing 
http://www.w3.org/DesignIssues/HTTP-URI that claim to have been written 
by TimBL. Some of them looked like:
> The document itself is an important part of society - to dismiss its 
> existence is to prevent us being aware of human and aspects of 
> information without which we are impoverished.

This seems to be the main problem with your argument. By claiming that 
HTTP URIs can represent abstract things, we are not dismissing the 
document! HTTP URIs can identify documents perfectly well, if you please.

> If we stick with the principle that a URI (or URIref) must 
> unambiguously identify the same thing in any context, then we come to 
> the conclusion that URIs can not identify the web page. If a web page 
> is about a car, then the URI can't be used to refer to the web page.

Ummm... huh? Just because a URI can identify a car doesn't mean it can't 
identify a web page. If you want to know what the URI identifies you 
need to ask the publisher. If it identifies a car, then use RDF to say 
something like "the HTML representation I got from <URI> on <DATE>" -- 
[ is html:representation of <URI>; dc:date "[DATE]" ].
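
The indirection above can be sketched in a few lines of Python. This is 
only an illustration under assumptions: triples are plain tuples rather 
than a real RDF store, and the node label "_:page", the URI 
http://example.org/car and the property ex:rating are made up for the 
example. Only html:representation and dc:date come from the expression 
above.

```python
def representation_triples(uri, date, node="_:page"):
    """Triples for 'the HTML representation retrieved from uri on date'.

    In N3, "[ is P of O ]" reads as "O P [ ]", so the URI is the subject
    of html:representation and the blank node stands for the page.
    """
    return [
        (uri, "html:representation", node),
        (node, "dc:date", date),
    ]


car = "http://example.org/car"  # hypothetical URI that identifies a car
triples = representation_triples(car, "2002-07-29")

# A statement about the page is attached to the blank node, not to the
# car. ex:rating is a hypothetical property used only for this sketch.
triples.append(("_:page", "ex:rating", "great"))

for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

The rating lands on the dated representation, so nothing in the graph 
says the car itself is great.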

In your view, URIs can't identify donkeys. I think we all agree donkeys 
are very important and I think it's awful you're trying to write them 
out of the Web Architecture. I'm going to call up the People for the 
Ethical Identification of Animals.

> The problem with this is that there are a large number of systems which 
> already do use URIs to identify the document. This is the whole 
> metadata world.

Well, most of this metadata is written by the creators of the page. If 
they say the page identifies a document and that document has a dc:title 
or an rss:title of such-and-such, then that's fine with me. I'm not saying HTTP 
URIs *have* to represent donkeys, merely that they can. I don't see a 
problem here.

> * The HTTP headers

Most HTTP headers, as I believe I pointed out in the Expires: example, 
apply to the entity or representation, not the Resource. I believe those 
that apply to the Resource work just fine with it being a donkey.

> You can argue that a web page indirectly identifies something, of 
> course, and I am quite happy with that.

And, as I show above, you can indirectly talk about the web page using 
the URI to identify the thing. This works fine and requires fewer 
constraints on web architecture than your proposal.

> Conclusion so far: the idea that a URI identifies the thing the 
> document is about doesn't work because we can only use a URI to 
> identify one thing and we have and already do use it to identify 
> documents on the web.

Again, you are confusing the ability of a web page to identify a donkey 
with the requirement that it does. I would argue that the web page only 
identifies the donkey if one was careful to state that it did. An 
example of such a page is: http://logicerror.com/myWeavingTheWeb

I think adding the new Repr-Type and Resource-Type headers that Sean B. 
Palmer (I think) proposed would be helpful for this sort of thing. That 
way I could make clear that the page was a physical book in an HTTP 
sense.

Hm, maybe you cover this in view 2.2... Let's try that.

> I read a web page, I like it and I am going to annotate it as being a 
> great one -- but first I have to find out whether the URI my browser is 
> used, conceptually by the author of the page, to represent some 
> abstract idea? Before I recommend the Vietnam War page, I have to be 
> careful I am not recommending the Vietnam War.

This is easy to solve. Simply use the abstraction layer I talked about 
above. It's good to put in a date too because on the practical Web, 
pages go bad and change. That Vietnam War page may be bought out by 
domain squatters and turn into John's Porn Casino Fun Search Engine 
Start Page With New Improved Pop-up Windows, which you probably didn't 
mean to recommend.

I don't believe in 2.3, 2.4, 2.5, 2.6 or 2.7 and the last few sound sort 
of like straw man arguments.

> Secondly, the HTTP protocol actually does have methods of retrieving 
> parts of a large document.

Only if you know the byte locations of the parts you want, which is 
extremely unlikely, especially with a changing document.
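
A rough sketch of why byte locations are fragile, assuming a request of 
the form "Range: bytes=first-last" with both ends inclusive. The two 
document bodies and the helper name byte_range are invented for the 
illustration, and no network request is made.

```python
def byte_range(body: bytes, first: int, last: int) -> bytes:
    """Return the bytes that 'Range: bytes=first-last' would select."""
    return body[first:last + 1]


# Two hypothetical versions of the same page; only the heading changed.
old = b"<h1>Intro</h1><p>Donkeys matter.</p>"
new = b"<h1>Introduction</h1><p>Donkeys matter.</p>"

# Offsets of the paragraph, worked out against the old version.
first = old.index(b"<p>")
last = len(old) - 1

print(byte_range(old, first, last))  # the paragraph, as intended
print(byte_range(new, first, last))  # same offsets, different bytes
```

The same offsets select the whole paragraph in the old version but cut 
across the heading and paragraph in the new one.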

Here are a few FAQs for you to answer:

Q: Can you point to something in the spec that says HTTP URIs must 
identify a document? Isn't it a little weird to start making 
pronouncements about the entire HTTP Web when neither the spec nor the 
other TAG members agree?

Q: Why do we need to use URI-refs to identify abstract concepts in a 
protocol where we can get more information about them? I thought URIs 
were doing just fine. If we have to resort to UUIDs to identify things, 
I'll get annoyed because I won't be able to put them in my browser.

Q: How can you say that the Semantic Web can use the hash mark to make a 
URI-ref identify anything when the URI RFC is very clear that hash marks 
only work when you dereference the document? Are all Semantic Web agents 
going to start dereferencing every document they hear about? Isn't the 
Semantic Web broken if we have to start disagreeing with major 
specifications like this?
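
To make the hash-mark question concrete, here is a small sketch using 
urldefrag from Python's standard urllib.parse module. The example.org 
URI-refs are hypothetical. It shows only that the fragment is split off 
from the part that gets dereferenced, so two different URI-refs can lead 
to one retrievable document.

```python
from urllib.parse import urldefrag

# Hypothetical URI-refs naming two different things in one document.
refs = [
    "http://example.org/animals#donkey",
    "http://example.org/animals#mule",
]

for ref in refs:
    # urldefrag separates the retrievable URI from the fragment.
    url, fragment = urldefrag(ref)
    print(f"dereference {url!r}, then interpret fragment {fragment!r}")

documents = {urldefrag(ref).url for ref in refs}
print(len(documents))  # number of distinct documents to retrieve
```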

--
Aaron [http://www.aaronsw.com] 4FAC4838B7D8D13FA6D92EDB4145521E79F0DF4B
Received on Monday, 29 July 2002 14:12:24 UTC