Re: 200 response as conclusive evidence of an information resource from Pat Hayes on 2008-12-05 (public-awwsw@w3.org from December 2008)

From: Pat Hayes <phayes@ihmc.us>
Date: Thu, 4 Dec 2008 18:17:26 -0600
To: "Booth, David (HP Software - Boston)" <dbooth@hp.com>
Cc: Jonathan Rees <jar@creativecommons.org>, "public-awwsw@w3.org" <public-awwsw@w3.org>
Message-Id: <3E61C3D1-02AE-4488-A210-4F77189A4A65@ihmc.us>
On Dec 4, 2008, at 4:23 PM, Booth, David (HP Software - Boston) wrote:

>
> Hi Pat,
>
>> From: Pat Hayes
>>
>> On Dec 3, 2008, at 10:32 PM, Booth, David (HP Software -
>> Boston) wrote:
>> [ . . . ]
>>> If you relax (or adjust) the definition of
>>> IR to be "things that can yield AWWW:Representations", and if
>>> you agree that a 200 response *is* an AWWW:Representation,
>>> then there is no way that a reasonable person could disagree
>>> with the "200 => IR" rule: it is a tautology.
>>
>>
>> No, it's not. What is indeed necessarily true is that the
>> thing that sent the 200 response is an IR (on interpretation
>> (b)). But that doesn't imply that the URI that was used to
>> access it must be interpreted as denoting it. And that is the
>> burden of http-range-14, seems to me: if you get a 200
>> response back from thingie X, then the URI that you used to
>> access X must be interpreted as also denoting X. After all,
>> even if you get a 300 response, the thing that sent it was
>> the same kind of thing that might emit a 200 response, and
>> your URI accessed it in the same way; but in that case, you
>> don't say that the URI denotes it, is the key point.
>
> Good catch.  I vaguely wondered whether I needed to cover that
> point.  :)
>
> AWWW section 2.2 says:
> http://www.w3.org/TR/webarch/#p48
> [[
> [URI] is an agreement about how the Internet community allocates
> names and associates them with the resources they identify.
> . . . .  For example, the "http" URI scheme ([RFC2616]) uses
> DNS and TCP-based HTTP servers for the purpose of identifier
> allocation and resolution. As a result, identifiers such as
> "http://example.com/somepath#someFrag" often take on meaning
> through the community experience of performing an HTTP GET
> request on the identifier and, if given a successful response,
> interpreting the response as a representation of the identified
> resource.
> ]]
>
> Section 2.2 goes on to say: "Of course, a retrieval action
> like GET is not the only way to obtain information about a
> resource. One might also publish a document that purports to
> define the meaning of a particular URI."  But nearly all of
> the 24+ billion pages that are currently accessible on the web
> ( http://www.worldwidewebsize.com/ ) create the URI-to-page
> association by emitting representations (as opposed to
> issuing explicit proclamations saying "henceforth this URI
> shall denote that information resource").  One might claim
> that the association that is created is merely an association
> for *accessing* the page rather than *denoting* the page

And in fact, that is ALL that the pre-semantic Web ever uses it for.  
The Web works on access, not naming. The whole notion of naming and  
denotation isn't needed until we have SWeb descriptions. Which is  
probably why the arrival of the SWeb produced such confusion on this  
particular topic.

> , but
> I don't think that line of argument is very plausible, because
> any name that can be used to access something can also be used
> to denote it, and because we have no other widely used mechanism
> for creating a *denotation* association between URIs and pages.

Very true. In fact, we have (that is, Web architecture has) NO  
mechanism for creating a denotation association between a URI and  
ANYTHING AT ALL. There is absolutely no way to give a thing a name on  
the Web, no computational ceremony of baptism, no way to say 'I hereby  
declare that <name> means <some way of referring to a thing>.'  (The  
'named graph' proposal was exactly that, a baptism protocol for RDF  
graphs, which was at least a start.) The entire Web is name-free, in a  
very real sense. It works not by naming at all, but by access. But  
ever since Larry Masinter tried to justify the idea of URNs, this  
point has been either denied or ignored by the Web savants who write  
the TAG blurbs. URNs were declared to be a universal naming protocol,  
which might be a reasonable idea (actually I think it's insane, but  
that's another argument) but only if some way is provided of actually  
making them be names, i.e. a naming protocol. But nothing like that  
ever was proposed. If I want to make up a URN to denote something, say  
my pet cat, what do I actually DO to make the name be a name of my  
cat? Nobody knows. What does it even mean to say that a URN is the  
name of my cat? Nobody knows. If it did mean anything, how can my cat  
have any connection with Web architecture? Nobody knows. Nobody,  
AFAIK, has even faced up to the need to answer questions like this.  
Until they do, ALL assumptions about what URIs denote are just that,  
assumptions. So it's up to us (that is, whoever decides these things)  
to decide what conventions we want to adopt about what denotes what.  
Nothing is forced upon us by anything in Web architecture.

Now, all that said, I agree it's very natural and simple to assume that  
for ordinary Web pages (or whatever the damn things are called), what  
you retrieve with a GET is what the URI denotes, and this is a large  
part of what has driven the http-range-14 decision. But I would  
observe that this assumption wasn't by any means universally obvious  
to everyone from day one, or else we wouldn't have things like the  
Dublin Core usages to contend with. So maybe it's worth keeping an open  
mind about just how absolute this ruling should be. We can't change  
the entire Web, but we could still change the ways that URIs are used  
in RDF to refer, if we can come up with a simple enough way to  
distinguish direct from indirect reference (that is, distinguishing  
between the meanings "I refer to what you use me to GET" and "I refer  
to something described in what you use me to GET").

> So if anything I think the burden of httpRange-14 is the other
> way around: A 200 response is already widely understood as
> indicating that the URI leading to the 200 response denotes
> the thing that yielded the response, but why should a 303 *not*
> lead to that same conclusion?  This is the leap that semantic
> web practitioners are being asked to swallow.

As I understand the decision, it was more a case of: how in God's name  
can we come up with a response that can possibly be understood as not  
having the default 200 "URI refers to the thing GOT" implication? And  
the 303 was just the least bad of a bad set of options. A special new  
response code would have been better, but that wasn't an option for  
basically political reasons. But the point is not that 303  
automatically or naturally carries the required 'cancel the default'  
message, but that it at least could have this content by stipulation  
(after all, this is The TAG Speaking) without anything else breaking.  
Which is not exactly good philosophy, but it is reasonable pragmatic  
engineering.
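
As a minimal sketch, the stipulated rule can be written as a lookup
from HTTP status code to the conclusion a client is licensed to draw
about what the URI denotes. The names below (Reading,
httprange14_reading) are hypothetical, chosen only for illustration.
The three cases follow the wording of the TAG's httpRange-14
resolution (2xx, 303, 4xx); treating every other status as "not
covered" is an assumption of this sketch.

```python
from enum import Enum


class Reading(Enum):
    """Conclusions about what a URI denotes (hypothetical names)."""
    INFORMATION_RESOURCE = "URI denotes an information resource"
    ANY_RESOURCE = "URI could denote any resource"
    UNKNOWN = "nature of the resource is unknown"
    NOT_COVERED = "status not addressed by the rule (assumption)"


def httprange14_reading(status: int) -> Reading:
    """Map the status of a GET response to the licensed conclusion."""
    if 200 <= status < 300:
        # The default: the URI refers to the thing that was GOT.
        return Reading.INFORMATION_RESOURCE
    if status == 303:
        # By stipulation, 303 cancels the default inference.
        return Reading.ANY_RESOURCE
    if 400 <= status < 500:
        return Reading.UNKNOWN
    return Reading.NOT_COVERED


if __name__ == "__main__":
    for code in (200, 303, 404, 301):
        print(code, "->", httprange14_reading(code).value)
```

Note that the 303 case does not assert that the URI denotes a
non-information resource. It only withdraws the default inference,
which is all the stipulation needs to do.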

Pat


------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 5 December 2008 00:18:45 UTC