Re: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...] from Pat Hayes on 2011-06-12 (public-lod@w3.org from June 2011)

From: Pat Hayes <phayes@ihmc.us>
Date: Sun, 12 Jun 2011 10:19:51 -0700
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>
Message-Id: <9B7A654F-CB83-49E2-AD66-AFA69D6D08BA@ihmc.us>
On Jun 12, 2011, at 5:40 AM, Danny Ayers wrote:

> On 12 June 2011 01:51, Pat Hayes <phayes@ihmc.us> wrote:
>> 
>> On Jun 11, 2011, at 12:20 PM, Richard Cyganiak wrote:
>> 
>>> ...
>>>>> It's just that the schema.org designers don't seem to care much about the distinction between information resources and angels and pinheads. This is the prevalent attitude outside of this mailing list and we should come to terms with this.
>>>> 
>>>> I think we should foster a greater level of respect for representation
>>>> choices here. Your dismissal of the distinction between information
>>>> resources and what they are about insults the efforts of many
>>>> researchers and practitioners and their efforts in domains where such
>>>> a distinction is quite important. Let's try not to alienate part of
>>>> this community in order to interoperate with another.
>>> 
>>> Look, Alan. I've wasted eight years arguing about that shit and defending httpRange-14, and I'm sick and tired of it. Google, Yahoo, Bing, Facebook, Freebase and the New York Times are violating httpRange-14. I consider that battle lost. I recanted. I've come to embrace agnosticism and I am not planning to waste any more time discussing these issues.
>> 
>> 
>> Well, I am sympathetic to not defending HTTP-range-14 and nobody ever, ever again even mentioning "information resource", but I don't think we can just make this go away by ignoring it. What do we say when a URI is used both to retrieve, um sorry, identify, a Web page but is also used to refer to something which is quite definitely not a web page? What do we say when the range of a property is supposed to be, say, people, but it's considered OK to insert a string to stand in place of the person? In the first case we can just say that identifying and reference are distinct, and that one expects the web page to provide information about the referent, which is a nice comfortable doctrine but has some holes in it. (Chiefly, how then do we actually refer to a web page?) But the second is more serious, seems to me, as it violates the basic semantic model underlying all of RDF through OWL and beyond. Maybe we need to re-think this model, but if so then we really ought to be doing that re-thinking in the RDF WG right now, surely? Just declaring an impatient agnosticism and refusing to discuss these issues does not get things actually fixed here.
> 
> For pragmatic reasons I'm inclined towards Richard's pov

Well, I am too. That is, I would love for this whole issue/problem to just go away. But I don't think ignoring it will make it go away. 

> , but it would
> be nice for the model to make sense.
> 
> Pat, how does this sound:
> 
> From HTTP we get the notions of resources and representations. The
> resource is the conceptual entity, the representations are concrete
> expressions of the resource. So take a photo of my dog -
> 
> <http://example.org/sasha-photo> foaf:depicts <http://example.org/Sasha> .
> 
> If we deref http://example.org/sasha-photo then we would expect to get
> a bunch of bits that can be displayed as an image.
> 
> But that bunch of bits may be returned with HTTP header -
> 
> Content-Type: image/jpeg
> 
> or
> 
> Content-Type: image/gif
> 
> Which, for convenience, let's say correspond to files on the server
> called sasha-photo.jpg and sasha-photo.gif
> 
> Aside from containing a different bunch of bits because of the
> encoding, sasha-photo.jpg could be a lossy-compressed version of
> sasha-photo.gif, containing less pixel information yet sharing many
> characteristics.
> 
> All ok so far..?
> 
> If so, from this we can determine that a representation of a resource
> need not be "complete" in terms of the information it contains to
> fulfill the RDF statement and the HTTP contract.

OK, so far. I would just note that (coming from a different, non-HTTP, tradition) I would never have even dreamt of any representation being "complete" in what I think is the sense you mean. So your care and emphasis here seem odd. But OK, I am following you...

> 
> Now turning to http://example.org/Sasha, what happens if we deref that?
> 
> Sasha isn't an information resource, so following HTTP-range-14 we
> would expect a redirect to (say) a text/html description of Sasha.

Really? I thought that HTTP-range-14 just said that if we get redirected, all bets are off, and the URI might denote anything at all, so the thing that gets returned might have nothing to do with the referent. 

> 
> But what if we just got a 200 OK and some bits Content-Type: text/html ?

Then (again, according to doctrine) the URI denotes the information resource which this is the HTTP-representation of. Which evidently is not Sasha.
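
The doctrine referred to here can be sketched as a small lookup on the HTTP status code. This is only an illustration of the rule as worded in the TAG's httpRange-14 resolution (2xx, 303, 4xx); the function name and the returned labels are hypothetical, and nothing like this appears in the original thread.

```python
def classify_by_status(status: int) -> str:
    """Toy reading of httpRange-14: what a GET response on a URI
    lets us conclude about the thing that URI identifies."""
    if 200 <= status < 300:
        # 2xx: the URI identifies an information resource (so not a dog).
        return "information resource"
    if status == 303:
        # 303 See Other: the URI could identify any resource at all.
        return "any resource"
    if 400 <= status < 500:
        # 4xx: the nature of the resource is unknown.
        return "unknown"
    # Other status codes are not addressed by the resolution.
    return "not covered"


print(classify_by_status(200))
print(classify_by_status(303))
```

Under this reading, a plain 200 OK on http://example.org/Sasha commits the URI to denoting an information resource, which is exactly why it cannot also denote the dog.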

> 
> We are told by this that we have a representation of my dog, but from
> the above, is there any reason to assume it's a complete
> representation?

No, but what has that got to do with anything? The key issue is that we are told that it is an information resource and hence we know it is not a dog. So we know, for example, that if someone asserts that some other dog is its father, or that it had its vet shots in February, or that it is an instance of http://sw.opencyc.org/concept/Mx4rvVjaoJwpEbGdrcN5Y29ycA , then (if we are smart) something is wrong here, or else (if we are less smart) that something on the Web has these properties. 

Now, we could try this line, which I think is what you are suggesting. We could say that all such 'information resources' are being used as stand-ins for referential names themselves, i.e. they are not things (like dogs, say) but should always be understood as referring to some other thing. There are some technical problems with this, but I'm sure we could work around them; but the serious problem with this idea is that it makes it impossible to simply refer to these information resources themselves. So we would be unable to talk about Web pages using the Web description language RDF. Frankly, this would not bother me personally very much, as I am not particularly interested in describing Web pages in RDF, but I know it would bother some other people (Tim B-L, for just one) rather a lot. 

> The information would presumably be a description, but is it such a
> leap to say that because this shares many characteristics with my dog

What??

> (there will be some isomorphism between a thing and a description of a
> thing, right?

Absolutely not. Descriptions are not in any way isomorphic to the things they describe. (OK, some 'diagrammatic' representations can be claimed to be, e.g. in cartography, but even those cases don't stand up to careful analysis, in fact.) 

> ) that this is a legitimate, however partial,
> representation?

It is a representation, sure. The question is, what is it a representation OF? A lossy image of a lossy image of X is itself a (very) lossy image of X. But the name of a name of X is not a name of X; and a (descriptive) representation of a representation of X is not a representation of X. For example, "written clumsily and with many spelling errors" describes "Ee were real gude at mafematiks at skool", which in turn describes me; but I am not, myself, composed of spelling errors. Reference is not transitive, in a nutshell.
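
The non-transitivity point can be made concrete with a toy model. This is a minimal sketch under the assumption that each name has exactly one referent; the table, the strings in it, and the helper names are all hypothetical illustrations.

```python
# Hypothetical reference table: each name maps to the one thing it names.
REFERS_TO = {
    "name-of-Sasha": "Sasha the dog",          # a name of the dog
    "name-of-name-of-Sasha": "name-of-Sasha",  # a name of that name
}


def referent(name: str) -> str:
    """Return the thing that `name` refers to."""
    return REFERS_TO[name]


def names(name: str, thing: str) -> bool:
    """True if `name` refers directly to `thing`."""
    return referent(name) == thing


# Each link in the chain holds on its own...
print(names("name-of-name-of-Sasha", "name-of-Sasha"))
print(names("name-of-Sasha", "Sasha the dog"))
# ...but the name of the name does not thereby name the dog.
print(names("name-of-name-of-Sasha", "Sasha the dog"))
```

Following the chain twice does reach the dog, but that is two separate acts of reference, not one; the outer name still only names a name.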

> 
> In other words, what we are seeing of my dog with -
> 
> Content-Type: text/html.
> 
> is just a very lossy version of her representation as -
> 
> Content-Type: physical-matter/dog

Nope, absolutely not. Reference is not like lossy imaging. 

> Does that make (enough) sense?

Nice try, but no cigar. Want to try again? Seriously, it is not easy to find a coherent way to allow what one might call reference slippage - using a name or description to stand in for the actual thing named - without the whole semantic framework just basically collapsing**. I know we humans do it all the time without hardly noticing, and I REALLY wish that I or someone could figure out how to capture this facility in a formal scheme of some kind. But I can't see how to do it.

Pat

** To illustrate. Someone goes to a website about dogs, likes one of the dogs, and buys it on-line. He goes to collect the dog, and the shopkeeper gives him a photograph of the dog. Um, where is the dog? Right there, says the seller, pointing to the photograph. That isn't good enough. The seller mutters a bit, goes into the back room, comes back with a much larger, crisper, glossier picture, says, is that enough of the dog for you? But the customer still isn't satisfied. The seller finds a flash card with an hour-long HD movie of the dog, and even offers, if the customer is willing to wait a week or two, to have a short novel written by a well-known author entirely about the dog. But the customer still isn't happy. The seller is at his wits' end, because he just doesn't know how to satisfy this customer. What else can I do? he asks. I don't have any better representations of the dog than these. So the customer says, look, I want the *actual dog*, not a representation of a dog. It's not a matter of getting me more information about the dog; I want the actual, smelly animal. And the seller says, what do you mean, an "actual dog"? We just deal in **representations** of dogs. There's no such thing as an actual dog. Surely you knew that when you looked at our website? 

> Cheers,
> Danny.
> 
> 
> 
> 
> -- 
> http://danny.ayers.name
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Sunday, 12 June 2011 17:20:40 UTC