Re: statements about resources vs. representations from Jonathan Rees on 2008-11-21 (public-awwsw@w3.org from November 2008)

From: Jonathan Rees <jar@creativecommons.org>
Date: Fri, 21 Nov 2008 15:17:58 -0500
To: Harry Halpin <hhalpin@ibiblio.org>
Cc: Alan Ruttenberg <alanruttenberg@gmail.com>, "public-awwsw@w3.org" <public-awwsw@w3.org>
Message-Id: <1696A176-0399-41AE-828E-4FF6C03490E2@creativecommons.org>
On Nov 21, 2008, at 1:39 PM, Harry Halpin wrote:

> Alan Ruttenberg wrote:
>> AFAIK there is no way to make a statement about a representation,
>> only about a resource. Therefore we cannot evaluate the truth of
>> something like a statement involving containsWord solely by looking
>> at representations.
>>
> Furthermore, I was under the impression that this inability to speak
> about representations is a "feature", not a bug, since one assumes
> that representations in and of themselves are too ephemeral for
> someone to *want* to make statements about them.

But AWWW and RFC 2616 talk about representations, and we do too.
We have to in order to talk about the semantics of HTTP.
Just because some people don't want to talk about them doesn't mean
such talk should be prohibited - that would be unscientific and  
undemocratic.

And of course they're exceedingly important. If all representations
were to disappear from the earth, the Web would cease to exist. So
it seems not only permissible, but important to talk about them.

The intermediate states in the AMD K5 are even more ephemeral than
representations, but they are named and reasoned about using formal
language, and participate in formal proofs. I thought this was the
kind of thing RDF and OWL were supposed to be for? Maybe they don't
have as rich a proof theory as ACL2, but the whole reason I've been
attracted to them is that they are *general purpose* formal languages.
So nothing should be off limits.

On the other hand, I introduced the notion of a fixed resource
defined by a representation to appease those who hate making
a representation the subject of a triple. Since representations
and fixed resources are "isoontic" (David McAllester's term),
and are even confused in Tim's memo, I don't see what the big
deal is, but there's no need to aggravate anyone by using
representations when fixed resources will do.

>  More below:
>> -Alan
>>
>> On Fri, Nov 21, 2008 at 8:58 AM, Jonathan Rees
>> <jar@creativecommons.org> wrote:
>>
>>> (Using "representation" in the AWWW sense here.)
>>>
>>> Suppose I have a resource R, and for some reason I believe that
>>> R dc:creator author:Charles_Dickens.
>>>
>>> Now suppose that I do a GET to obtain a representation, and let F be
>>> the fixed resource (see [1]) whose representation is this
>>> representation. (I'll need a term for the coercion of representation
>>> to fixed resource, so I'll say "the FR of the representation.")
>>>
>>> Assuming good faith and proper functioning on everyone's part,
>>> can I conclude that F dc:creator author:Charles_Dickens . ?  I
>>> suspect so, but is this idea codified anywhere? Wouldn't this be
>>> part of AWWSW?
>>>
> You have no choice, as you can't talk about the representation.

You seem quite sure of this. What is your evidence for this statement?

>>> It seems to me that some properties will be shared between a
>>> resource and its representations' FRs, while others aren't.
>>>
> Ah, this is a problem, one I think the HTTP in RDF draft is working on.

Hmm, I missed that, will look.

>>> E.g. a property containsWord could easily be true of one
>>> representation but not another (e.g. if the representations differ
>>> by language). Or, more obviously, one can meaningfully talk about
>>> the media type and content-length of a FR, but not necessarily of
>>> its originating resource. Volatility is similar: the FR is by
>>> definition not time-varying, but the resource may be.
>>>
>>> I guess this is what Tim's "generic resources" memo [1] is saying.
>>>
>>> Are there any properties of a resource that can be inferred
>>> from its representations? That is, when I do a GET, do I
>>> (or rather a stupid automated agent) learn anything
>>> at all about what the resource is? I certainly don't learn anything
>>> about, say, volatility, unless we're lucky enough to have
>>> a credible assertion about it in the representation.
>>> But I would guess that at least for things like authorship
>>> (aspects of the content), if P and Q are disjoint classes,
>>> and P applies to a resource's representation's FR, then you can
>>> conclude that Q does not apply to the resource? That is, if you
>>> find that any representation's FR's creator list consists of
>>> {George Eliot}, then you know that the resource's creator list
>>> cannot be {Charles Dickens}.
>>>
> It would seem like one has no choice but to infer the resource from
> the representations!

According to my reading of AWWW, it's up to a URI's designated naming  
authority to determine what the URI denotes. (Well, "denotation" isn't  
objective, so really what this is saying is that the community is  
requested to respect, as best it can, what the naming authority has to  
say on the question of what the URI names.) So you should be able to  
find out what the URI names by establishing communication with the  
naming authority.  We know that HTTP metadata is authoritative [1], so  
HTTP might be one way to get information from the naming authority,  
but there could be other ways, such as calling them on the phone.

If what you mean is that a[n information] resource is defined by its  
representations - that you can't have two different resources that  
have the same representations - that would be a very interesting and  
powerful statement, similar to what David Booth has been advocating.  
But how would one come to know what its representations are? Doing  
GETs will only get you *some* of the representations, and some aspects  
of the resource, such as its fixedness (if it's fixed), depend on  
knowledge of *all* of its representations, many of which may not exist  
until you and I are dust. In general, the naming authority may have  
opinions about the meaning (referent) of the URI that are completely  
consistent with all of the resource's observed representations, but  
are not a consequence of them.
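
To make the point about fixedness concrete, here is a minimal sketch in
Python. The function name fixedness_verdict and the verdict strings are
hypothetical, and the sketch assumes a single variant per GET (it
ignores content negotiation). It only shows that observations can
refute fixedness but never establish it.

```python
def fixedness_verdict(observed_bodies):
    """Classify a resource from the representations seen so far.

    Hypothetical helper: differing bodies refute fixedness, but
    identical bodies prove nothing, because representations that have
    not been retrieved yet are not part of the evidence.
    """
    distinct = set(observed_bodies)
    if len(distinct) > 1:
        return "not fixed"
    return "unknown"


# Two identical GET results leave the question open.
print(fixedness_verdict([b"<p>hi</p>", b"<p>hi</p>"]))
# Two differing GET results settle it in the negative.
print(fixedness_verdict([b"<p>hi</p>", b"<p>bye</p>"]))
```

The function never returns "fixed": under these assumptions that answer
would have to come from somewhere other than the observations, such as
the naming authority.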

Also, how would one account for POST? Resources X and Y could have  
identical representations for all GET requests, but still differ in  
that a POST to X would have a different effect than a POST to Y.  
Certainly X and Y are not the same resource in this case, so GETs do  
not the resource make. (You might have plausible deniability by  
claiming that X and Y differ by some representation not obtainable via  
GET, but I don't think you'd be taken seriously.)
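
The X and Y case can be sketched as a toy model in Python. The class
ToyResource and its side_effects list are made up for illustration and
are not taken from AWWW or RFC 2616. The model assumes that the effect
of a POST lands somewhere GET never exposes.

```python
class ToyResource:
    """Hypothetical resource whose POST effect is invisible to GET."""

    def __init__(self, body, post_effect):
        self._body = body
        self._post_effect = post_effect
        self.side_effects = []  # stands in for the world outside GET

    def get(self):
        # The representation never changes in this toy model.
        return self._body

    def post(self, data):
        # Record what happened; nothing here alters what GET returns.
        self.side_effects.append((self._post_effect, data))


x = ToyResource("order form", "place order")
y = ToyResource("order form", "record comment")

same_before = x.get() == y.get()
x.post("42")
y.post("42")
same_after = x.get() == y.get()

print(same_before, same_after)                # GET cannot tell them apart
print(x.side_effects == y.side_effects)       # but the POSTs did differ
```

An observer limited to GET sees identical representations before and
after, yet the two objects clearly behave differently.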

One could conclude that there is no knowing anything about any  
information resource other than that some set of observed entities are  
representations of it, but this seems awfully cynical. I prefer the  
AWWW view, that the naming authority knows. This allows us to ask,  
what would the NA say, and what would we rather it not say, if it  
could speak RDF? And by extension, other speakers.

There has to be some intersection between common sense - things like  
Dublin Core in the wild, and Tim's assertion (if it is true) that the  
resource named by http://www.w3.org/DesignIssues/Generic.html is so  
well known that it needs no explanation - and HTTP semantics. If we  
can't explain why, or when, it is legitimate to use
http://purl.org/dc/terms/creator, in a manner that's consistent with
at least some of its uses in the wild, then ... I will send unfriendly
email.

Best
Jonathan

>>> This doesn't hold for volatility: volatile and nonvolatile are disjoint.

[1] http://www.w3.org/2001/tag/doc/mime-respect.html
Received on Friday, 21 November 2008 20:18:38 UTC