Re: Use cases from Jonathan from Jonathan Rees on 2008-05-01 (public-awwsw@w3.org from May 2008)

From: Jonathan Rees <jar@creativecommons.org>
Date: Thu, 1 May 2008 10:36:54 -0400
To: "Booth, David (HP Software - Boston)" <dbooth@hp.com>
Cc: "public-awwsw@w3.org" <public-awwsw@w3.org>
Message-Id: <43A362C5-7696-4C4E-9ECA-C4FD4373FB98@creativecommons.org>
On Apr 29, 2008, at 1:11 AM, Booth, David (HP Software - Boston) wrote:
>> From: Jonathan Rees [mailto:jar@creativecommons.org]
>> [ . . . ]
>> Here are some of the cases I'm thinking of:
>
> I'll first just answer these in prose, since I can do that more  
> quickly than in N3.  I'll be using ftrr:IR definition of  
> "information resource" at
> http://lists.w3.org/Archives/Public/public-awwsw/2008Apr/0046.html
>
>>
>> 1. I have two URIs X and Y. By varying Accept-Language I learn that I
>> can retrieve French and Spanish variants of something via URI X, and
>> Spanish and German variants of something via URI Y. The Spanish
>> variants retrieved via X and Y are the same. All responses are 200s.
>>
>>         - Is it possible that X and Y denote the same thing?
>
> Yes, X and Y can denote the exact same ftrr:IR.  Bear in mind that  
> one of the parameters to an ftrr:IR function is the request, and  
> the request includes the URI of the resource whose representation  
> is being requested, hence there is no way to use dereferencing alone  
> to reliably establish that two URIs denote different ftrr:IRs.    
> This reflects the actual capabilities of Web servers.

OK. I would say this answer disagrees with the orthodox view that IRs  
are abstract documents or abstract information (see below). By  
definition of "abstract" an abstract document doesn't know how it is  
represented; choices of representation are a deployment detail. This  
is not just my interpretation; it agrees, as far as I can discern,  
with the answer Tim gave to a similar question of mine recently (in  
Vancouver?).

>>         - Is it possible that X and Y do *not* denote the same thing?
>>             (assuming that responses are known to be time invariant.)
>
> Yes.  Since the set of <Time, Request, Representation> tuples that  
> make up a ftrr:IR can be infinite, dereferencing two URIs cannot  
> definitively establish that they denote the same ftrr:IR.  They  
> could denote ftrr:IRs that differ in <Time, Request,  
> Representation> tuples that your dereferencing did not test.
>
>>         - Is it necessary that X and Y do not denote the same thing?
>
> No, as explained above.

Let me refine the scenario so that it matches your definition more
closely. I meant to imply that we might have applied the same
request at the same time to the two different URIs, and gotten
different responses (say, a request with Accept-Language set to French
yielding a French response via X, but a Spanish response via Y because
requests via Y never yield French responses). In this
case it would be *necessary* that X and Y do not denote the same
thing, by your definition.
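
To make the asymmetry concrete, here is a minimal sketch in Python. It assumes an ftrr:IR can be modeled as a function from (time, request) to representation, and that the request here carries only headers, not the target URI. The names make_ir and provably_different, and the sample payloads, are invented for illustration and are not part of any definition in this thread.

```python
from typing import Callable, Dict, Iterable, Tuple

# Assumed modeling choices (not from the thread): time is an integer,
# a request is a tuple of header pairs, a representation is a string.
Time = int
Request = Tuple[Tuple[str, str], ...]
Representation = str
IR = Callable[[Time, Request], Representation]


def make_ir(available: Dict[str, Representation], default: str) -> IR:
    """Build a time-invariant IR that negotiates on Accept-Language only."""
    def ir(time: Time, request: Request) -> Representation:
        wanted = dict(request).get("Accept-Language")
        # Fall back to the default language when the wanted one is missing.
        return available.get(wanted, available[default])
    return ir


def provably_different(a: IR, b: IR,
                       probes: Iterable[Tuple[Time, Request]]) -> bool:
    """True if some probe yields different representations.

    A False result proves nothing: the two IRs may still differ on
    (time, request) pairs that were never probed.
    """
    return any(a(t, r) != b(t, r) for t, r in probes)


x = make_ir({"fr": "Bonjour", "es": "Hola"}, default="fr")
y = make_ir({"es": "Hola", "de": "Hallo"}, default="es")

fr = (("Accept-Language", "fr"),)
es = (("Accept-Language", "es"),)

print(provably_different(x, y, [(0, es)]))           # Spanish variants agree
print(provably_different(x, y, [(0, es), (0, fr)]))  # French probe separates them
```

Under these assumptions a single disagreeing probe settles that the two functions differ, while no finite set of agreeing probes settles that they are the same.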

In other words: two ftrr:IRs can differ even if they carry identical  
abstract information.  This IR = variable information idea is what I  
thought was Tim's model, and AWWW's, and Pat's, and what I was trying  
to capture in my diagram.

I'm not advocating for or against the information-carrying view, and  
I certainly respect the definiteness of ftrr:IR. But it needs to be  
clear that there are two different notions here, and that the  
difference is consequential.

>> 2. Suppose that the values I retrieve (in different languages, say)
>> via a URI X say contradictory things - for example, one says that
>> Rome is the capital of Italy, and another says that Paris is the
>> capital of Italy.
>>
>>         - Does X denote an information resource, given that the  
>> values
>> cannot both be representations of the same information?
>
> Yes, X still denotes an ftrr:IR, even though it apparently violated  
> the AWWW principle that each language-specific representation  
> carries the same abstract information.  Bear in mind that on the  
> Web, anyone can say anything about anything, including making  
> statements that are false.  In this case, the abstract information  
> that was carried was the assertion that both Paris and Rome are the  
> capital of Italy.  The assertion happens to be false, but its  
> falsity does not change the fact that X denotes an ftrr:IR.

So you are saying that the idea that IRs are related to information  
is not part of the definition of IR (as it is in AWWW), but rather  
just some sort of good practices recommendation.  That's fine with  
me, but the problem of defining when these good practices are being  
followed would remain, and any model of good practice would induce a  
subclass of ftrr:IR consisting of those ftrr:IRs compatible with  
these good practices. We'd still have the question of whether we want
to build a model of good practice, or give up.

It sounds like, in your model, given a collection of simultaneous
responses, there is no way to detect violation of the
must-carry-the-same-information principle, since by plausible
deniability the server can always claim that each response is a
subset of the (inconsistent) union of all the information in all the
responses.
This makes "carry the same information" almost tautologous and  
therefore a bit out of the spirit of conneg. (I say "almost" because  
there may be other ways to find inconsistencies, e.g. by reading and  
believing metadata about the IR.) I don't think this could have been  
what was intended.
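
A small sketch of why that check can never fail, assuming (purely for illustration) that the information in each response can be modeled as a set of subject/predicate/object triples. The function name and the sample triples are made up for this example.

```python
# Each response is modeled as a set of triples (an assumption of this sketch).
responses = {
    "en": {("Italy", "capital", "Rome")},
    "fr": {("Italy", "capital", "Paris")},
}

# The server's claimed "abstract information" is the union of everything
# it ever returned, inconsistent or not.
claimed = set().union(*responses.values())


def carries_same_information(responses, claimed):
    """The weak reading: every response is a subset of the claimed information."""
    return all(r <= claimed for r in responses.values())


# Holds by construction for any collection of responses, including
# mutually contradictory ones.
print(carries_same_information(responses, claimed))
```

Since the claimed information is built from the responses themselves, the subset test holds for every possible collection, which is the sense in which the principle becomes almost tautologous.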

(I can't find this principle in AWWW, by the way. Can you point me to
the correct passage? Maybe you're thinking of RFC2616?)

>>         - If so, does it denote a "bad" information resource?
>
> I'm not sure what you mean by bad.  The ftrr:IR is perfectly fine  
> as an ftrr:IR, but it happens to convey false information.

OK.  "bad" = does not follow good practice recommendations, as above.

>>         - If not, what does it denote, if anything?
>>         - Assume unchanging whatever if necessary in order to
>> make these questions nontrivial.
>>
>> 3. Suppose I set up a web server responding to requests for some URI
>> X as follows:
>>         - A URI for an IR on the web is chosen at random and
>> a value is fetched using that URI
>
> Let's call this second URI Z, and assume Z != X.
>
>>         - The value is returned as the payload of a 200 response
>
> Okay, so when X is dereferenced, a representation from Z is  
> returned in a 200 response, right?

That one time, yes. On subsequent requests it would be Z2, Z3, ...
>
>> Questions:
>>         - Does X denote an information resource?
>
> Yes.  The dereference of X resulted in a 200 response, therefore X  
> denotes an ftrr:IR.
>
>>         - If so, what information do its referent's
>> representations represent?
>
> A random representation chosen from the Web.

No, that's what the representation *is*, not what it *represents*.   
But representation (in any sense other than the trivial got-a- 
response sense) has disappeared from your account, as has  
information, so the question is moot. Nothing wrong with that, just  
different from the other model(s).
>
> It sounds like you may be trying to view multiple representations  
> as (perhaps lossy) encodings of some abstract information.  That  
> view only applies to content negotiation, which is only *one*  
> possible use of the Request parameter of an ftrr:IR function:
>
>   f: Time x Request --> Representation

"Some abstract information" - yes, that's just what I thought the IR  
idea was supposed to capture. According to what I thought was the  
orthodox view, the notion that a response represents something else  
applies always, not just when one chooses between representations. To  
have a 200 response that does not represent something is considered  
not "good practice" and a threat to accessibility (multiple  
languages, formats, sensory modes). This is why you're never supposed  
to give an IR a URI ending with .html.  (Whether the information-is- 
abstract constraint is part of any protocol, or could be verified in  
an audit, is a different question.) (Again, I'm neutral on this point  
of view, just reporting that I have heard it.)

Dividing IRs into those "subject to content negotiation", as RFC2616  
does, and those that aren't would be an interesting way to go.  
Certain good practices might apply to one set that don't to the  
other. I hadn't thought of that. Maybe this is like language and
"representation" invariance from
http://www.w3.org/DesignIssues/Generic.html .

>>         - If not, what could X's referent be, if it has one?
>> Is it a "bad" information resource, or something else?
>
> There is nothing wrong with it as an ftrr:IR, but whether you find  
> it useful is up to you.

See above... I hear Tim as saying "don't do it". Useful-to-you is  
akin to saying that the web is self-correcting and doesn't need any  
good practice recommendations. I think this is similar to what  
Xiaoshu has been saying. I am undecided.
>
>>         - Is the web site behaving within the limits
>> specified by RFC2616 and/or AWWW?
>
> Yes, provided we assume that AWWW adopts the ftrr:IR definition of  
> "information resource".

By AWWW I mean the 15 December 2004 recommendation. That particular  
version will never adopt anything, I hope.

My reading is that my randomly-responding URI can't denote an IR  
because it has no "essential characteristics" and/or bears no  
resemblance to anything that might be called "variable information"  
or "abstract information". If it does, then it makes all of these  
notions tautologous, and we might as well stop talking about them. I  
think that may be your suggestion.
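
For what it's worth, the random responder is easy to write down as a function of the shape f: Time x Request --> Representation. The following is only a sketch: the pool of payloads stands in for "values fetched from random URIs", and seeding the choice on the time is an assumption made here so that f is a genuine (repeatable) function.

```python
import random


def make_random_ir(pool, seed=0):
    """Return f(time, request) that picks a payload from pool, ignoring request."""
    def ir(time, request):
        # Seeding on the time makes the choice repeatable for a given time.
        rng = random.Random(f"{seed}:{time}")
        return rng.choice(pool)
    return ir


# Hypothetical payloads standing in for representations of Z, Z2, Z3, ...
pool = ["payload from Z", "payload from Z2", "payload from Z3"]
f = make_random_ir(pool)

for t in range(3):
    print(t, f(t, ()))
```

Nothing in the function's type distinguishes this from a well-behaved resource; any notion of "essential characteristics" would have to be an additional constraint on which functions count.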
>
>>
>> All these questions can be expressed in RDF. I can come up with more
>> cases like this, if you like, but I think you get the idea so let's
>> start with these.
>
> Let me know if you still want N3 answers to these.

No thanks, I think we're communicating well enough.

Jonathan
>
>
>
> David Booth, Ph.D.
> HP Software
> +1 617 629 8881 office  |  dbooth@hp.com
> http://www.hp.com/go/software
>
> Opinions expressed herein are those of the author and do not  
> represent the official views of HP unless explicitly stated otherwise.
Received on Thursday, 1 May 2008 14:37:41 UTC