Re: AWWSW homework for 2007-12-11

(Long note warning:  the conclusion is in the last paragraph -- skip there 
if you don't want to read the whole thing.  NM)

Perhaps like Stewart, I find wikis to be, on good days, moderately 
effective at capturing the consensus or current state of a group, but very 
clumsy as a medium for debate and discussion.  So, I'll comment here in 
email.  Hope that's OK.

It's occurred to me that one of the reasons we're struggling with tricky 
cases, such as returning a description of a resource that's not an 
information resource, is that we may not have been clear enough on the 
simple case.   We've gone to some trouble in AWWW to define an Information 
Resource as one which can be effectively captured in a computer message. 
What we haven't said, I don't think, is that representations of IRs in 
fact should be complete, when that's practical.  I'm pretty sure Tim wants 
things to work this way, and I don't think I object, but I can't find any 
pertinent specifications that make it so.

Let's look at some quotes from AWWW [1]: 

> "A representation is data that encodes information about 
> resource state. Representations do not necessarily describe the
> resource, or portray a likeness of the resource, or represent 
> the resource in other senses of the word "represent""

That certainly doesn't come very close to saying "A representation encodes 
the state of an information resource as completely as possible (exceptions 
being made for cases in which particular formats or devices require that 
fidelity be sacrificed)".   Indeed, I see nothing in the AWWW quote that 
wouldn't apply to a non-IR.  A picture of me is at best a very incomplete 
representation of me, but it certainly encodes some of my "state", and 
it's even easier to make the case that it encodes information "about" my 
state, which is what AWWW asks.

Also from AWWW:

> "Assuming that a representation has been successfully 
> retrieved, the expressive power of the representation's format 
> will affect how precisely the representation provider 
> communicates resource state. If the representation communicates
> the state of the resource inaccurately, this inaccuracy or 
> ambiguity may lead to confusion among users about what the resource is."

So, the format may limit fidelity, but one should never lie.  Being 
incomplete is OK.  It's only vaguely implicit that representations should 
be complete when possible.
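
To make the distinction between "incomplete" and "inaccurate" concrete, 
here is a minimal sketch in Python.  Nothing in it comes from AWWW:  the 
`state` dictionary and the functions `as_json`, `as_title_only` and 
`is_complete` are hypothetical names, and "complete" is assumed here to 
mean only that the resource state can be recovered from the 
representation.

```python
import json

# Hypothetical state of a small document-like information resource.
state = {"title": "Meeting notes", "body": "Agenda: representations of IRs."}

def as_json(state):
    """A representation whose format can carry the whole state."""
    return json.dumps(state, sort_keys=True)

def as_title_only(state):
    """A representation limited by its format: accurate, but incomplete."""
    return state["title"]

def is_complete(representation, state):
    """Assumed test: the full state can be recovered from the representation."""
    try:
        return json.loads(representation) == state
    except ValueError:
        # Not parseable as JSON, so the state cannot be recovered this way.
        return False

full = as_json(state)
partial = as_title_only(state)
```

Both `full` and `partial` are truthful about the resource; only `full` 
passes the assumed completeness test, and AWWW as quoted seems to permit 
either.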

Under URI Persistence in AWWW [2] we find:

> "As is the case with many human interactions, confidence in 
> interactions via the Web depends on stability and 
> predictability. For an information resource, persistence 
> depends on the consistency of representations. The 
> representation provider decides when representations are 
> sufficiently consistent (although that determination generally 
> takes user expectations into account).
> 
> "Although persistence in this case is observable as a result of
> representation retrieval, the term URI persistence is used to 
> describe the desirable property that, once associated with a 
> resource, a URI should continue indefinitely to refer to that resource."

Again, nothing about completeness or fidelity, just consistency.

OK, let's take a look at RFC 2616 [3].  The definitions of Resource and 
Representation are:

> "Resource: A network data object or service that can be 
> identified by a URI,  as defined in section 3.2. Resources may 
> be available in multiple representations (e.g. multiple 
> languages, data formats, size, and resolutions) or vary in other ways."
> 
> "Representation: An entity included with a response that is 
> subject to content negotiation, as described in section 12. 
> There may exist multiple representations associated with a 
> particular response status."

That seems to leave a lot of latitude on the completeness or fidelity of 
representations of Information Resources.  Now looking at the pertinent 
part of the definition of GET:

> "The GET method means retrieve whatever information (in the 
> form of an entity) is identified by the Request-URI."

Hmm.  That comes pretty close to saying that the resource itself is 
returned, since everyone agrees the URI identifies a resource, and it says 
to "retrieve what's identified by the URI".  A charitable reading of this 
does suggest that completeness might be a good thing, but it's far from 
clear.

> "If the Request-URI refers to a data-producing process, it is 
> the produced data which shall be  returned as the entity in the
> response and not the source text of the  process, unless that 
> text happens to be the output of the process."

Now that seems to indicate that we can return representations not just of 
document-like IRs, but of processes.  In the description of status code 
200 for GET we find:

> "an entity corresponding to the requested resource is sent in 
> the response;"

So, we have "corresponding to", which is yet another description of the 
relationship between a resource and its representation.  FWIW:  I don't 
think any of the above has changed significantly in the draft HTTPbis [4]. 
Putting this all together, we seem to have gone to some trouble to:

* In AWWW, carefully define Information Resources as those that can be 
transmitted with good fidelity in a computer message, without in fact 
requiring that representations of IRs be complete.

* In the resolution of httpRange-14, suggest that status code 200 is only 
appropriate for an information resource (which, FWIW, seems to go somewhat 
beyond but not otherwise contradict what's in RFC 2616), but again say 
nothing about the completeness or fidelity of the representation.
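
The second point can be sketched as a server-side decision.  This is only 
an illustration under stated assumptions:  the `Resource` class and 
`status_for_get` are hypothetical names, and the use of 303 for anything 
that is not an information resource is taken from the httpRange-14 
resolution rather than from anything quoted in this note.  Notice that 
nothing in the sketch checks the completeness or fidelity of what is sent 
with the 200.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Resource:
    """Hypothetical model of something identified by a URI."""
    uri: str
    is_information_resource: bool
    description_uri: Optional[str] = None  # where a description lives, if any

def status_for_get(resource: Resource) -> int:
    """Pick a status code for GET under the assumed httpRange-14 reading."""
    if resource.is_information_resource:
        return 200  # an entity corresponding to the resource is sent
    if resource.description_uri is not None:
        return 303  # See Other: point at a description instead
    return 404      # nothing to send and nowhere to redirect

# Example URIs are made up for illustration.
doc = Resource("http://example.org/report", True)
person = Resource("http://example.org/noah", False,
                  "http://example.org/noah/about")
```

Here `status_for_get(doc)` yields 200 and `status_for_get(person)` yields 
303, yet the rule is silent on whether the entity sent with the 200 
conveys the document's state completely.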

Tim seems to make the obvious connection: if we've gone to all the trouble 
of saying that 200 is only appropriate for resources that can be 
represented with good fidelity in a message, then surely good practice is 
indeed to send representations that convey the resource's state completely 
and with good fidelity.   I can't see anything with any normative force 
that says so.  Should we decide whether this is what we mean and, if so, 
find a suitable place to say it (perhaps in a finding for now, and in a 
revision to AWWW eventually)?  Ideally HTTPbis should clarify this, but I 
bet that would be hard socially, if not necessarily technically.  This 
seems crucial to justifying the claim that we can't, with a 200, return a 
representation such as a picture for a non-information resource.

Noah

[1] http://www.w3.org/TR/webarch/
[2] http://www.w3.org/TR/webarch/#URI-persistence
[3] http://www.ietf.org/rfc/rfc2616.txt
[4] http://ietfreport.isoc.org/idref/draft-lafon-rfc2616bis/

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Thursday, 6 December 2007 22:05:37 UTC