Re: httpRange-14 Change Proposal from Nathan on 2012-03-29 (www-tag@w3.org from March 2012)

From: Nathan <nathan@webr3.org>
Date: Thu, 29 Mar 2012 12:48:32 +0100
To: Jeni Tennison <jeni@jenitennison.com>
CC: "www-tag@w3.org List" <www-tag@w3.org>, Tim Berners-Lee <timbl@w3.org>
Message-ID: <4F744C10.4000602@webr3.org>

Jeni Tennison wrote:
> Nathan,
> 
> The definition of "resource", such as it is, is in RFC3986 [1], which says:
> 
>   This specification does not limit the scope of what might be a
>   resource; rather, the term "resource" is used in a general sense
>   for whatever might be identified by a URI.
> 
> HTTPbis defines "representation" in [2]:
> 
>   A "representation" is information in a format that can be readily
>   communicated from one party to another.  A resource representation is
>   information that reflects the state of that resource, as observed at
>   some point in the past (e.g., in a response to GET) or to be desired
>   at some point in the future (e.g., in a PUT request).
> 
> This is a very loose definition. The first sentence doesn't tell us anything about the relationship of the resource to the representation; the second, that the information "reflects the state of that resource", makes no implication about whether the information is the content of the resource or a description of the resource.

You can think of the term "Resource" as being namespaced within each 
specification: RFC3986-Resource is not equivalent to HTTPbis-Resource. 
Indeed, it's this ambiguity in terminology which creates most of the mess 
around httpRange-14.

HTTPbis is written within the context of a transfer protocol; an 
HTTPbis-Resource is any resource which has a property that is an 
instance of the HTTP Protocol. One must dissect HTTPbis in terms of a 
transfer protocol, and not at the broader scope of Web Architecture.

To quote 2.7 of p1-messaging:

    HTTP does not limit what a resource might be; it merely defines an
    interface that can be used to interact with a resource via HTTP

HTTP cannot limit what a resource may be, as the definition of an 
HTTPbis-Resource is a resource which can be interacted with via HTTP.

Since HTTP is an information hiding protocol, the only thing visible for 
an HTTPbis-Resource is: a set of representations/messages associated 
with a URI over time.

Also note that within HTTPbis a "representation" differs from a 
"resource representation". "Representation" (again, ambiguous terminology) 
is used because HTTP has content negotiation, such that at a specific 
point in time an HTTPbis-Resource may be associated with a set of 
equivalent representations in different formats. This is why it 
speaks of "representations" and "A resource representation" rather than 
"the representation" or "the content". This is made clearer when you 
consider representations which are included in response to a POST, in a 
POST request, in a 404 response and so forth, as they clearly are not 
"representations" of things like people; rather, they are 
"representations" of the content/information being transferred.

Similarly, terminology such as "reflects" has to be used since the 
current state of an HTTPbis-Resource may be a set of equivalent 
representations, or conversely, each representation reflects the same 
information/content as the other representations in the set.
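
As a rough illustration of the content negotiation point (a sketch only, 
in Python; the names REPRESENTATIONS and select_representation and the 
sample payloads are hypothetical, not taken from HTTPbis), the state of 
an HTTPbis-Resource at one point in time can be modelled as a set of 
equivalent representations keyed by media type, from which one is 
selected per request:

```python
# Hypothetical model: the current state of one HTTPbis-Resource, held as a
# set of equivalent representations keyed by media type.
REPRESENTATIONS = {
    "text/html": "<p>Monkey photo, taken 2012</p>",
    "text/plain": "Monkey photo, taken 2012",
}


def select_representation(accept, available=REPRESENTATIONS):
    """Return (media_type, body) for the first acceptable type, else None.

    Simplified on purpose: the Accept value is treated as an ordered,
    comma-separated list; q-values and wildcards are ignored.
    """
    for item in accept.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in available:
            return media_type, available[media_type]
    return None
```

Whichever entry is selected, it reflects the same information as the 
others in the set, which is why none of them is "the" representation.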

All HTTPbis-Resources are RFC3986-Resources, since HTTPbis-Resource is a 
subset of RFC3986-Resource, but it does not follow that all 
RFC3986-Resources are HTTPbis-Resources. There is another subset of 
RFC3986-Resources which comprises all those things named by URIs which 
are not HTTPbis-Resources.
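
The subset relationship can be sketched as follows (Python, illustrative 
only; the /people/jeni URI and the True/False flags are assumptions made 
up for the example, while /uri and /uri-documentation are the URIs from 
Question A further down):

```python
# Hypothetical example resources; the flag records whether a GET on the
# URI yields a 2xx response carrying a representation.
RFC3986_RESOURCES = {
    "http://example.org/uri": True,                # the JPEG of a monkey
    "http://example.org/uri-documentation": True,  # its description
    "http://example.org/people/jeni": False,       # a person, named by a URI
}

httpbis_resources = {uri for uri, ok in RFC3986_RESOURCES.items() if ok}
other_resources = set(RFC3986_RESOURCES) - httpbis_resources

# Every HTTPbis-Resource is an RFC3986-Resource, but not the reverse.
assert httpbis_resources <= set(RFC3986_RESOURCES)
assert other_resources == {"http://example.org/people/jeni"}
```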

Thus:

> For example, I can't see anything in this that precludes a representation of the resource Jeni from having a representation that includes the information that Jeni is 165cm tall.

Assuming that when you say "the resource Jeni" you are referring to 
yourself, then you cannot have an HTTPbis-Representation, since you are 
not an HTTPbis-Resource. You are however an RFC3986-Resource, and an 
rdfs:Resource.

Nothing precludes an HTTPbis-Representation of a description of Jeni from 
including information that Jeni is 165cm tall. But that isn't the 
issue here. The issue is that conflating terminology, and failing to 
consider terminology as namespaced and having special meaning within each 
specification, creates ambiguity on such a mass scale that we are having 
this discussion on www-tag - and it's gone on for over a decade.

> It is only the httpRange-14 decision [3] that states that a 2XX response to a URI implies that the resource identified by that URI is an information resource. Apparently it was meant to also imply that the representation was the content of that resource, though that wasn't explicitly stated.

In my view, the httpRange-14 decision captures what I've written above 
and then provides guidance on how to indicate that an httpURI refers to 
an RFC3986-Resource which is not also an HTTPbis-Resource. (use 303)

The term "Information Resource" is equivalent to "HTTPbis-Resource".

A 2xx HTTP Response to a GET on an http-URI doesn't "imply" that the 
resource identified by that URI is an information resource; rather, the 
2xx response means that the URI refers to an HTTPbis-Resource. It's 
what determines whether an RFC3986-Resource with an http URI is an 
HTTPbis-Resource or not. The httpRange-14 decision didn't decide this; 
it captured the fact and gave guidance accordingly.
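
To make that guidance concrete, here is a minimal sketch (Python) of the 
three cases in the TAG resolution at 
http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html. The 
function name and the returned strings are hypothetical labels of my own, 
not terms defined by any specification:

```python
def classify(status):
    """Map the status of a GET on an http URI to what httpRange-14 says."""
    if 200 <= status <= 299:
        # 2xx: an information resource (an HTTPbis-Resource, in the terms above)
        return "information resource"
    if status == 303:
        # 303 See Other: the URI could identify any resource at all
        return "any resource"
    if 400 <= status <= 499:
        # 4xx: the nature of the resource is unknown
        return "unknown"
    # The resolution says nothing about other status codes.
    return "not covered"
```

Note that the function only reads the status code; it does not decide 
anything, which is the sense in which the resolution records a fact 
rather than making one.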

With regard to the implication that "the representation was the content 
of that resource", it is in fact "a current representation", one of the 
set of equivalent resource representations, a representation which 
reflects "the content" (in its last known state).

> On 28 Mar 2012, at 22:23, Nathan wrote:
>> Additionally, rather than special casing this so that this rule lets a publisher override the default of a 200 OK returning a representation of a resource, the proposal also aims to change web arch and the HTTP specification such that a 200 OK in response to a GET no longer returns a representation of the requested URI, rather it just returns a representation which you must consult to find out what it is.
>>
>> That's quite a large change to the web / web arch / http.
> 
> No. A 200 OK response to a GET still returns a representation of the resource. The only change is to the httpRange-14 decision which layered on the implication that the representation returned in this case was the content of the resource.

As above, it wasn't a decision, it was guidance offered to:

(quote) "enable people to name arbitrary resources using the "http" 
namespace without any dependence on fragment vs non-fragment URIs, while 
at the same time providing a mechanism whereby information can be 
supplied via the 303 redirect without leading to ambiguous 
interpretation of such information as being a representation of the 
resource"- http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html

The whole point is that the response contains a "resource 
representation", not just "some representation"; there's no freedom to 
say that it is "a representation of this or that", it simply is a 
"resource representation". Or if it makes it simpler, the response 
simply is "the content". httpRange-14 didn't decide this, that's the way 
it is. If you want to change that, then it doesn't require a change to 
httpRange-14, it requires a change to HTTP.

I believe that last sentence captures everything I want to say 
completely: you're not proposing to change httpRange-14 here, you're 
proposing to change HTTP itself.

Respectfully,

Nathan

>> Sorry no, not *always* just *always could* or *always can*. As in, it would be universally true that for any successful GET request you would receive a representation, and that representation may be a representation of the <target-uri>, or it may be a representation of <some-other-uri> which describes the target-uri.
> 
> No. The representation must still be "information that reflects the state of the target resource", per HTTPbis. But unlike httpRange-14, it goes no further than HTTPbis: in particular, under the proposal there is nothing that says that the representation must be the content of the resource.
> 
>>>> Question A:
>>>>
>>>> Currently we have:
>>>> <http://example.org/uri> - a JPEG image of a monkey.
>>>>
>>>> When you issue a GET on that URI the server currently responds
>>>> 200 OK
>>>> Content-Type: image/jpeg
>>>> Link: <http://example.org/uri-documentation>; rel="describedby"
>>>>
>>>> So under this new proposal, the server can return the contents of /uri-documentation with a status of 200 OK for a GET on /uri?
>>> Under the proposal, the server would return the JPEG with a 200 OK for a GET on /uri. http://example.org/uri-documentation would return a description of the JPEG in some machine-readable format. 
>> Or more accurately, the server MAY return the JPEG with a 200 OK for a GET on /uri, or it may return the same result as a successful GET on /uri-documentation (a description of the /uri in some machine readable format).
> 
> Sure, it could. For example, if /uri supported content negotiation and the client requested text/turtle, it could return some text/turtle that described the JPEG image of the monkey.
> 
>> Is this limited to machine readable format, why not human readable too?
>>
>> It appears that if one can return text/turtle for a GET request on </foo>, where { </foo> a :Horse } then one should also be able to return an image/jpeg which visually describes the horse.
> 
> Yes, under the proposal that would be OK, so long as the picture was information that reflected the state of the resource, per HTTPbis.
> 
>>>> Question B:
>>>>
>>>> How would conneg work, and what would the presence of a Content-Location response header mean? Would HTTPBis need to be updated?
>>> I can't see any way in which any of that would work differently from currently.
>> Okay, given the use-case of a GET on </uri> returning 200 OK, and the response containing a representation of </uri-documentation> in text/turtle:
>>
>> What would the value of the Content-Location header be? /uri-documentation?
> 
> If the response to /uri was the same representation as /uri-documentation, then yes, the Content-Location header could point to /uri-documentation. That seems consistent with the definition of Content-Location at [4].
> 
>> short version: this proposal would mean many sections of httpbis would need to be reworded and changed, as it conflicts to the point of saying the opposite.
> 
> What parts do you think would need to be reworded?
> 
>>>> Question C:
>>>>
>>>> Currently 303 "indicates that the requested resource does not have a representation of its own that can be transferred by the server over HTTP", and the Link header makes it clear that you are dealing with two different things (/uri and /uri-documentation), but where does this proposal make it clear at transfer protocol level that the representation included in the http response is a representation of another resource which describes the requested resource (rather than it being as the spec defines "a representation of the target resource")?
>>> The proposal says that applications can draw no conclusions from information at the transfer protocol level about /uri. In particular, it can't tell whether the representation that is returned with /uri is *the content* of /uri or *the description* of /uri. Further information about /uri (eg that it is a foaf:Person) may help the application work out that the representation was *a description*.
>> Wow, so every URI no longer refers to anything unless it's explicitly stated in some RDF somewhere, and if one looks up <b> in a browser and sees a picture of a monkey, they are incorrect for saying it refers to a picture of a monkey if some RDF document somewhere describes <b> as a :SpaceShuttle.
>>
>> Can the TAG really just say "okay, all http:// URIs no longer refer to anything"?
> 
> This proposal does not change the fact that URIs identify resources, nor that the representation of a resource is "information that reflects the state of that resource".
> 
> What you can't tell just by looking at <b> in a browser and getting a picture of a monkey is whether the resource for which that picture is a representation is a real life monkey or the picture itself. Conversely, httpRange-14 enabled you to say that <b> definitely identified the picture itself, or rather something whose content, as an image/jpeg, was that picture.
> 
>>> However, an application can draw conclusions about /uri-documentation, assuming it gives a 2XX response, because it has been retrieved as the result of following a :describedby link (or if it were the target of a 303 redirection). The application can tell that the representation from /uri-documentation is *the content* of /uri-documentation and *the description* of /uri.
>> I can't see how it could tell that "the representation from /uri-documentation is *the content* of /uri-documentation and *the description* of /uri". Perhaps that it's *a* description of /uri, but certainly not that it's "the content of /uri-documentation", the proposal itself removes all notion of a representation being a representation of the current state of the requested uri.
> 
> If anything that I have said or the wording of the proposal makes you think that the proposal removes all notion of a representation being information that reflects the past state of the resource identified by the requested URI, I apologise. That is not the intention at all. It would be very helpful if you could point out what part of the proposal wording makes you think that so that I can correct it.
> 
>> if <a> is described by <b>, and <b> is described by <c>, then a GET on <a> can now return <b>, whilst a get on <b> can return <c>, and so forth, and if that :describedby triple is missing, or you don't get back RDF in some form, then you don't know what you retrieved or if the requested uri refers to it at all.
> 
> A GET on <a> results in a representation of <a>, which may be the content of <a> or a description of <a>. That representation may further say that <a> is described by <b>; if it does so, the representation of <b> must be the content of <b> (and must include within it a description of <a>).
> 
> You always know that the representation that you get from <a>, <b> and <c> is the representation of that resource.
> 
>>>>> Either way, there is no implication that what you've got from http://example.org/uri is the content of http://example.org/uri (or that http://example.org/uri identifies an information resource), but there is an implication that what you get from http://example.org/uri-documentation is the content of http://example.org/uri-documentation (and that http://example.org/uri-documentation is an information resource).
>>>> Sorry I don't follow, how is there an implication from a 200 OK for <uri-a> that it's not an IR and for <uri-b> that it is an IR?
>>> Because /uri-documentation was reached through a :describedby link. This extra information allows the application to draw the conclusion that the representation from /uri-documentation is *the content* of /uri-documentation.
>> and when you don't reach it via a ":describedby" link (as in 99.99% of cases on the web)? also see above, same points.
> 
> Yes, in most cases on the web today, under the proposal we cannot tell whether the representation you get is a description of the resource or the content of the resource. Different people have different assumptions about whether the representations on the web today are, in the main, the content of the resources identified by the URIs that are used or descriptions of the resources identified by the URIs that are used.
> 
> For example, does http://www.whitehouse.gov/ identify the real-world resource "The Whitehouse" or the abstract resource that is defined as the set of all possible representations retrievable from that URI? Looking from the outside, it is impossible to tell what the owners of that URI intend it to identify; all we can really tell is that the representations we get reflect the state of whatever that resource is.
> 
>> Apologies, but I have to disagree completely here. I can say I'm a goldfish, but I have the properties of a human and belong in the Set of Humans; no matter how much I say it, I'm never going to be a goldfish - there's no design choice there. Similarly, if a representation of something was retrieved via HTTP, then that something belongs to the set of things which can have their representations retrieved via HTTP; that just is a fact, not a design decision.
>>
>> Sorry this appears so negative, but... well the above hopefully explains, personally I see it as ripping the foundational constraints of the web/uris/http away to try and save an extra GET request in a few cases.
> 
> 
> I hope I've explained why I really don't think it does anything of the sort.
> 
> Cheers,
> 
> Jeni
> 
> [1] http://tools.ietf.org/html/rfc3986
> [2] http://tools.ietf.org/html/draft-ietf-httpbis-p3-payload-19#section-4
> [3] http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
> [4] http://tools.ietf.org/html/draft-ietf-httpbis-p3-payload-19#section-6.7
Received on Thursday, 29 March 2012 11:49:36 UTC