Re: httpRange-14 Change Proposal from Jeni Tennison on 2012-03-29 (www-tag@w3.org from March 2012)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Thu, 29 Mar 2012 09:35:47 +0100
To: nathan@webr3.org
Cc: "www-tag@w3.org List" <www-tag@w3.org>
Message-Id: <207C5E9A-7ABE-4D64-B835-3A4751BE58F9@jenitennison.com>
Nathan,

The definition of "resource", such as it is, is in RFC3986 [1], which says:

  This specification does not limit the scope of what might be a
  resource; rather, the term "resource" is used in a general sense
  for whatever might be identified by a URI.

HTTPbis defines "representation" in [2]:

  A "representation" is information in a format that can be readily
  communicated from one party to another.  A resource representation is
  information that reflects the state of that resource, as observed at
  some point in the past (e.g., in a response to GET) or to be desired
  at some point in the future (e.g., in a PUT request).

This is a very loose definition. The first sentence tells us nothing about the relationship of the resource to the representation. The second, which says that the information "reflects the state of that resource", does not imply whether the information is the content of the resource or a description of the resource.

For example, I can't see anything in this that precludes the resource Jeni from having a representation that includes the information that Jeni is 165cm tall.

It is only the httpRange-14 decision [3] that states that a 2XX response to a GET on a URI implies that the resource identified by that URI is an information resource. Apparently it was also meant to imply that the representation was the content of that resource, though that wasn't explicitly stated.

On 28 Mar 2012, at 22:23, Nathan wrote:
> In a nutshell then, this proposal says that you can return a 200 OK for a GET request on any URI, but if you return "a representation of a description of the thing referred to by <uri>" rather than "a representation of the thing referred to by <uri>" then you should say it is so by including the special "<uri> :describedby <uri-documentation>" triple.

No. You can return a 200 OK for a GET request on any URI. The proposal doesn't provide any mechanism for explicitly saying that the representation that's returned is a description of the resource, but it does provide a mechanism for pointing to other URIs whose content includes descriptions of the resource. It also provides a mechanism for explicitly saying that the representation is the content of the resource, should you want to do that.

> Additionally, rather than special casing this so that this rule lets a publisher override the default 200 OK return a representation of a resource, the proposal also aims to change web arch and the HTTP specification such that a 200 OK in response to a GET no longer returns a representation of the requested URI, rather it just returns a representation which you must consult to find out what it is.
> 
> That's quite a large change to the web / web arch / http.

No. A 200 OK response to a GET still returns a representation of the resource. The only change is to the httpRange-14 decision which layered on the implication that the representation returned in this case was the content of the resource.

> Sorry no, not *always* just *always could* or *always can*. As in, it would be universally true that for any successful GET request you would receive a representation, and that representation may be a representation of the <target-uri>, or it may be a representation of <some-other-uri> which describes the target-uri.

No. The representation must still be "information that reflects the state of the target resource", per HTTPbis. But unlike httpRange-14, it goes no further than HTTPbis: in particular, under the proposal there is nothing that says that the representation must be the content of the resource.

>>> Question A:
>>> 
>>> Currently we have:
>>> <http://example.org/uri> - a JPEG image of a monkey.
>>> 
>>> When you issue a GET on that URI the server currently responds
>>> 200 OK
>>> Content-Type: image/jpeg
>>> Link: <http://example.org/uri-documentation>; rel="describedby"
>>> 
>>> So under this new proposal, the server can return the contents of /uri-documentation with a status of 200 OK for a GET on /uri?
>> Under the proposal, the server would return the JPEG with a 200 OK for a GET on /uri. http://example.org/uri-documentation would return a description of the JPEG in some machine-readable format. 
> 
> Or more accurately, the server MAY return the JPEG with a 200 OK for a GET on /uri, or it may return the same result as a successful GET on /uri-documentation (a description of the /uri in some machine readable format).

Sure, it could. For example, if /uri supported content negotiation and the client requested text/turtle, it could return some text/turtle that described the JPEG image of the monkey.
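
To make that concrete, here is a minimal sketch of how such a response might be chosen. Everything in it is an assumption for illustration: the function name respond_to_get, the placeholder JPEG bytes, the Turtle triple (typing /uri as a foaf:Image) and the deliberately naive Accept parsing that ignores q-values. It is not something the proposal itself defines.

```python
# Hypothetical in-memory stand-in for a server handling GET /uri.
JPEG_BYTES = b"\xff\xd8"  # placeholder bytes, not a real image
TURTLE = "<http://example.org/uri> a <http://xmlns.com/foaf/0.1/Image> .\n"
LINK = '<http://example.org/uri-documentation>; rel="describedby"'


def respond_to_get(accept: str):
    """Return (status, headers, body) for GET /uri given an Accept header value."""
    # Naive negotiation: split on commas and drop parameters such as q-values.
    wanted = [part.split(";")[0].strip() for part in accept.split(",")]
    headers = {"Link": LINK, "Vary": "Accept"}
    if "text/turtle" in wanted:
        # Turtle that describes the JPEG image of the monkey.
        headers["Content-Type"] = "text/turtle"
        return 200, headers, TURTLE.encode("utf-8")
    # Default: the JPEG itself.
    headers["Content-Type"] = "image/jpeg"
    return 200, headers, JPEG_BYTES
```

Both branches answer 200 OK for the same URI; only the negotiated media type differs.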

> Is this limited to machine readable format, why not human readable too?
> 
> It appears that if one can return text/turtle for a GET request on </foo>, where { </foo> a :Horse } then one should also be able to return an image/jpeg which visually describes the horse.

Yes, under the proposal that would be OK, so long as the picture was information that reflected the state of the resource, per HTTPbis.

>>> Question B:
>>> 
>>> How would conneg work, and what would the presence of a Content-Location response header mean? Would HTTPBis need to be updated?
>> I can't see any way in which any of that would work differently from currently.
> 
> Okay, given the use-case of a GET on </uri> returning 200 OK, and the response containing a representation of </uri-documentation> in text/turtle:
> 
> What would the value of the Content-Location header be? /uri-documentation?

If the response to /uri was the same representation as /uri-documentation, then yes, the Content-Location header could point to /uri-documentation. That seems consistent with the definition of Content-Location at [4].
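
As a rough sketch of that case, the headers on the response to /uri might be built like this. The helper name headers_for_uri and its boolean parameter are hypothetical, and the sketch assumes the Turtle served from /uri really is the same representation as the one served from /uri-documentation.

```python
def headers_for_uri(same_as_documentation: bool) -> dict:
    """Build response headers for a 200 OK text/turtle response to GET /uri."""
    headers = {"Content-Type": "text/turtle"}
    if same_as_documentation:
        # Assumption: the payload is also what /uri-documentation would return.
        headers["Content-Location"] = "/uri-documentation"
    return headers
```

When the two representations differ, the sketch simply leaves the Content-Location header out.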

> short version: this proposal would mean many sections of httpbis would need to be reworded and changed, as it conflicts to the point of saying the opposite.

What parts do you think would need to be reworded?

>>> Question C:
>>> 
>>> Currently 303 "indicates that the requested resource does not have a representation of its own that can be transferred by the server over HTTP", and the Link header makes it clear that you are dealing with two different things (/uri and /uri-documentation), but where does this proposal make it clear at transfer protocol level that the representation included in the http response is a representation of another resource which describes the requested resource (rather than it being as the spec defines "a representation of the target resource")?
>> The proposal says that applications can draw no conclusions from information at the transfer protocol level about /uri. In particular, it can't tell whether the representation that is returned with /uri is *the content* of /uri or *the description* of /uri. Further information about /uri (eg that it is a foaf:Person) may help the application work out that the representation was *a description*.
> 
> Wow, so every URI no longer refers to anything unless it's explicitly stated in some RDF somewhere, and if one looks up <b> in a browser and sees a picture of a monkey, they are incorrect for saying it refers to a picture of a monkey if some RDF document somewhere describes <b> as a :SpaceShuttle.
> 
> Can the TAG really just say "okay, all http:// URIs no longer refer to anything"?

This proposal does not change the fact that URIs identify resources, nor that the representation of a resource is "information that reflects the state of that resource".

What you can't tell just by looking at <b> in a browser and getting a picture of a monkey is whether the resource for which that picture is a representation is a real life monkey or the picture itself. By contrast, httpRange-14 enabled you to say that <b> definitely identified the picture itself, or rather something whose content, as an image/jpeg, was that picture.

>> However, an application can draw conclusions about /uri-documentation, assuming it gives a 2XX response, because it has been retrieved as the result of following a :describedby link (or if it were the target of a 303 redirection). The application can tell that the representation from /uri-documentation is *the content* of /uri-documentation and *the description* of /uri.
> 
> I can't see how it could tell that "the representation from /uri-documentation is *the content* of /uri-documentation and *the description* of /uri". Perhaps that it's *a* description of /uri, but certainly not that it's "the content of /uri-documentation", the proposal itself removes all notion of a representation being a representation of the current state of the requested uri.

If anything that I have said or the wording of the proposal makes you think that the proposal removes all notion of a representation being information that reflects the past state of the resource identified by the requested URI, I apologise. That is not the intention at all. It would be very helpful if you could point out what part of the proposal wording makes you think that so that I can correct it.

> if <a> is described by <b>, and <b> is described by <c>, then a GET on <a> can now return <b>, whilst a get on <b> can return <c>, and so forth, and if that :describedby triple is missing, or you don't get back RDF in some form, then you don't know what you retrieved or if the requested uri refers to it at all.

A GET on <a> results in a representation of <a>, which may be the content of <a> or a description of <a>. That representation may further say that <a> is described by <b>; if it does so, the representation of <b> must be the content of <b> (and must include within it a description of <a>).

You always know that the representation that you get from <a>, <b> and <c> is the representation of that resource.
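
The conclusions an application could draw under the proposal can be sketched as a small decision function. The names Reading and reading_of_2xx, and the labels "direct", "describedby" and "303", are hypothetical and only summarise the cases discussed in this thread.

```python
from enum import Enum


class Reading(Enum):
    CONTENT = "the representation is the content of the resource"
    UNKNOWN = "content or description; HTTP alone cannot tell"


def reading_of_2xx(reached_via: str) -> Reading:
    """How a client might read a 2XX representation under the proposal.

    reached_via is "direct" for a plain GET, "describedby" when the URI was
    the target of a :describedby link, or "303" when it was the target of a
    303 redirection.
    """
    if reached_via in ("describedby", "303"):
        # The extra information lets the client conclude it has the content.
        return Reading.CONTENT
    # Still a representation of the resource, but of which kind is not known.
    return Reading.UNKNOWN
```

In every case the result is still a representation of the requested resource; the function only captures whether the client can additionally conclude that it is the content.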

>>>> Either way, there is no implication that what you've got from http://example.org/uri is the content of http://example.org/uri (or that http://example.org/uri identifies an information resource), but there is an implication that what you get from http://example.org/uri-documentation is the content of http://example.org/uri-documentation (and that http://example.org/uri-documentation is an information resource).
>>> Sorry I don't follow, how is there an implication from a 200 OK for <uri-a> that it's not an IR and for <uri-b> that it is an IR?
>> Because /uri-documentation was reached through a :describedby link. This extra information allows the application to draw the conclusion that the representation from /uri-documentation is *the content* of /uri-documentation.
> 
> and when you don't reach it via a ":describedby" link (as in 99.99% of cases on the web)? also see above, same points.

Yes, in most cases on the web today, under the proposal we cannot tell whether the representation we get is a description of the resource or the content of the resource. Different people make different assumptions about whether the representations on the web today are, in the main, the content of the resources identified by the URIs that are used or descriptions of those resources.

For example, does http://www.whitehouse.gov/ identify the real-world resource "The White House" or the abstract resource that is defined as the set of all possible representations retrievable from that URI? Looking from the outside, it is impossible to tell what the owners of that URI intend it to identify; all we can really tell is that the representations we get reflect the state of whatever that resource is.

> Apologies but I have to disagree completely here, I can say I'm a goldfish but I have the properties of a human and belong in the Set of Humans, no matter how much I say, I'm never going to be a goldfish - there's no design choice there, similarly if a representation of something was retrieved via HTTP, then it belongs to the set of things which can have their representations retrieved via HTTP, that just is a fact, not a design decision.
> 
> Sorry this appears so negative, but... well the above hopefully explains, personally I see it as ripping the foundational constraints of the web/uris/http away to try and save an extra GET request in a few cases.


I hope I've explained why I really don't think it does anything of the sort.

Cheers,

Jeni

[1] http://tools.ietf.org/html/rfc3986
[2] http://tools.ietf.org/html/draft-ietf-httpbis-p3-payload-19#section-4
[3] http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
[4] http://tools.ietf.org/html/draft-ietf-httpbis-p3-payload-19#section-6.7
-- 
Jeni Tennison
http://www.jenitennison.com
Received on Thursday, 29 March 2012 08:36:53 UTC