- From: Eran Hammer-Lahav <eran@hueniverse.com>
- Date: Fri, 6 Feb 2009 17:03:01 -0700
- To: "www-talk@w3.org" <www-talk@w3.org>
- CC: Mark Nottingham <mnot@mnot.net>, Jonathan Rees <jar@creativecommons.org>, "Roy T. Fielding" <fielding@gbiv.com>
In HTTP-based Resource Descriptor Discovery [1], I am trying to define a uniform way to attach metadata (descriptors) to resources. The idea is to define three methods for obtaining the location (URI) of the descriptor document via the resource (URI or representation). All three methods use the 'describedby' relation type.

1. <LINK> elements in HTML, XHTML, and Atom documents.
2. Link: headers in HTTP responses.
3. /site-meta documents [2], using a Link-Template (transforming the resource URI to the descriptor URI using a URI template).

A descriptor contains information about a resource, but it is hard to define this association in practical terms (terms that can translate directly to code). Instead, the proposal defines the descriptor as 'information about a resource identified by a URI'.

In the current draft I tried to use the HTTP status codes (obtained with the first two methods, <LINK> and Link:), by instructing the client to follow redirects and only use links from a small subset of status codes (200, 303, 401). This approach proved broken for two reasons:

1. It is up to the application to decide how redirects should be followed. If a URI (when dereferenced using an HTTP GET) returns a 307, any links associated with that response may contain valid metadata that is not the same as the metadata describing the URI the user-agent is being redirected to (which in this example returns a 200).

2. It makes information obtained from <LINK> and Link: inconsistent with that obtained from /site-meta. /site-meta has no way of following redirects (it is a static transformation template) and will always produce a URI identifying the location of the descriptor associated with the 307 response, not the follow-up 200.

To address that, I started taking a different approach with my upcoming revision (-02) that basically tries to ignore HTTP status codes. It moves the focus away from the 'resource' to the URI.
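A minimal sketch of why the /site-meta method cannot observe redirects or status codes: the descriptor URI is computed from the resource URI by string substitution alone, without ever requesting the resource. The template string, the `{uri}` variable name, and the function name `expand_link_template` are assumptions made for illustration; they are not taken from either draft.

```python
from urllib.parse import quote


def expand_link_template(template: str, resource_uri: str) -> str:
    """Return the descriptor URI for resource_uri.

    Hypothetical helper: replaces an assumed "{uri}" variable with the
    percent-encoded resource URI. No HTTP request is made, so the result
    is the same whether the resource would answer 200, 307, or 404.
    """
    return template.replace("{uri}", quote(resource_uri, safe=""))


# Assumed template, standing in for a Link-Template found in /site-meta.
TEMPLATE = "http://example.com/meta?uri={uri}"

descriptor_uri = expand_link_template(TEMPLATE, "http://example.com/resource/1")
print(descriptor_uri)
```

Because the mapping depends only on the URI string, a client using this method gets a descriptor location for the URI it started with, never for the target of a redirect.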
But Roy's recent comment shows that this approach (ignoring HTTP status codes) is incomplete as well.

On 2/6/09 11:03 AM, "Roy T. Fielding" <fielding@gbiv.com> wrote:

> There are many resources involved in HTTP,
> only one of which is identified by the requested URI. Each of those
> resources may have representations, and the meaning of the payload in a
> response message is defined by the status code. A 404 response is going
> to contain a representation of a resource on the server that describes
> that error. A 200 response is going to contain a representation of the
> resource that was identified as the request target.

What this means is that a Link header in the HTTP response to a GET request might not be about the resource identified by the URI used to make that request. For example, if:

    GET /resource/1 HTTP/1.1
    Host: example.com

returns:

    HTTP/1.1 404 Not Found
    Link: <http://example.com/about>; rel="describedby"

The Link is about the "resource on the server that describes that error", and not about the resource identified by the URI (http://example.com/resource/1).

Because /site-meta does not provide access to the HTTP status code, if it returned http://example.com/about as the descriptor location of http://example.com/resource/1, it would be incorrect (due to lack of information about the 404 condition involved). In such a case, it is really the Link: header that is limited, because the representation of the resource isn't available (and therefore there is no place to put its links).

---

I am trying to find a way to keep the three methods in sync without further limiting the usefulness of this protocol. So far the only approach I have is to limit Link elements and headers (for use in this protocol) to HTTP responses with a status code that can only be interpreted as being about the request URI.
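A rough sketch of what such a restriction could look like in a client, assuming the tentative classification discussed in this message: 1xx, 202, 205, 4xx, and 5xx are rejected, and the two codes marked 'not sure' (303 and 406) are left behind a switch. The function names, the `accept_uncertain` flag, and the pre-parsed `(target, rel)` pairs are all hypothetical; Link header parsing is omitted.

```python
# Codes the message marks as unresolved ("not sure").
UNCERTAIN = {303, 406}
# 2xx codes the message excludes (request status / no representation).
EXCLUDED_2XX = {202, 205}


def status_binds_to_request_uri(status: int, accept_uncertain: bool = False) -> bool:
    """Tentative rule: is the response about the request URI's resource?"""
    if status in UNCERTAIN:
        return accept_uncertain
    if 200 <= status < 300:
        return status not in EXCLUDED_2XX
    if 300 <= status < 400:
        return True
    return False  # 1xx, remaining 4xx, 5xx


def describedby_links(status, links, accept_uncertain=False):
    """Return 'describedby' targets, or [] if the status is not usable.

    links is an already-parsed list of (target_uri, rel) pairs.
    """
    if not status_binds_to_request_uri(status, accept_uncertain):
        return []
    return [target for target, rel in links if rel == "describedby"]


links = [("http://example.com/about", "describedby")]
print(describedby_links(404, links))  # link belongs to the error resource
print(describedby_links(200, links))
```

With this rule the 404 example above yields no descriptor, which keeps the Link: header from asserting something about http://example.com/resource/1 that the response does not support.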
From a (very) quick review of the status codes, the following codes do not bind the response representation to the request URI:

* 1xx
* 202 - about the request's status; is this the same as the resource?
* 205 - does not seem to represent anything.
* 303 - not sure.
* 4xx, except maybe 406 - not sure, seems to be about the resource.
* 5xx

This seems to suggest most 2xx, most 3xx, and maybe 406, as the only valid status codes to be allowed when looking for a 'describedby' link.

If this approach is acceptable, should the spec explicitly define which status codes are valid? Or should it make do with a definition such as 'HTTP responses with a status code that is a representation of the request URI'? The second option is generally preferred, but at this point even the spec author (me) cannot fully determine how to implement it (as indicated by the 'not sure' entries above).

Comments?

EHL

[1] http://tools.ietf.org/html/draft-hammer-discovery-01
[2] http://www.ietf.org/internet-drafts/draft-nottingham-site-meta-00.txt
Received on Saturday, 7 February 2009 00:03:50 UTC