Re: Request for feedback: HTTP-based Resource Descriptor Discovery from Jonathan Rees on 2009-02-01 (www-talk@w3.org from January to February 2009)

From: Jonathan Rees <jar@creativecommons.org>
Date: Sat, 31 Jan 2009 23:55:16 -0500
To: Eran Hammer-Lahav <eran@hueniverse.com>
Cc: "www-tag@w3.org" <www-tag@w3.org>, Phil Archer <phil@philarcher.org>, Mark Nottingham <mnot@mnot.net>, "www-talk@w3.org" <www-talk@w3.org>
Message-Id: <1F946998-719B-4BFD-AED2-8981C6409CD3@creativecommons.org>

Let's work out this redirection case, since nothing else matters if we  
can't agree on this. I'll get back to your other questions later.

The problem with your treatment of redirects is that the protocol can  
give the wrong answer.

The situation is that we do a GET/HEAD of a URI U, and receive a  
301/302/307 specifying Location: V. Your protocol is supposed to get a  
description resource for the resource "identified" (RFC 3986) by U,  
yet you will throw away a DR in the response to GET/HEAD U (one that  
is explicitly said to be a DR of U) and look for one in the response  
to GET/HEAD V instead. What makes you think that V names the same  
resource as U? If it doesn't, V's DR has no bearing on the resource  
named by U. Even if you assume they do name the same resource (which  
you can't in the 307 case), why would you have any reason to prefer  
the V DR to the U DR? The ability to serve a resource's  
representations does not necessarily make you better qualified than  
anyone else to describe it.

You may want to say: Well, the U and V resources have the same  
representations (GET behavior), so doesn't that mean they're the same  
resource?  I don't think it follows. In particular there are other
methods to consider, such as POST. As far as I know, two resources can
answer every GET identically and still be different resources.

The only theory I know of for deciding which resource is supposed to  
be named by a URI is that articulated in the W3C web architecture  
recommendation [1]. This says that it is up to a party known as the  
URI's "owner" to bind the URI to some resource. So if you want to  
learn about a named resource, it is up to the URI owner to determine  
what resource it is you want to learn about. Why should you talk to  
anyone else, if the owner is willing to speak (via Link:)?

I think it is practical and reasonable that *if* U's owner provides no  
DR, then we can risk taking a 301 redirect (and maybe 302) to mean  
that V names the same resource, so that V's DR, if any, describes that  
resource. But an explicit Link: on a redirect has to mean that the URI  
owner, who is an "authority", is trying to say something important to  
you about the resource, such as the ways in which it differs from the  
redirect target.
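
To make the rule I am proposing concrete, here is a rough sketch in
Python. It is only an illustration, not text for the draft. The names
(Response, find_descriptor, fetch, follow_302, max_hops) are made up
for this example, the DR is assumed to have already been extracted
from the Link: header, and the hop limit is my own assumption rather
than anything we have discussed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Response:
    """Hypothetical, pre-parsed view of a GET/HEAD response."""
    status: int
    location: Optional[str] = None    # value of Location:, if any
    descriptor: Optional[str] = None  # DR URI taken from Link:, if any


def find_descriptor(
    uri: str,
    fetch: Callable[[str], Response],
    follow_302: bool = False,
    max_hops: int = 5,  # assumed safety limit, not part of the proposal
) -> Optional[str]:
    """Return a DR for `uri`, preferring what the URI owner says.

    A DR given in the response for a URI always wins. Only when the
    owner is silent is a 301 (and optionally a 302) taken to mean the
    target names the same resource. A 307 is never followed.
    """
    trusted = {301, 302} if follow_302 else {301}
    for _ in range(max_hops):
        resp = fetch(uri)
        if resp.descriptor is not None:
            return resp.descriptor  # the owner spoke; stop here
        if resp.status in trusted and resp.location:
            uri = resp.location     # owner silent; risk the redirect
            continue
        return None
    return None


# Example: U redirects to V, and both offer a DR. U's DR is used.
U, V = "http://example.org/U", "http://example.com/V"
responses = {
    U: Response(301, location=V, descriptor="http://example.org/U-dr"),
    V: Response(200, descriptor="http://example.com/V-dr"),
}
chosen = find_descriptor(U, responses.__getitem__)
```

Here `chosen` is U's DR, obtained without ever contacting V. If the
descriptor is dropped from U's response, the same call falls back to
V's DR instead.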

Even if U and V are assumed to name the same resource, or resources  
that cannot be distinguished, it is very easy to come up with cases  
where either DR is vastly preferable to the other; differences in  
credibility due to deception, reliability, competence, and timeliness  
can go either way. If you ask a librarian, they will say that the
original publisher (V) is rarely to be trusted to provide good
metadata, and that one should instead consult a competent metadata
service (U). (This is a real use case.)

There is a practical reason to prefer the U DR: it can be obtained in  
one roundtrip, while getting the V DR takes two.

I also wonder how the redirect case is any different from that of a  
proxy server that adds a Link: header. If you could detect that the  
proxy server added it, and not the origin (you can't), would you throw
away the proxy-specified DR, even when the origin provided none?

-Jonathan

[1] http://www.w3.org/TR/webarch/#uri-assignment

Received on Sunday, 1 February 2009 04:55:55 UTC