Re: Naive question on redirection. from Alan Ruttenberg on 2008-06-07 (public-awwsw@w3.org from June 2008)

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Fri, 6 Jun 2008 20:38:27 -0400
To: Jon Phipps <jphipps@madcreek.com>
Cc: SWIG Web <semantic-web@w3.org>, Lee Feigenbaum <lee@thefigtrees.net>, Phil Archer <parcher@icra.org>, Rees Jonathan <jar@creativecommons.org>, Tim Berners-Lee <timbl@w3.org>, Leo Sauermann <leo.sauermann@dfki.de>, public-awwsw@w3.org
Message-Id: <7BD4388D-E674-4EF8-9D23-44955B30D846@gmail.com>
On Jun 6, 2008, at 6:25 PM, Jon Phipps wrote:

> I agree with Alan that the way the Cool URIs doc is worded doesn't  
> exactly clarify this situation.
>
> And to answer Phil's original question...
> "Do I know what colour http://www.example.org/home.asp is?"
>
> Short answer --
> No. 'http://www.example.org/home.asp' is at that point an  
> information resource, a document, that was retrieved instead of the  
> requested document. It's important to make careful distinctions  
> between URIs as UR-Identity and URIs treated as UR-Location when  
> dereferenced.
>
> Way longer answer --
> A server that responds to a URL with a 302 is providing an  
> ambiguous response and is saying simply (anthropomorphizing a  
> server) "I found the document you requested somewhere other than  
> the location you requested." Full stop.

Hi Jon,

Glad to see you jump into the fray. Let me continue to disagree ;-)  
Reminder again of the definition of 302:

> 302 Found : The requested resource resides temporarily under a  
> different URI.


So, as a matter of fact, it says: "The requested resource resides  
temporarily under a different URI" not "I found the document you  
requested somewhere other than the location you requested."

The *requested resource* *resides*.  We'll have to look at these  
words and apply some interpretation to them, but I'm having a hard  
time seeing how any interpretation supports the one you make.

> That should have no effect on what you can or cannot infer about  
> the URI that was dereferenced. It's only talking about the document  
> you requested from a URL.

No, we don't yet know what the thing in question is. We've made  
a request. Looking at httprange-14, we see that there are two  
possibilities - either the resource is an "information resource":

> The distinguishing characteristic of these resources is that all of  
> their essential characteristics can be conveyed in a message. We  
> identify this set as “information resources.”
> This document is an example of an information resource. [AWWW]
>
or it isn't. If, at some point, we get back a 200 response (unless  
first getting a 303), then we know (according to the architecture)  
that we have named an information resource. That resource might be a  
document, or it might be some other sort of information resource.  
(Note that you have equated IR and document in your wording, but by  
saying "this document is an *example* of an IR" AWWW implies that  
there are IRs that are not documents)
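
A minimal sketch of that rule, assuming the reading given above: a 200
names an information resource unless a 303 came first, and nothing else
settles the question. The function name and the three labels are
hypothetical, chosen only for illustration.

```python
from typing import Sequence


def what_we_know(statuses: Sequence[int]) -> str:
    """Classify the originally requested resource from the status codes
    seen so far, in the order they were received."""
    for status in statuses:
        if status == 303:
            # httpRange-14: after a 303 the resource could be any resource.
            return "any resource"
        if status == 200:
            # A 200 with no earlier 303: we have named an information resource.
            return "information resource"
        # Any other code (301, 302, 307, 4xx, ...) settles nothing here.
    return "unknown"


print(what_we_know([302, 200]))  # information resource
print(what_we_know([303, 200]))  # any resource
print(what_we_know([302]))       # unknown
```

On this reading a bare 302 leaves the answer at "unknown"; only the
later 200 decides it.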

Before we get that 200, assuming that we are ignorant of the identity  
and any other knowledge of the resource, we just don't know what it  
is. It might be a person. Supposing it is, we've speculatively asked  
the web to GET that person.

The *requested resource* is that person. *requested* = GET,  
*resource* = person.

You've asserted that, having received a 302,
"'http://www.example.org/home.asp' is at that point an information resource".

I would like to know what, in the statement "302 Found : The  
requested resource resides temporarily under a different URI.", has  
transmuted your GET of a person into a GET of an information resource?

Now, admittedly, "residing under a URI" isn't the sort of thing that  
people do. That wording could use a little tuning. But the key thing  
in the response is "The requested resource". We've only mentioned one  
resource so far: http://example.org/  (something that can have a  
color, btw, which IRs cannot). That's the requested resource.

Prior to httprange-14, we could read the 303 documentation as  
something like what you now interpret 302 as:

"303 See Other: The response to the request can be found under a  
different URI"

Here we are not saying anything about resources - rather we are  
talking about responses to requests - things that you (and the spec)  
describe as content. The English interpretation of this would be,  
anthropomorphizing the server, "this other resource I am naming will,  
if you GET it, respond in the same way as I would have responded  
if I had responded directly to your request."

That response would be a representation (what a GET returns), which,  
although it is not a document in the IR sense, is easily, and  
reasonably, confused to be one.

Post httprange-14, we have "If an "http" resource responds to a GET  
request with a 303 (See Other) response, then the resource identified  
by that URI could be any resource". This interferes with the previous  
interpretation. Why? Suppose the resource is not an IR. Then  
returning a 200 is not something the redirecting server would have  
responded. Yet the new request made in response to a 303 might return  
a 200, and that 200 is not to be construed as implying that the  
original resource was an information resource, since according to  
httprange-14 it could be any resource, and some resources are not IRs.

> The 303 and 307 status codes were added to clear up _some_ of this  
> ambiguity. [1]
>
> As Tim Berners-Lee pointed out [2], a 303 response, "... separates  
> a thing from a document about a thing". In "Best Practice Recipes  
> for Publishing RDF Vocabularies" [3], we recommend responding with  
> a 303 rather than a 302 when providing a content negotiated  
> response for just this reason -- a 303 response is explicit about  
> the fact that you're getting back a different document than the one  
> requested (if a 'document' even exists at that location).

It is far from explicit. All we have is "If an "http" resource  
responds to a GET request with a 303 (See Other) response, then the  
resource identified by that URI could be any resource;"

That's it. It would be good if we had something explicit that said  
that the new URI couldn't possibly name the same resource as original  
one we asked to GET. As it stands we need to infer that it might name  
a different resource because otherwise we would have the  
contradiction I mention above - the original resource is a person,  
but after a 303 the next server replies with a 200.

> Again the actual URL of the served document implies no relationship  
> to the original URI except to say "Here's another document that  
> I've been instructed to return when that URL (with that Accept  
> header) is requested, do with it what you will." Not nearly as  
> ambiguous a response as a 302, although it's still not officially  
> quite as explicit a relationship to the original requested document  
> as Tim's 'about'.
>
> The 307 response is also explicit about the fact that even though  
> the server was instructed to return a document from a different  
> location than you requested, it's still the same document you asked  
> for, this location is just temporary, and you should keep asking  
> for the document at the original location in the future.
>
> When we're talking about content negotiation we're talking about  
> _content_ rather than identity, so I don't think it's reasonable to  
> make inferences about relationships between URIs based on a server  
> response to a URL.

Content negotiation refers to the process by which the server  
attempts to return to you a representation of the sort you would  
prefer. Its scope is the request. It is satisfied by the  
representation that the server sends back. If no representation is  
sent back and you get a redirect, you make *another* request, in  
which you may choose to content negotiate again. So, I would offer,  
the fact we are doing content negotiation tells us *nothing* in the  
case of a redirect. All we can go on, is: "302 Found : The requested  
resource resides temporarily under a different URI"
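
A minimal sketch of content negotiation scoped to a single request, as
described above: the server picks a representation from the Accept
header and answers directly, with no redirect. The two variant URLs are
the ones from Lee's example; their pairing with media types, the
VARIANTS table, and the function names are assumptions made for
illustration only.

```python
from typing import Dict, List, Tuple

# Assumed mapping from media type to the variant's own URL.
VARIANTS: Dict[str, str] = {
    "text/n3": "http://thefigtrees.net/id.n3",
    "application/rdf+xml": "http://thefigtrees.net/id.rdf",
}


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Return (media type, q) pairs, highest q first; ties keep header order."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((fields[0], q))
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def negotiate(accept: str) -> Tuple[int, Dict[str, str]]:
    """Answer the original request: 200 plus headers, or 406 if nothing fits."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_type == "*/*":
            media_type = next(iter(VARIANTS))  # any variant will do
        if media_type in VARIANTS:
            return 200, {
                "Content-Type": media_type,
                "Content-Location": VARIANTS[media_type],
            }
    return 406, {}


print(negotiate("application/rdf+xml;q=0.9, text/n3"))
print(negotiate("text/html"))
```

The negotiation ends with the response to that one request. If the
server redirects instead, the client negotiates afresh on the next
request, which is why the redirect alone tells us nothing more.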

Now, let me make it clear: I find that the current specifications  
lead to all sorts of non-intuitive, and sometimes even contradictory  
interpretations. I don't fault you for making the interpretation you  
do. You describe perfectly reasonable behavior.

It's just that the specifications don't seem to support your quite  
reasonable interpretation. What I would consider damaging would be  
that we all ignore (in slightly different ways, mind you) what the  
specs say and just "do the right thing".

What should we do? Fix the specs.

-Alan



> Well, with the possible exception of a 301 status code which  
> basically says "I have been instructed to inform you that the  
> document you requested no longer exists at the requested location  
> and will _never, ever_ be at that location again. It's now at an  
> entirely different location." This certainly _implies_ that the URI  
> that's being dereferenced may no longer be the correct identity of  
> that resource, but I don't think you can even infer that.
>
> Just my $.02. Hmmm, more like $1.95.
>
> Jon Phipps
> Co-editor "Best Practice Recipes for Publishing RDF Vocabularies"
>
> [1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.3
> [2] http://lists.w3.org/Archives/Public/public-awwsw/2008Feb/0027.html
> [3] http://www.w3.org/TR/swbp-vocab-pub/
>
> On Jun 6, 2008, at 4:09 PM, Alan Ruttenberg wrote:
>
>>
>>
>> On Jun 6, 2008, at 3:11 PM, Lee Feigenbaum wrote:
>>
>>> I should have prefaced my comments by saying that I've never dug  
>>> deeply into this, and have just tried to learn from what I've  
>>> heard and seen from others. So I'm speaking from a position of  
>>> inexperience.
>>>
>>> Alan Ruttenberg wrote:
>>>
>>>>> <http://thefigtrees.net/id> a foaf:PersonalProfileDocument .
>>>>>
>>>>> 302's based on Accept: headers to either
>>>>>
>>>>> http://thefigtrees.net/id.n3
>>>>> http://thefigtrees.net/id.rdf
>>>> That's not how CN is supposed to work. You respond to the  
>>>> request with the representation, not with a redirection. The  
>>>> Location header is where the resource is. 302 is different.
>>>
>>> I got my example from the recent SWEO publication, "Cool URIs for  
>>> the Semantic Web". Please see:
>>>
>>> http://www.w3.org/TR/cooluris/#conneg
>>>
>>> is that example incorrect?
>>
>> This one, I presume.
>>> Content negotiation is often implemented with a twist: Instead of  
>>> a direct answer, the server redirects to another URL where the  
>>> appropriate representation is found:
>>>
>>> HTTP/1.1 302 Found
>>> Location: http://www.example.com/people/alice.en.html
>>> The redirect is indicated by a special Status Code, here 302  
>>> Found. The client would now send another HTTP request to the new  
>>> URL. By having separate URLs for different representations, this  
>>> approach allows Web authors to link directly to a specific  
>>> representation.
>>>
>>>
>>
>> It looks incorrect to me. But what do I know? I just read the specs.
>>
>> Also incorrect: One doesn't link to (awww) representations, one  
>> links to resources.
>>
>> They also say:
>>
>>> Note that the URI of this representation is passed back in the  
>>> Content-Location header, this is not required but a recommended  
>>> good practice
>>>
>>>
>> I'd say in your case it would be required ;-)
>>
>> -Alan
>>
>>
>> (You are in a maze of twisty passages, all alike.)
>>
>>>
>>> Lee
>>>
>>>> 302 Found
>>>> The requested resource resides temporarily under a different  
>>>> URI. Since the redirection might be altered on occasion, the  
>>>> client SHOULD continue to use the Request-URI for future requests.
>>>> Not: you can find a different resource - a fixed resource,  
>>>> which happens to have an awww:representation that is the same as  
>>>> the one redirected from.
>>>>> What if I wanted to include an ex:mimeType triple about the  
>>>>> latter ones?
>>>> Go ahead. However I don't think 302 is appropriate in that case.  
>>>> Respond with the representation to the original request, and put  
>>>> these URLs in the Location: header. Then there is no encumbrance.
>>>>>
>>>>> <http://thefigtrees.net/id.rdf> a  
>>>>> foaf:PersonalProfileDocument ; ex:mimeType "application/rdf+xml" .
>>>>>
>>>>> Or are you suggesting that this is some strange one-way  
>>>>> equivalence? (If X -- 302 --> Y and X p q then Y p q?)
>>>> I'm not trying to suggest anything. I'm trying to answer  
>>>> according to what the specs say. I'd be happy to be shown to be  
>>>> wrong, either because the specs don't mean what I think they do,  
>>>> or because there is contradictory information somewhere else, or  
>>>> with an assertion that the specs need to be fixed.
>>>> -Alan
>>>> (don't you just love those "naive" questions?)
>>>>> Lee
>>>>>
>>>>>> On Jun 6, 2008, at 4:57 AM, Phil Archer wrote:
>>>>>>>
>>>>>>> Suppose I have this triple
>>>>>>>
>>>>>>> <http://example.org/> ex:colour "red"
>>>>>>>
>>>>>>> and when I dereference the URI I get a 302 redirect to  
>>>>>>> http://www.example.org/home.asp.
>>>>>>>
>>>>>>> Do I know what colour http://www.example.org/home.asp is?
>>>>>>>
>>>>>>> I'm pretty sure the answer's no, but has anyone else grappled  
>>>>>>> with the joys of redirects in this way?
>>>>>>>
>>>>>>> Phil.
>>>>>>>
>>>>>>> -- 
>>>>>>> Phil Archer
>>>>>>> Chief Technical Officer,
>>>>>>> Family Online Safety Institute
>>>>>>> w. http://www.fosi.org/people/philarcher/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>
>>
>
Received on Saturday, 7 June 2008 00:39:08 UTC