RE: httpRange-14: Use Case for RDF [302 versus 303 redirects] from Booth, David (HP Software - Boston) on 2005-12-19 (public-swbp-wg@w3.org from December 2005)

From: Booth, David (HP Software - Boston) <dbooth@hp.com>
Date: Mon, 19 Dec 2005 16:40:06 -0500
To: "David Wood" <dwood@softwarememetics.com>, <public-swbp-wg@w3.org>
Message-ID: <A5EEF5A4F0F0FD4DBA33093A0B07559008911DA2@tayexc18.americas.cpqcorp.net>
David Wood's use case (archived at
http://www.w3.org/2002/02/mid/6F9F6968-CB73-427A-8682-AF9AB0F2E9C2@softwarememetics.com;list=public-swbp-wg
) indirectly raises the question of whether a 302 response (such as
purl.org currently returns) could conform to the TAG's httpRange-14
decision in place of a 303 response.

Summary of my opinions: 
1. 302 redirects should not be used as an alternative to 303 redirects,
because their meaning is unclear.
2. purl.org should change to use 303 HTTP status code instead of 302
(though possibly also permit 302 if desired).
3. URIs using 303-redirects should probably avoid fragment identifiers
because it is not clear how the fragment identifier should be
interpreted.

Details and analysis below.

> -----Original Message-----
> From: David Wood
> Sent: Monday, September 05, 2005 4:54 PM
> 
> Hi all,
> 
> This message suggests a use case for SemWeb URIs to be resolved via  
> 303 responses.  
> . . .
> Example:
> 
> An RDF statement includes a predicate,
> http://purl.org/dc/elements/1.1/creator.  Resolution of that URI
> results in a 303 response.  The 303 response includes a URI,
> http://dublincore.org/2003/03/24/dces#creator.  A SemWeb application
> can determine that the predicate is known and that further information
> regarding it is available at the second (dublincore.org) URI.

I believe David Wood intended the above example as an illustration of
how it *should* operate, rather than how it actually operates, since
purl.org currently uses 302 redirects.  I think it would be reasonable
to ask OCLC to change purl.org to permit 303 responses[1], and
potentially also 302 responses.  

302 VERSUS 303
The TAG's httpRange-14 decision[2] is about what you can infer from the
HTTP status code and content-type that are returned when you do a GET on
a URI.  But what if that URI does a 302 redirect to a secondary URI?
Does the TAG decision apply only to the result of a GET on the original
URI?  Or does the decision apply instead to the result of a GET on the
secondary URI?  The decision does not mention this case, and I think
there are reasonable arguments for both interpretations.  There are two
basic cases to consider, depending on whether the original URI has a
fragment identifier.  

CASE 1: Original URI has no fragment identifier.

INTERPRETATION 1.a: A 302 response from the original URI itself
indicates that the original URI identifies an information resource,
regardless of what the secondary URI returns.  

INTERPRETATION 1.b: By itself, a 302 response from the original URI does
not indicate whether the original URI identifies an information
resource.  Instead, the response from the secondary URI should be
treated as though it were the response from the original URI, and this
response may or may not provide information about what the original URI
identifies.
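
To make the difference between 1.a and 1.b concrete, here is a minimal
sketch in Python.  It assumes the chain of HTTP status codes seen while
following the redirects is already known, and it maps final status
codes the way the httpRange-14 decision does (2xx: information
resource; 303: could be any resource; 4xx: unknown).  The function
names, the label strings, and the choice to treat every other status as
"unknown" are my own assumptions for illustration, not part of the TAG
decision.

```python
from typing import List

INFO = "information resource"
ANY = "could be any resource"
UNKNOWN = "unknown"


def classify(status: int) -> str:
    """Map one status code as the httpRange-14 decision does."""
    if 200 <= status < 300:
        return INFO
    if status == 303:
        return ANY
    # 4xx is "unknown" per the decision; other codes are treated the
    # same way here purely as an assumption.
    return UNKNOWN


def interpretation_1a(statuses: List[int]) -> str:
    """A 302 from the original URI settles the matter by itself."""
    first = statuses[0]
    return INFO if first == 302 else classify(first)


def interpretation_1b(statuses: List[int]) -> str:
    """A 302 is transparent: use the first non-302 response instead."""
    for status in statuses:
        if status != 302:
            return classify(status)
    return UNKNOWN  # the chain never left 302
```

For a chain such as [302, 303] the two interpretations disagree: 1.a
says the original URI identifies an information resource, while 1.b
says it could identify anything.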

CASE 2: Original URI has a fragment identifier.

INTERPRETATION 2.a: A 302 response from the original URI (minus the
fragment identifier) indicates that the original URI identifies an
information resource if the content-type (of this response) indicates
that the fragment identifier identifies an information resource,
regardless of what the secondary URI returns.  For example, if the
content-type of the initial 302 response is text/html, then the original
URI (with fragment identifier) identifies an information resource.

The problem I see with this interpretation is that the content-type of
the initial 302 response really pertains only to that 302 response
itself -- not to the secondary content that the requester presumably
wanted.  For example, if http://purl.org/foo#red is the original URI,
and http://purl.org/foo returns a 302-redirect with content-type
text/html to http://foo.example.org/colors , we would conclude that
http://purl.org/foo#red identifies an information resource even if
http://foo.example.org/colors returns RDF/XML saying that #red is a
color.

INTERPRETATION 2.b: By itself, a 302 response from the original URI does
not indicate whether the original URI identifies an information
resource.  Instead, the response from the secondary URI should be
treated as though it were the response from the original URI, and this
response may or may not provide information about what the original URI
identifies.  

The problem with this interpretation is what to do with the original
fragment identifier.  Should it be retained and applied to the secondary
URI, or should it be discarded?  What if the secondary URI also has a
fragment identifier?

David Wood and I tested a few browsers (IE6, Firefox, Safari, Amaya and
I don't recall what else) to see what they do with the fragment
identifier when a URI does a 303 redirect, and the result was that some
browsers apply the original fragment identifier to the secondary URI and
some discard it.  For example, if the original URI is
http://original.example.org#red but http://original.example.org
303-redirects to http://secondary.example.org then some browsers display
the results as http://secondary.example.org#red and some display the
results as http://secondary.example.org.
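
The two observed browser behaviors can be sketched as a single Python
function with a switch.  This is only an illustration using the
standard urllib.parse module; the function name, the keep_fragment
parameter, and the rule that a fragment already present on the
secondary URI takes precedence are assumptions of mine, not something
the tests established.

```python
from urllib.parse import urldefrag


def redirect_target(original: str, location: str, keep_fragment: bool) -> str:
    """Return the URI a user agent might display after a redirect.

    original      -- the URI the user asked for, possibly with a fragment
    location      -- the secondary URI given by the redirect
    keep_fragment -- True to carry the original fragment over,
                     False to discard it
    """
    _, original_fragment = urldefrag(original)
    target, location_fragment = urldefrag(location)

    # Assumption: if the secondary URI has its own fragment, keep it.
    if location_fragment:
        return f"{target}#{location_fragment}"
    if keep_fragment and original_fragment:
        return f"{target}#{original_fragment}"
    return target


print(redirect_target("http://original.example.org#red",
                      "http://secondary.example.org", keep_fragment=True))
print(redirect_target("http://original.example.org#red",
                      "http://secondary.example.org", keep_fragment=False))
```

The first call corresponds to the browsers that re-apply the fragment,
the second to those that drop it.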

Given this evidence of varying behavior in a parallel case (302
redirects instead of 303 redirects) it seems unwise to make assumptions
about what the behavior should be for 303 redirects in the absence of
clear guidance in the HTTP specification.

CONCLUSIONS
It is unclear what a 302 redirect should mean, in trying to determine
what a URI identifies.  Unless this is clarified by the TAG, I think it
would be best to stick with URI minting and redirecting mechanisms whose
meaning is clearer, specifically:

	- hash URIs that are not redirected (suitable for URIs that
	  identify information resources, and for URIs that identify
	  other things if the media type will always be RDF/XML or
	  similar); and

	- slash URIs that are redirected using HTTP 303 status code
	  (suitable for identifying anything).
  
For this reason, I think it would make sense to change purl.org to use
303 redirects instead of 302, or perhaps permit the response code to be
a configuration option for the URI sub-space owner.
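
As a rough illustration of what a per-URI configurable response code
could look like, here is a small redirect server written with Python's
standard http.server module.  It is not how purl.org is implemented;
the REDIRECTS table, the paths in it, and the make_server helper are
all hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical configuration: path -> (status code, secondary URI).
# The owner of a URI sub-space would choose 303 or 302 per entry.
REDIRECTS = {
    "/foo": (303, "http://foo.example.org/colors"),
    "/legacy": (302, "http://foo.example.org/legacy"),
}


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        entry = REDIRECTS.get(self.path)
        if entry is None:
            self.send_error(404)
            return
        status, target = entry
        # Send the configured redirect status with an empty body.
        self.send_response(status)
        self.send_header("Location", target)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        # Suppress the default per-request logging.
        pass


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), RedirectHandler)
```

Calling make_server().serve_forever() would start it; a GET on /foo
then answers with a 303 and a Location header, and /legacy with a 302.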

Finally, since it is not clear what the user agent treatment of fragment
identifiers should be in the presence of 303 redirects, it may be wise
to avoid them when minting URIs that intend to use 303 redirects.  [An
HTTP expert should please correct me if the HTTP specification is
already clear on this!]

References
1. HTTP status codes:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
2. TAG's httpRange-14 decision:
http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
3. DanBri's TAG question about 302 versus 303:
http://lists.w3.org/Archives/Public/www-tag/2005Jul/0012.html
4. RFC 3986 (URI syntax):
http://www.ietf.org/rfc/rfc3986.txt
Received on Monday, 19 December 2005 21:40:53 UTC