RE: [VM] TAG clarification on 302 vs 303, PURLS and more...

Alistair,

Excellent.  I think you have captured the issues around the TAG's
httpRange-14 decision quite well.  It might be helpful to reference some
potential interpretations of the httpRange-14 decision that I
describe in [6].

The main thing I would add is that the questions you pose below are
actually symptoms of a broader issue:

4. What is the TAG-sanctioned algorithm for determining the meaning of
an http URI?  The TAG's httpRange-14 decision seems to imply that there
is such an algorithm, but does not explicitly say what the algorithm is.
It would be helpful if the TAG would explicitly say what the algorithm
is, so that people do not have to guess.

For example, below is an initial attempt I made to express the algorithm
that seemed to be implied by the TAG's WebArch document and httpRange-14
decision.

HTTP URI-Identity-Algorithm
===========================

Let original-URI be the URI whose identity you are trying to determine.

IF original-URI contains a fragment identifier,
THEN
	Let path-part-URI = the part of the URI 
		without the fragment identifier.
	Do an HTTP GET on path-part-URI.
	IF the response code is 2xx,
	THEN
		IF the specification for the returned media type
			indicates that the fragment identifier identifies
			a portion of, or location within, the returned
			content,
		THEN
			Conclude that the original-URI identifies
			an information resource.  (Example: HTML.)
		ELSE IF the specification for the returned media type
			delegates authority for defining the meaning 
			of the fragment identifier to the returned content
			(i.e. if the specification says that statements 
			made in the returned content are authorized to 
			say what the fragment identifier identifies),
		THEN
			Conclude that the original-URI identifies
			whatever the statements in the returned
			content say the fragment identifier identifies.
			(Example: RDF)
		ELSE ???   # Other case?  What if the spec does not say?
	ELSE IF the response code is 303 and a secondary URI is returned,
	THEN
		???? 	# Not sure what to do about the frag ID.
			# See http://lists.w3.org/Archives/Public/public-swbp-wg/2005Dec/0123
			# for some possibilities.

# Else no fragment identifier: the response codes below are
# those returned by an HTTP GET on original-URI.
ELSE IF the response code is 2xx
THEN
	Conclude that the original-URI identifies an information 
	resource, regardless of the returned content type.
ELSE IF the response code is 303
THEN
	Let secondary-URI be the new URI returned from the 303 response.
	Dereference secondary-URI, and if useful information 
	about original-URI is obtained, consider it authoritative.
ELSE IF . . . 	
	# Etc., for all other response codes
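
As a rough illustration only, the pseudocode above could be sketched in
Python along the following lines.  Everything named here is hypothetical:
the Response tuple, the FRAGMENT_SEMANTICS table (a stand-in for reading
each media type's specification), the identify function, and the injected
"get" callable that performs the HTTP GET.  The branches marked ??? above
are simply reported as undetermined, and none of this is a TAG-sanctioned
procedure.

```python
from typing import Callable, NamedTuple, Optional
from urllib.parse import urldefrag


class Response(NamedTuple):
    """Hypothetical summary of an HTTP response (not a real library type)."""
    status: int
    media_type: Optional[str] = None
    location: Optional[str] = None


# Assumption: a hand-written table standing in for "what the media type
# specification says about fragment identifiers".
FRAGMENT_SEMANTICS = {
    "text/html": "portion",              # frag ID names part of the content
    "application/rdf+xml": "delegated",  # content says what frag ID means
}

INFORMATION_RESOURCE = "information resource"
DEFINED_BY_CONTENT = "whatever the returned content says"
UNDETERMINED = "undetermined"


def identify(original_uri: str, get: Callable[[str], Response]) -> str:
    """Return a rough conclusion about what original_uri identifies."""
    path_part_uri, fragment = urldefrag(original_uri)
    response = get(path_part_uri)
    is_2xx = 200 <= response.status < 300

    if fragment:
        if is_2xx:
            semantics = FRAGMENT_SEMANTICS.get(response.media_type)
            if semantics == "portion":
                return INFORMATION_RESOURCE
            if semantics == "delegated":
                return DEFINED_BY_CONTENT
        # Spec is silent, 303 with a frag ID, or any other code:
        # these are the "???" branches, so no conclusion is drawn.
        return UNDETERMINED

    # No fragment identifier.
    if is_2xx:
        return INFORMATION_RESOURCE
    if response.status == 303 and response.location:
        # The caller would dereference this secondary URI and treat
        # useful information about original_uri as authoritative.
        return "see " + response.location
    return UNDETERMINED
```

Passing "get" in as a parameter keeps the sketch independent of any
particular HTTP client, and lets it be exercised with canned responses.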

Additional Reference
[6] http://lists.w3.org/Archives/Public/public-swbp-wg/2005Dec/0123

David Booth


> -----Original Message-----
> From: public-swbp-wg-request@w3.org 
> [mailto:public-swbp-wg-request@w3.org] On Behalf Of Miles, AJ 
> (Alistair)
> Sent: Tuesday, January 17, 2006 12:51 PM
> To: public-swbp-wg@w3.org
> Subject: [VM] TAG clarification on 302 vs 303, PURLS and more...
> 
> 
> 
> Hi all,
> 
> At the VMTF telecon today we had a further discussion of how 
> to handle the case of ontologies with http://purl.org/*/ 
> namespaces, given that the PURL servers return a 302 response code.
> 
> Ralph raised the point that simply asking OCLC to change 302 
> to 303 for all responses by the PURL server is not a 
> reasonable solution, because a 302 "Found" response code is 
> actually quite appropriate for the majority of purl.org URIs. 
> (Extract from RFC2616 [1] is pasted at the end of this 
> email.) What OCLC would have to do would be to add a feature 
> to the PURL server whereby each PURL maintainer could specify 
> whether or not a resource was an 'information resource', so 
> that the server could adapt the response code appropriately. 
> This sounds like a significant change to the code base, and I 
> do not know whether OCLC plans to make any changes of this 
> nature. (We should ask them!)
> 
> We felt that we would like to work with the TAG to develop a 
> clear position wrt using purl.org URIs to name classes, 
> properties and other types of 'non-information resource'. To 
> further this goal, I took an action to draft some 
> questions/requests to forward to the TAG, so here goes ...
> 
> 1. The class of 'information resources'.
> 
> It would be useful if the TAG were to coin a URI for the 
> class of 'information resources' with at least a partial 
> definition referring to the web architecture recommendation 
> [2] or whatever document is most appropriate. 
> 
> This would enable ontology developers to declare classes as 
> disjoint with the class of information resources. For 
> example, the FOAF ontology [3] could declare the class 
> foaf:Person to be disjoint with the class tag:InformationResource.
> 
> The implied TAG position is that the meta-classes rdfs:Class, 
> owl:Class and rdf:Property are all disjoint with the class 
> tag:InformationResource, and therefore expected HTTP 
> behaviour for members of these meta-classes is clearly 
> specified by the TAG resolution on httpRange-14 [5] (i.e. 
> they MUST return 303). However, an ontology might reasonably 
> define a class eg:Book, or eg:Document, as a sub-class of 
> tag:InformationResource. Without an explicit statement of 
> disjointness or subsumption for each class in an RDFS or OWL 
> ontology, it is not known whether an 'instance' of that class 
> is or is not an information resource. Therefore, the range of 
> reasonable HTTP behaviours for these 'instances' is not specified.
> 
> If the class of information resources was named, ontology 
> developers could specify which of their classes were 
> disjoint with this class, and which were not disjoint, 
> supporting clear and reasonable expectations wrt HTTP 
> behaviour for members of these classes (i.e. which resources 
> MUST return 303, and which resources MAY return 200). 
> 
> In the document 'Configuring Apache HTTP Server for RDFS/OWL 
> Ontologies Cookbook' [4] under development by the Vocabulary 
> Management Task Force of the SWBPD-WG, I (as editor) have so 
> far deliberately avoided a discussion of HTTP behaviour for 
> 'instances' in an RDFS/OWL ontology, and [4] currently gives 
> examples only for classes and properties. This is because of 
> the ambiguity wrt 'instances' as described above. We would 
> like to be able to extend the treatment in [4] to 'instances' 
> as well as classes and properties.
> 
> I don't think it matters if the class of information 
> resources is not entirely or perfectly defined at this stage. 
> An interesting possibility is that a practical definition of 
> what an information resource *is* could emerge from the act 
> of various prominent ontologies declaring what an information 
> resource *is not*. (E.g. 'an information resource is not a person.')
> 
> 2. Drawing inferences from HTTP interactions.
> 
> A fundamental assumption of the TAG's position on 
> httpRange-14 seems to be that there are or will be 
> applications that need or find it useful to make inferences 
> about resources based on HTTP interactions with those resources.
> 
> Could the TAG provide some basic examples of such applications?
> 
> If this assumption is upheld, then it would also be useful if 
> the TAG were to define exactly what inferences may or may not 
> be drawn from what HTTP interactions, in the form of RDF 
> triples as consequences.  
> 
> For example ... 
> 
> GET /foo HTTP/1.1
> Host: example.com
> 
> HTTP/1.x 200 OK
> 
> implies
> 
> {
> <http://example.com/foo> rdf:type tag:InformationResource.
> }
> 
> ... and ... 
> 
> GET /bar HTTP/1.1
> Host: example.com
> 
> HTTP/1.x 303 See Other
> 
> implies
> 
> {}
> 
> 3. 3xx response codes.
> 
> The TAG resolution of httpRange-14 [5] does not suggest which 
> inferences may or may not be drawn from 3xx response codes 
> other than 303. This is problematic for the case of RDF 
> vocabularies such as Dublin Core and RSS 1.0 that have a 
> http://purl.org/*/ namespace, because the PURL servers 
> respond to all GET requests with a 302 "Found" response code. 
> Therefore, it is unclear whether it is appropriate to use 
> purl.org URIs to name classes, properties, or any other type 
> of non-information resource.
> 
> Could the TAG please specify under what circumstances a 302 
> response code is an acceptable response for a resource that 
> is not an information resource.
> 
> The description of the 302 response code as given by RFC2616 
> [1] ('The requested resource resides temporarily under a 
> different URI') suggests to me that a 302 *could* be an 
> appropriate response for a non-information resource, if the 
> redirect location returns a 303.  I.e.
> 
> GET /aaa HTTP/1.1
> Host: example.com
> 
> HTTP/1.x 302 Found
> Location: http://elsewhere.com/aaa
> 
> GET /aaa HTTP/1.1
> Host: elsewhere.com
> 
> HTTP/1.x 303 See Other
> 
> implies
> 
> {}
> 
> However, if the redirect location were to return a 200, this 
> would lead to an inconsistency, i.e.
> 
> GET /aaa HTTP/1.1
> Host: example.com
> 
> HTTP/1.x 302 Found
> Location: http://elsewhere.com/aaa
> 
> GET /aaa HTTP/1.1
> Host: elsewhere.com
> 
> HTTP/1.x 200 OK 
> 
> implies
> 
> {
> <http://example.com/aaa> rdf:type tag:InformationResource.
> <http://elsewhere.com/aaa> rdf:type tag:InformationResource.
> }
> 
> ... which is of course inconsistent with e.g. ...
> 
> {<http://example.com/aaa> rdf:type rdfs:Class.}
> 
> ... if the TAG has declared that ...
> 
> {rdfs:Class owl:disjointWith tag:InformationResource.}
> 
> If the TAG could confirm or refute this position, and extend 
> a similar treatment to other 3xx response codes, that would 
> be very helpful.
> 
> ---
> 
> That's as far as I've got. Rather long winded! Any thoughts?
> 
> Cheers,
> 
> Al.
> 
> [1] http://www.ietf.org/rfc/rfc2616.txt
> [2] http://www.w3.org/TR/2004/REC-webarch-20041215/
> [3] http://xmlns.com/foaf/0.1/
> [4] http://www.w3.org/2001/sw/BestPractices/VM/http-examples/2005-11-18/
> [5] http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
> 
> Extract from RFC2616 [1] describing response code 302 ...
> 
> 10.3.3 302 Found
> 
>    The requested resource resides temporarily under a different URI.
>    Since the redirection might be altered on occasion, the client SHOULD
>    continue to use the Request-URI for future requests.  This response
>    is only cacheable if indicated by a Cache-Control or Expires header
>    field.
> 
>    The temporary URI SHOULD be given by the Location field in the
>    response. Unless the request method was HEAD, the entity of the
>    response SHOULD contain a short hypertext note with a hyperlink to
>    the new URI(s).
> 
>    If the 302 status code is received in response to a request other
>    than GET or HEAD, the user agent MUST NOT automatically redirect the
>    request unless it can be confirmed by the user, since this might
>    change the conditions under which the request was issued.
> 
>       Note: RFC 1945 and RFC 2068 specify that the client is not allowed
>       to change the method on the redirected request.  However, most
>       existing user agent implementations treat 302 as if it were a 303
>       response, performing a GET on the Location field-value regardless
>       of the original request method. The status codes 303 and 307 have
>       been added for servers that wish to make unambiguously clear which
>       kind of reaction is expected of the client.
> 
> 
> 
> ---
> Alistair Miles
> Research Associate
> CCLRC - Rutherford Appleton Laboratory
> Building R1 Room 1.60
> Fermi Avenue
> Chilton
> Didcot
> Oxfordshire OX11 0QX
> United Kingdom
> Email:        a.j.miles@rl.ac.uk
> Tel: +44 (0)1235 445440
> 
> 
> 

Received on Monday, 23 January 2006 21:00:32 UTC