[VM] TAG clarification on 302 vs 303, PURLS and more... from Miles, AJ (Alistair) on 2006-01-17 (public-swbp-wg@w3.org from January 2006)

From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
Date: Tue, 17 Jan 2006 17:51:03 -0000
To: <public-swbp-wg@w3.org>
Message-ID: <677CE4DD24B12C4B9FA138534E29FB1D98521C@exchange11.fed.cclrc.ac.uk>
Hi all,

At the VMTF telecon today we had a further discussion of how to handle the case of ontologies with http://purl.org/*/ namespaces, given that the PURL servers return a 302 response code.

Ralph raised the point that simply asking OCLC to change 302 to 303 for all responses by the PURL server is not a reasonable solution, because a 302 "Found" response code is actually quite appropriate for the majority of purl.org URIs. (An extract from RFC2616 [1] is pasted at the end of this email.) Instead, OCLC would have to add a feature to the PURL server whereby each PURL maintainer could specify whether or not a resource is an 'information resource', so that the server could adapt the response code accordingly. This sounds like a significant change to the code base, and I do not know whether OCLC plans to make any changes of this nature. (We should ask them!)

We felt that we would like to work with the TAG to develop a clear position wrt using purl.org URIs to name classes, properties and other types of 'non-information resource'. To further this goal, I took an action to draft some questions/requests to forward to the TAG, so here goes ...

1. The class of 'information resources'.

It would be useful if the TAG were to coin a URI for the class of 'information resources' with at least a partial definition referring to the web architecture recommendation [2] or whatever document is most appropriate. 

This would enable ontology developers to declare classes as disjoint with the class of information resources. For example, the FOAF ontology [3] could declare the class foaf:Person to be disjoint with the class tag:InformationResource.

The implied TAG position is that the meta-classes rdfs:Class, owl:Class and rdf:Property are all disjoint with the class tag:InformationResource, and therefore expected HTTP behaviour for members of these meta-classes is clearly specified by the TAG resolution on httpRange-14 [5] (i.e. they MUST return 303). However, an ontology might reasonably define a class eg:Book, or eg:Document, as a sub-class of tag:InformationResource. Without an explicit statement of disjointness or subsumption for each class in an RDFS or OWL ontology, it is not known whether an 'instance' of that class is or is not an information resource. Therefore, the range of reasonable HTTP behaviours for these 'instances' is not specified.

If the class of information resources were named, ontology developers could specify which of their classes were disjoint with this class, and which were not disjoint, supporting clear and reasonable expectations wrt HTTP behaviour for members of these classes (i.e. which resources MUST return 303, and which resources MAY return 200).
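
As a rough illustration of the mapping just described, here is a minimal Python sketch. It is only an assumption about how such declarations might be used: the class tag:InformationResource has not been coined by the TAG, and the helper name expected_status_codes and the three relationship labels are hypothetical.

```python
# Hypothetical sketch: map a class's declared relationship to the (not yet
# coined) class tag:InformationResource onto the GET responses that could
# reasonably be expected from members of that class.

DISJOINT = "disjoint"      # declared owl:disjointWith tag:InformationResource
SUBCLASS = "subclass"      # declared rdfs:subClassOf tag:InformationResource
UNDECLARED = "undeclared"  # no statement either way


def expected_status_codes(relationship):
    """Return the set of acceptable status codes, or None if unspecified."""
    if relationship == DISJOINT:
        return {303}       # e.g. foaf:Person: members MUST return 303
    if relationship == SUBCLASS:
        return {200, 303}  # e.g. eg:Book: members MAY return 200
    return None            # behaviour for 'instances' is not specified


print(expected_status_codes(DISJOINT))
print(expected_status_codes(UNDECLARED))
```

The None case corresponds to the ambiguity noted above: without an explicit statement, nothing can be said about the expected behaviour.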

In the document 'Configuring Apache HTTP Server for RDFS/OWL Ontologies Cookbook' [4] under development by the Vocabulary Management Task Force of the SWBPD-WG, I (as editor) have so far deliberately avoided a discussion of HTTP behaviour for 'instances' in an RDFS/OWL ontology, and [4] currently gives examples only for classes and properties. This is because of the ambiguity wrt 'instances' as described above. We would like to be able to extend the treatment in [4] to 'instances' as well as classes and properties.

I don't think it matters if the class of information resources is not entirely or perfectly defined at this stage. An interesting possibility is that a practical definition of what an information resource *is* could emerge from the act of various prominent ontologies declaring what an information resource *is not*. (E.g. 'an information resource is not a person.')

2. Drawing inferences from HTTP interactions.

A fundamental assumption of the TAG's position on httpRange-14 seems to be that there are or will be applications that need or find it useful to make inferences about resources based on HTTP interactions with those resources.

Could the TAG provide some basic examples of such applications?

If this assumption is upheld, then it would also be useful if the TAG were to define exactly what inferences may or may not be drawn from what HTTP interactions, in the form of RDF triples as consequences.  

For example ... 

GET /foo HTTP/1.1
Host: example.com

HTTP/1.x 200 OK

implies

{
<http://example.com/foo> rdf:type tag:InformationResource.
}

... and ... 

GET /bar HTTP/1.1
Host: example.com

HTTP/1.x 303 See Other

implies

{}
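
The two examples above could be written as a small function from a single HTTP interaction to a set of triples. This is only a sketch of the kind of rule being asked for, not a TAG-endorsed definition: the name tag:InformationResource is hypothetical, infer_triples is an illustrative helper, and treating every 2xx (not just 200) as significant follows the wording of [5] rather than the examples here.

```python
# Hypothetical sketch of "HTTP interaction implies RDF triples".
# Triples are represented as plain (subject, predicate, object) tuples.

INFORMATION_RESOURCE = "tag:InformationResource"  # not a real, coined URI
RDF_TYPE = "rdf:type"


def infer_triples(uri, status):
    """Return the triples implied by a GET on `uri` answered with `status`."""
    if 200 <= status < 300:
        # A 2xx response: the resource is an information resource.
        return {(uri, RDF_TYPE, INFORMATION_RESOURCE)}
    # A 303 (or any other response in this sketch) implies nothing.
    return set()


print(infer_triples("http://example.com/foo", 200))
print(infer_triples("http://example.com/bar", 303))
```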

3. 3xx response codes.

The TAG resolution of httpRange-14 [5] does not suggest which inferences may or may not be drawn from 3xx response codes other than 303. This is problematic for the case of RDF vocabularies such as Dublin Core and RSS 1.0 that have a http://purl.org/*/ namespace, because the PURL servers respond to all GET requests with a 302 "Found" response code. Therefore, it is unclear whether it is appropriate to use purl.org URIs to name classes, properties, or any other type of non-information resource.

Could the TAG please specify under what circumstances a 302 response code is an acceptable response for a resource that is not an information resource.

The description of the 302 response code as given by RFC2616 [1] ('The requested resource resides temporarily under a different URI') suggests to me that a 302 *could* be an appropriate response for a non-information resource, if the redirect location returns a 303.  I.e.

GET /aaa HTTP/1.1
Host: example.com

HTTP/1.x 302 Found
Location: http://elsewhere.com/aaa

GET /aaa HTTP/1.1
Host: elsewhere.com

HTTP/1.x 303 See Other

implies

{}

However, if the redirect location were to return a 200, this would lead to an inconsistency, i.e.

GET /aaa HTTP/1.1
Host: example.com

HTTP/1.x 302 Found
Location: http://elsewhere.com/aaa

GET /aaa HTTP/1.1
Host: elsewhere.com

HTTP/1.x 200 OK 

implies

{
<http://example.com/aaa> rdf:type tag:InformationResource.
<http://elsewhere.com/aaa> rdf:type tag:InformationResource.
}

... which is of course inconsistent with e.g. ...

{<http://example.com/aaa> rdf:type rdfs:Class.}

... if the TAG has declared that ...

{rdfs:Class owl:disjointWith tag:InformationResource.}

If the TAG could confirm or refute this position, and extend a similar treatment to other 3xx response codes, that would be very helpful.
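
To make the position above concrete, here is a minimal Python sketch of the two 302 scenarios and the resulting consistency check. It encodes only the reading proposed in this email, which the TAG has not confirmed. The names infer_from_chain and find_conflicts are hypothetical, as is tag:InformationResource. Treating any 2xx as equivalent to 200 follows [5]. No real HTTP requests are made; each chain is given as a list of (URI, status) pairs.

```python
# Hypothetical sketch: inferences from a redirect chain, plus a check against
# classes declared disjoint with the (not yet coined) tag:InformationResource.

IR = "tag:InformationResource"
TYPE = "rdf:type"


def infer_from_chain(hops):
    """hops: list of (uri, status) pairs in request order.

    A chain of 302s ending in a 2xx marks every URI in the chain as an
    information resource; a chain ending in a 303 implies nothing.
    """
    if not hops:
        return set()
    redirects, (_, final_status) = hops[:-1], hops[-1]
    if all(s == 302 for _, s in redirects) and 200 <= final_status < 300:
        return {(uri, TYPE, IR) for uri, _ in hops}
    return set()


def find_conflicts(inferred, asserted, disjoint_classes):
    """Return URIs inferred to be information resources that are also
    asserted to belong to a class disjoint with tag:InformationResource."""
    ir_uris = {s for (s, p, o) in inferred if p == TYPE and o == IR}
    return {s for (s, p, o) in asserted
            if p == TYPE and o in disjoint_classes and s in ir_uris}


chain = [("http://example.com/aaa", 302), ("http://elsewhere.com/aaa", 200)]
asserted = {("http://example.com/aaa", TYPE, "rdfs:Class")}
print(find_conflicts(infer_from_chain(chain), asserted, {"rdfs:Class"}))
```

With the final hop changed to 303, infer_from_chain returns the empty set and no conflict arises, which matches the first scenario.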

---

That's as far as I've got. Rather long-winded! Any thoughts?

Cheers,

Al.

[1] http://www.ietf.org/rfc/rfc2616.txt
[2] http://www.w3.org/TR/2004/REC-webarch-20041215/
[3] http://xmlns.com/foaf/0.1/
[4] http://www.w3.org/2001/sw/BestPractices/VM/http-examples/2005-11-18/
[5] http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html

Extract from RFC2616 [1] describing response code 302 ...

10.3.3 302 Found

   The requested resource resides temporarily under a different URI.
   Since the redirection might be altered on occasion, the client SHOULD
   continue to use the Request-URI for future requests.  This response
   is only cacheable if indicated by a Cache-Control or Expires header
   field.

   The temporary URI SHOULD be given by the Location field in the
   response. Unless the request method was HEAD, the entity of the
   response SHOULD contain a short hypertext note with a hyperlink to
   the new URI(s).

   If the 302 status code is received in response to a request other
   than GET or HEAD, the user agent MUST NOT automatically redirect the
   request unless it can be confirmed by the user, since this might
   change the conditions under which the request was issued.

      Note: RFC 1945 and RFC 2068 specify that the client is not allowed
      to change the method on the redirected request.  However, most
      existing user agent implementations treat 302 as if it were a 303
      response, performing a GET on the Location field-value regardless
      of the original request method. The status codes 303 and 307 have
      been added for servers that wish to make unambiguously clear which
      kind of reaction is expected of the client.



---
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email:        a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440
Received on Tuesday, 17 January 2006 17:51:09 UTC