- From: Chimezie Ogbuji <ogbujic@ccf.org>
- Date: Mon, 16 Aug 2010 10:48:39 -0400
- To: "SPARQL Working Group WG" <public-rdf-dawg@w3.org>
Below is a summary of Tim's comments regarding the HTTP Protocol, along with a few comments of my own, for the purpose of group discussion:

"1) One point is that, mostly, this is a book about how to implement a linked data when you have an existing SPARQL server. This is clearly a good idea.. That could be expressed in the abstract."

I think this one speaks for itself.

He had some comments on the motivation behind 4.2 (indirect graph identification), in particular on the statement:

".. it is often the case that the naming authority associated with the URI of an RDF graph in a Network-manipulable Graph Store is not the same as the server managing the identified RDF content, the naming authority is not available, or the URI is not dereferencable"

He felt that such an authority was broken. What the text was attempting to motivate was the various scenarios where you can't rely on using the graph IRI directly to manipulate the RDF knowledge. The primary one is the situation where the graph IRI is not resolvable, but I was attempting to generalize beyond this. Looking at some of the relevant language from RFC 3986:

[[[
.. domain name ownership may change over time for reasons not anticipated by the URI producer. In other cases, the data within the host component identifies a registered name that has nothing to do with an Internet host.
]]]

Perhaps the text should be talking about the host, not the authority:

".. it is often the case (with HTTP graph URIs in particular) that the server associated with the hostname is not the same as the server managing the identified RDF knowledge (as a result of a change in domain name ownership, for instance), the host server is not available, or the URI is not dereferencable"

He preferred that 4.2 be conceived as "Graph Mirroring":

"Often, one organization publishes or re-publishes another's data. In this case the a graph with one URI is actually published at another URI."

This could be another motivating reason for indirectly identifying graphs, but I'm not sure whether (by itself) it accommodates the other reasons (such as having a non-resolvable graph IRI).

He suggested not using "?" as a way to 'embed' graph IRIs, but to use

http://example.com/rdf-graphs/www.example.org/other/graph

instead. I think this came up in discussions earlier (see: http://lists.w3.org/Archives/Public/public-rdf-dawg/2009OctDec/0030.html), but we didn't seem to have strong opinions one way or the other. In that thread, Steve H had the following example, which has the advantage of encoding the entire graph IRI (including the scheme):

http://localhost:8080/data/http%3A%2F%2Fexample.com%2Fdata.rdf

TimBL mentions that a rewrite rule on the server can of course turn this into the "?" form internally, but I wonder whether that is a deployment / implementation detail once we have an accepted form for the protocol.

He gives some reasons not to use "?":

- some proxies do not cache things with a "?" in as they assume that they will be transient queries never asked again.
- some people might want to directly mirror sets of virtual or static RDF documents which have relative URIs between them and the ? messes that up.
- when a graph is serialized and has references to nearby URIs, then the serializer will generate (sometimes much) smaller output when relative URIs can be used.

Perhaps SteveH has something to say about the 1st and 3rd points (I know he had some reverse proxy scenarios in mind with this particular interface).
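
To make the three embedding styles discussed above easier to compare, here is a minimal Python sketch that builds each form from a graph IRI. The function names, the `SERVICE` and `DATA_BASE` constants, and the `graph` query parameter name are assumptions made for illustration only; the thread fixes none of them.

```python
from urllib.parse import quote, urlsplit

# Hypothetical service locations, taken from the example URLs in this thread.
SERVICE = "http://example.com/rdf-graphs"
DATA_BASE = "http://localhost:8080/data"


def query_form(graph_iri: str) -> str:
    """The "?" form: graph IRI percent-encoded into a query parameter.

    The parameter name "graph" is an assumption for this sketch.
    """
    return f"{SERVICE}?graph={quote(graph_iri, safe='')}"


def path_form(graph_iri: str) -> str:
    """TimBL's suggestion: append host and path as path segments.

    The scheme of the graph IRI is dropped, so relative references
    between mirrored documents keep working.
    """
    parts = urlsplit(graph_iri)
    return f"{SERVICE}/{parts.netloc}{parts.path}"


def encoded_path_form(graph_iri: str) -> str:
    """Steve H's example: the whole IRI (scheme included) as one
    percent-encoded path segment."""
    return f"{DATA_BASE}/{quote(graph_iri, safe='')}"


if __name__ == "__main__":
    iri = "http://www.example.org/other/graph"
    print(query_form(iri))
    print(path_form(iri))
    print(encoded_path_form("http://example.com/data.rdf"))
```

Only the path form keeps the hierarchical structure of the original IRI visible in the request URL, which is what Tim's 2nd and 3rd points rely on; the encoded path form preserves the scheme but flattens the hierarchy into a single segment.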

As for the 2nd point, I'm not sure whether this relates to our recent conversation about resolving relative URIs for embedded IRIs, and whether the solution we discussed below (which was not in the text when he reviewed it) addresses this:

[[[
In situations where there is no Base URI in the payload and a graph IRI is embedded, the RDF document that represents [AWWW] the networked RDF knowledge identified by the embedded graph IRI SHOULD be considered the retrieval context (5.1.2) [RFC3986]. Thus, the default base URI is the base URI of that RDF document.
]]]

We probably need some clarification for his 4th point, because I wasn't sure whether this was just a comment leading up to the following point or whether there was something specific he was looking for that makes it clear that, for the GET, PUT (and presumably DELETE?) verbs, HTTP Update is basically HTTP:

".. Where GET and PUT are concerned this is not a new protocol, and the document should take the position as to it is explaining how for a SPARQL service owner to support HTTP on those graphs (or rather, virtual RDF documents)."

In the next point (5) he says that when a POST is done, this *is* a new protocol supporting the append functionality, but again it wasn't clear to me whether there were specific changes or additions to the text he was looking for.

He did specifically ask that this protocol support the scenario where a POST with content-type "application/sparql" is understood to be an invocation of the SPARQL Query protocol (essentially) where the default graph of the query is the graph identified by the graph IRI (embedded or otherwise). This raises the larger question of the interplay between the various SPARQL 1.1 protocols, in particular as it relates to overlapping HTTP bindings.
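
As a rough illustration of the base URI rule quoted earlier in this message, the sketch below resolves a relative reference from a payload against the embedded graph IRI when one is present, and against the request URL otherwise. It assumes the "?" form with a query parameter named `graph`, and it assumes that the base URI of the RDF document is the graph IRI itself; both are simplifications for illustration, and the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urljoin, urlsplit


def default_base(request_url: str) -> str:
    """Pick the default base URI for a payload with no base of its own.

    If a graph IRI is embedded (assumed here to be in a "graph" query
    parameter), it is used as the retrieval context; otherwise the
    request URL itself is used.
    """
    params = parse_qs(urlsplit(request_url).query)  # values are percent-decoded
    graphs = params.get("graph")
    return graphs[0] if graphs else request_url


def resolve(request_url: str, relative_ref: str) -> str:
    """Resolve a relative reference from the payload per RFC 3986."""
    return urljoin(default_base(request_url), relative_ref)


if __name__ == "__main__":
    embedded = ("http://example.com/rdf-graphs"
                "?graph=http%3A%2F%2Fwww.example.org%2Fother%2Fgraph")
    print(resolve(embedded, "sibling"))
    print(resolve("http://example.com/rdf-graphs/g1", "g2"))
```

With the embedded form, a reference such as `sibling` resolves into the namespace of the graph IRI rather than that of the service, which seems to be the behaviour the mirroring scenario in Tim's 2nd point needs.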

Finally, he specifically asked to support the use of 'MS-Author-Via' headers, which (reading from "Read-Write Linked Data") are meant to indicate a preference for modifying RDF knowledge either via PUT (if the value is 'DAV') or via SPARQL/Update (if the value is 'SPARQL', or whatever IMT we assign to the SPARQL Update language). Digging further (from http://msdn.microsoft.com/en-us/library/cc250217(PROT.10).aspx):

[[[
This header field indicates to the issuer of an HTTP OPTIONS command what protocol mechanism is preferred for authoring documents in this particular namespace. The preference MUST be ordered so the first mechanism listed is the one most preferred by the server.
]]]

So this is in some way related to the newly added suggestion to use HTTP OPTIONS to determine the capabilities of the server, but here the response is a specific indication of the preferred means of authoring RDF documents for a particular server; so it is also related to the larger question of making sense of the various bindings to HTTP 1.1 that we have.

-- Chime
Received on Monday, 16 August 2010 14:49:31 UTC