Comments on "SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs" from Tim Berners-Lee on 2010-08-09 (public-rdf-dawg-comments@w3.org from August 2010)

From: Tim Berners-Lee <timbl@w3.org>
Date: Mon, 9 Aug 2010 09:09:12 -0400
To: public-rdf-dawg-comments@w3.org
Cc: Sandro Hawke <sandro@w3.org>, foaf-protocols@lists.foaf-project.org, SW-forum Web <semantic-web@w3.org>
Message-Id: <91B13244-E783-421E-B2FE-D01288D066DF@w3.org>

Hi, Chimezie et al,

I have just read "SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs".

Here are my comments on this useful document.

1) One point is that, mostly, this is a document about how to implement
linked data when you have an existing SPARQL server. This is clearly a good idea.
That could be expressed in the abstract.

2) I would note that 4.1 is the essential architecture of the web,
and 4.2 is a distraction. The philosophical bit in 4.2 about the authority not doing web architecture properly
is not very well put, as you can't change who an authority for a URI is.
You say, "Despite the convenience of using the request URI to identify networked RDF knowledge for manipulation, it is often the case that the naming authority associated with the URI of an RDF graph in a dataset is not the same as the server managing the identified RDF content, the naming authority is not available, or the URI is not dereferencable (i.e., when dereferenced, it does not produce a RDF graph representation)."
That sounds like a broken naming authority which should be fixed!
I would prefer 4.2 be couched as "Graph Mirroring". This echoes the way many web sites
(and FTP sites) mirror another's contents, and may even provide enhanced access, while not being authoritative.
"Often, one organization publishes or re-publishes another's data. In this case a graph with one URI
is actually published at another URI."

I'd note that it may be useful to have a way of saying that. A simple P label could do.

3) For the format with a query, you say, "As discussed in [RFC3986], query components are often used to carry identifying information in the form of key / value pairs where the value is another URI." I actually don't like this design as much as one with no query. There is no real reason to have a "?" (apart from showing how your CGI is working).

so instead of
http://example.com/rdf-graphs/employees?graph=http%3A//www.example.org/other/graph
why not
http://example.com/rdf-graphs/www.example.org/other/graph
or
http://example.com/rdf-graphs/employees/other/graph

A rewrite rule on the server can of course turn this into the "?" form internally, with the CGI apparatus fired up as a result.
And it might be worth saying how to do that in the doc.
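
To illustrate the kind of rewrite meant here, the following is a minimal Python sketch rather than an actual server configuration. It maps the path-style URI to the "?graph=" form using the example URIs above. The names STORE_PREFIX, INTERNAL_PATH and rewrite_to_query_form are hypothetical, and the sketch assumes every mirrored graph has an http: URI.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Hypothetical prefix under which the graph store is mounted.
STORE_PREFIX = "/rdf-graphs/"
# Hypothetical internal endpoint that expects the "?graph=" form.
INTERNAL_PATH = "/rdf-graphs/employees"


def rewrite_to_query_form(public_uri: str) -> str:
    """Map a path-style graph URI to the internal "?graph=" form."""
    parts = urlsplit(public_uri)
    if not parts.path.startswith(STORE_PREFIX):
        raise ValueError("not a graph store URI: " + public_uri)
    # Everything after the prefix names the mirrored graph, minus its scheme.
    remainder = parts.path[len(STORE_PREFIX):]
    # Assumption: mirrored graphs always use the http: scheme.
    graph_uri = "http://" + remainder
    # Percent-encode the graph URI but leave "/" readable, as in the example.
    query = "graph=" + quote(graph_uri, safe="/")
    return urlunsplit((parts.scheme, parts.netloc, INTERNAL_PATH, query, ""))


print(rewrite_to_query_form(
    "http://example.com/rdf-graphs/www.example.org/other/graph"))
```

A real deployment would more likely express this as a rewrite rule in the web server itself; the function only shows that the mapping is mechanical.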

Also some reasons not to use "?" - for example:
- some proxies do not cache things with a "?" in them, as they assume that they will be transient queries never asked again.
- some people might want to directly mirror sets of virtual or static RDF documents which have relative URIs between them and the ? messes that up.
- when a graph is serialized and has references to nearby URIs, then the serializer will generate (sometimes much) smaller output when relative URIs can be used.
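
The relative-URI point can be checked with a short Python sketch, using the standard RFC 3986 resolution in urllib.parse. The relative reference "sibling" is a hypothetical link found inside a graph; the two base URIs are the ones from the examples above.

```python
from urllib.parse import urljoin

query_style = ("http://example.com/rdf-graphs/employees"
               "?graph=http%3A//www.example.org/other/graph")
path_style = "http://example.com/rdf-graphs/www.example.org/other/graph"

# "sibling" is a hypothetical relative reference inside the graph.
from_query_style = urljoin(query_style, "sibling")
from_path_style = urljoin(path_style, "sibling")

# The query is dropped during resolution, so the graph identity is lost.
print(from_query_style)
# The path form keeps the reference next to the graph it came from.
print(from_path_style)
```

With the query form the result is http://example.com/rdf-graphs/sibling, which no longer says which graph was meant; with the path form it is http://example.com/rdf-graphs/www.example.org/other/sibling.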

4) So when a GET or PUT is done, this is an implementation of HTTP. It is not a new protocol,
in that only HTTP is used. You can't know AND SHOULD NOT BE ABLE TO KNOW that in fact
there is a SPARQL engine behind it. That bit is in caps because it is essential, when you provide
HTTP, that you totally support HTTP, so that everything like creation date, expiry, etc. all holds.
You may well use conneg as well for PUT and GET, for example. Where GET and PUT are concerned
this is not a new protocol, and the document should take the position that it is explaining
how a SPARQL service owner can support HTTP on those graphs (or rather, virtual RDF documents).
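
From the client side, this means nothing SPARQL-specific should be needed. As a sketch only, the Python below builds (but does not send) an ordinary PUT and an ordinary GET with content negotiation. The helper names build_put and build_get and the Turtle payload are hypothetical; the graph URI is taken from the examples in point 3.

```python
from urllib.request import Request

GRAPH_URI = "http://example.com/rdf-graphs/employees"
# Hypothetical payload, for illustration only.
TURTLE = b'<#me> <http://xmlns.com/foaf/0.1/name> "Example" .\n'


def build_put(uri: str, body: bytes) -> Request:
    """Plain HTTP PUT that replaces the document at `uri`."""
    return Request(uri, data=body, method="PUT",
                   headers={"Content-Type": "text/turtle"})


def build_get(uri: str) -> Request:
    """Plain HTTP GET using content negotiation for the serialization."""
    return Request(uri, method="GET",
                   headers={"Accept": "text/turtle, application/rdf+xml;q=0.8"})


put_request = build_put(GRAPH_URI, TURTLE)
get_request = build_get(GRAPH_URI)
print(put_request.get_method(), put_request.full_url)
print(get_request.get_method(), get_request.full_url)
```

Neither request mentions SPARQL; any generic HTTP client, cache or proxy could handle them.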

5) When a POST is done, this *is* a new protocol, as HTTP doesn't mandate what happens
with a POST. I think the idea of using it as an append function, when you POST turtle (or RDF/XML) to it
is reasonable.

What I would really like you to add, though, is a specification that
when a POST is made and the content-type is application/sparql, then the SPARQL query is accepted
(modulo access rights) and that THE DEFAULT GRAPH OF THE QUERY IS THE ONE IN THE HTTP REQUEST.
Obvious really, but people can miss it.
This is a really handy convention. It is written up in http://www.w3.org/DesignIssues/ReadWriteLinkedData.html .
There are at least 3 implementations of it (in PHP, Perl and Python+C++) running now.
It means that the messages POSTED to the server
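
The POST convention in point 5 could be sketched on the server side as below. This is an illustration under assumptions, not the behaviour of any of the implementations mentioned: plan_post, RDF_TYPES and the returned dictionaries are hypothetical, and the media type string is the one named in the text.

```python
RDF_TYPES = {"text/turtle", "application/rdf+xml"}
SPARQL_TYPE = "application/sparql"  # media type as named in the text


def plan_post(request_uri: str, content_type: str) -> dict:
    """Decide how a graph store might treat a POST to `request_uri`."""
    # Ignore parameters such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in RDF_TYPES:
        # Append: merge the posted triples into the graph named by the URI.
        return {"action": "append", "graph": request_uri}
    if media_type == SPARQL_TYPE:
        # Run the posted SPARQL with the request URI as its default graph.
        return {"action": "sparql", "default_graph": request_uri}
    # 415 Unsupported Media Type for anything else.
    return {"action": "reject", "status": 415}


print(plan_post("http://example.com/rdf-graphs/employees", "text/turtle"))
print(plan_post("http://example.com/rdf-graphs/employees", "application/sparql"))
```

The point of the second branch is that the client never has to name the graph in the query body; the URI it POSTs to already does.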

6) To complete the spec from my point of view, I think you should add the MS-Author-Via header, taking the value either SPARQL or DAV.
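
As a rough sketch of how a client might use such a header, the Python below inspects response headers and picks an editing protocol. The function name edit_protocol is hypothetical, and two things are assumptions rather than anything stated here: that the value may be a comma-separated list, and that SPARQL is preferred when both are offered.

```python
from typing import Optional


def edit_protocol(headers: dict) -> Optional[str]:
    """Return "SPARQL" or "DAV" if the response advertises one, else None."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "ms-author-via":
            # Assumption: the value may list several tokens, comma-separated.
            tokens = [token.strip().upper() for token in value.split(",")]
            # Assumption: prefer SPARQL over DAV when both are present.
            for preferred in ("SPARQL", "DAV"):
                if preferred in tokens:
                    return preferred
    return None


print(edit_protocol({"MS-Author-Via": "SPARQL"}))
print(edit_protocol({"Content-Type": "text/turtle"}))
```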

We could make a separate spec for this, but if you could include it in yours that would be brilliant.

KUTGW

Tim

Received on Monday, 9 August 2010 13:09:22 UTC