Re: Comments on "SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs"

Timbl, the WG has considered your comments and there are a few points where
some clarification would help the conversation.  See response inline below:

> 1) One point is that, mostly, this is a book about how to implement a
> linked data when you have an existing SPARQL server.  This is clearly a good
> idea..
> That could be expressed in the abstract.

The abstract in the editor's draft has been changed.  How about this as an
alternative?

"SPARQL provides a standard way to query RDF data. The SPARQL update
language allows a user to update RDF graphs in a Graph Store at various
levels of granularity, including individual RDF statements. The protocol
described here is meant to provide a minimal set of uniform, colloquial HTTP
operations for managing a semantic web of mutable, named RDF graphs at a
strictly large level of granularity.  It also discusses some characteristics
useful for making links in a web of hypermedia so that a person or machine
can explore and manipulate them (Linked Data)"
 
> 2) I would note that 4.1 is the essential architecture of the web,
> and 4.2 is a distraction.     The philosophical bit in 4.2 about the authority
> not doing web architecture properly
> is not very well put, as you can't change who an authority for a URI is.
> You say, "Despite the convenience of using the request URI to identify
> networked RDF knowledge for manipulation, it is often the case that the naming
> authority associated with the URI of an RDF graph in a dataset is not the same
> as the server managing the identified RDF content, the naming authority is not
> available, or the URI is not dereferencable (i.e., when dereferenced, it does
> not produce a RDF graph representation)."
> That sounds like a broken naming authority which should be fixed!

Perhaps the text should be talking about the host, not the authority:

".. it is often the case (with HTTP graph URIs in particular) that the
server associated with the hostname is not the same as the server managing
the identified RDF knowledge (as a result of a change in domain name
ownership, for instance), the host server is not available, or the URI is
not dereferencable"

That section was attempting to motivate the various scenarios where you
can't rely on using the graph IRI directly to manipulate the RDF knowledge.

> I would prefer 4.2 be couched as "Graph Mirroring".  This echoes the way many
> web sites
> (and FTP sites) mirror another's contents, and may even provide enhanced
> access, while not being authoritative.
> "Often, one organization publishes or re-publishes another's data. In this
> case the a graph with one URI
> is actually published at another URI."

Graph Mirroring is a very good use case, but not the only one that this
provision was meant to address.  Did you mean for this mechanism to only be
used for graph mirroring situations?
 
> I'd note that it may be useful to have a way of saying that. A simple P label
> could do.

Can you give an example of what you mean here? I didn't quite understand.

> 3) For the format with a query, you say, " As discussed in [RFC3986], query
> components are often used to carry identifying information in the form of key
> / value pairs where the value is another URI."  I actually don't like this
> design as much as one with no query.  There is no real reason to have a "?"
> (apart from it shows how your CGI is working)

> so instead of
> http://example.com/rdf-graphs/employees?graph=http%3A//www.example.org/other/graph
> why not
> http://example.com/rdf-graphs/www.example.org/other/graph
> or
> http://example.com/rdf-graphs/employees/other/graph

Wouldn't such a naming convention also need to 'embed' the http/https/urn
scheme (i.e., the full URL)?  The use of '?' came about due to the needs of
extant systems and of various other people who use or prefer this mechanism
(implementations being servlet-based is one motivation).
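
To make the point about embedding the scheme concrete, here is a minimal
Python sketch of the '?' form.  It assumes the 'graph' parameter name and
the service URL from the example URLs quoted above; the two function names
are hypothetical helpers for illustration, not anything defined in the
draft.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Service URL taken from the example in this thread.
SERVICE = "http://example.com/rdf-graphs/employees"


def indirect_graph_url(service: str, graph_iri: str) -> str:
    """Embed the full graph IRI, scheme included, in the 'graph' parameter."""
    # safe='' percent-encodes ':' and '/' so the IRI survives as one value.
    return f"{service}?graph={quote(graph_iri, safe='')}"


def embedded_graph_iri(request_url: str) -> str:
    """Recover the graph IRI from a request URL built as above."""
    return parse_qs(urlsplit(request_url).query)["graph"][0]


url = indirect_graph_url(SERVICE, "http://www.example.org/other/graph")
print(url)
print(embedded_graph_iri(url))
```

Because the whole IRI travels as a single encoded value, an https or urn
graph name round-trips the same way, which a path-only convention would
have to encode by some other means.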


> A rewrite rule on the server can of course turn this into the "?" form
> internally and the CGI apparatus fired up as a result.
> And it might be worth saying show to do that in the doc.

Some proposed text:

"HTTP servers often have a mechanism for rewriting URI requests into related
forms that follow a particular naming or addressing convention (for
instance).  HTTP proxies can be used to translate between various
alternative conventions such as: [... example ...]"
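
Separately from the proposed text, the kind of rewrite being discussed
could be sketched roughly as follows.  This is only an illustration under
stated assumptions: the internal endpoint path is hypothetical, and because
the path form suggested above carries no scheme, the sketch has to assume
one (http by default).  It is not the example intended for the draft.

```python
from urllib.parse import quote

# Path prefix taken from the example URLs in this thread.
PREFIX = "/rdf-graphs/"
# Hypothetical internal endpoint that expects the '?graph=' form.
ENDPOINT = "/rdf-graphs/service"


def rewrite_to_query_form(path: str, assumed_scheme: str = "http") -> str:
    """Map /rdf-graphs/<host>/<path> onto the internal '?graph=' form."""
    if not path.startswith(PREFIX):
        raise ValueError(f"not under {PREFIX}: {path}")
    remainder = path[len(PREFIX):]
    if not remainder:
        raise ValueError("no graph name in path")
    # Assumption: the scheme is not in the path, so it must be supplied.
    graph_iri = f"{assumed_scheme}://{remainder}"
    return f"{ENDPOINT}?graph={quote(graph_iri, safe='')}"


print(rewrite_to_query_form("/rdf-graphs/www.example.org/other/graph"))
```

Note that the rewrite cannot tell an http graph IRI from an https or urn
one, which is the concern raised earlier about embedding the scheme.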

> Also some reasons not to us "?" - for example:
> -  some proxies do not cache things with a "?" in as they assume that they
> will be transient queries never asked again.

Are such proxies behaving as they should?

> - some people might want to directly mirror sets of virtual or static RDF
> documents which have relative URIs between them and the ? messes that up.
> - when a graph is serialized and has references to nearby URIs, then the
> serializer will generate (sometimes much) smaller output when relative URIs
> can be used.

Is this related to the problem of determining the base URI to use in
resolving relative URI references when the RDF payload doesn't define a
base URI in its content and the indirect method is used to identify the RDF
knowledge?

There was some recently proposed text [1] to address this scenario:

"In situations where there is no Base URI in the payload and a graph IRI is
embedded, the RDF document that represents [AWWW] the networked RDF
knowledge identified by the embedded graph IRI SHOULD be considered the
retrieval context (5.1.2) [RFC3986]. Thus, the default base URI is the base
URI of that RDF document."
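
As a rough illustration of that proposed rule, the following Python sketch
resolves a relative reference against the embedded graph IRI when one is
present, and against the request URI otherwise.  It assumes the 'graph'
parameter name from the examples in this thread and approximates the base
URI of the RDF document by the embedded graph IRI itself; the function
names are hypothetical.

```python
from urllib.parse import parse_qs, urljoin, urlsplit


def default_base_uri(request_url: str) -> str:
    """Pick the base URI when the payload itself declares none."""
    graphs = parse_qs(urlsplit(request_url).query).get("graph")
    if graphs:
        # Indirect identification: the embedded graph IRI is the context.
        return graphs[0]
    # Direct identification: the request URI is the retrieval context.
    return request_url


def resolve(request_url: str, reference: str) -> str:
    """Resolve a relative URI reference found in the payload."""
    return urljoin(default_base_uri(request_url), reference)


indirect = ("http://example.com/rdf-graphs/employees"
            "?graph=http%3A//www.example.org/other/graph")
print(resolve(indirect, "sibling"))
print(resolve("http://example.com/rdf-graphs/employees", "managers"))
```

With this choice, relative references in a mirrored graph resolve against
the original graph's namespace rather than against the mirroring service.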

> 4) So when a GET or PUT is done, this is an implementation of HTTP. It is not
> a new protocol, 
> in that HTTP only is used.  You can't know AND SHOULD NOT BE ABLE TO KNOW that
> in fact
> there is a SPARQL engine behind it.  That bit in caps as it is essential when
> you provide
> HTTP that you do totally support HTTP, so everything like creation date and
> expiry etc etc all hold.
> You may well use conneg as well for PUT and GET, for example.  Where GET and
> PUT are concerned
> this is not a new protocol, and the document should take the position as to it
> is explaining
> how for a SPARQL service owner to support HTTP on those graphs (or rather,
> virtual RDF documents).

I'm not sure what text you had in mind to emphasize your capitalized point
(i.e., that the use of GET or PUT is just native HTTP for the purpose of
manipulating named RDF graphs).  There have been recent changes to the
editor's draft to emphasize that implementations of the protocol are
necessarily implementations of HTTP 1.1 (since this guides how status codes,
headers, etc. are used and interpreted), but do you have something in mind
specifically?
 
> 5) When a POST  is done, this *is* a new protocol,  as HTTP doesn't mandate
> what happens
> with a POST.  
> ..snip ..

> What I would really like you to add, though, is a specification that
> when a POST is made and the content-type is application/sparql, then the
> sparql query is accepted
> (modulo access rights) and that THE DEFAULT GRAPH OF THE QUERY IS THE ONE IN
> THE HTTP REQUEST.
> Obvious really, but people can miss it.

Given that POST is not well defined, would you expect a query that made
reference to a graph with a URI other than the one identified in the request
to raise a well-defined error?  In general, there still needs to be some
conversation about the various intersections with the other languages and
their HTTP protocol bindings (for instance, the SPARQL query protocol and
its use of POST, and how that differs from the use of POST in the HTTP
update protocol).

> 6) To complete the spec from my point of view, I think you should add the
> MS-Author-Via header taking either SPARQL or  DAV
> 
> We could make a separate spec for this, but if you could include it in yours
> that would be brilliant.

So, if I understand that header correctly: did you have in mind that a value
of 'DAV', returned in response to an OPTIONS request, would indicate that
PUT is preferred for updating the RDF knowledge and the corresponding named
graph, and that a similar assumption would hold if the value is whatever the
IMT (Internet media type) of the SPARQL 1.1 Update language will be?
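
To check my reading, here is a small Python sketch of how a client might
act on that header.  Everything in it is an assumption for discussion: it
uses the 'SPARQL' and 'DAV' values from your comment, maps them to POST and
PUT respectively, and prefers SPARQL when both are listed.  None of this is
specified anywhere yet.

```python
from typing import Mapping, Optional


def preferred_update_method(headers: Mapping[str, str]) -> Optional[str]:
    """Guess the HTTP method to use for updates from MS-Author-Via."""
    value = ""
    for name, content in headers.items():
        # Header field names are case-insensitive.
        if name.lower() == "ms-author-via":
            value = content
            break
    tokens = {token.strip().upper() for token in value.split(",")}
    if "SPARQL" in tokens:
        return "POST"  # assumed: send an update request to the graph URI
    if "DAV" in tokens:
        return "PUT"   # assumed: replace the graph with a new document
    return None        # no hint; the client has to decide some other way


print(preferred_update_method({"MS-Author-Via": "DAV"}))
```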

Thanks for your detailed comments.

[1] http://www.w3.org/2009/sparql/docs/http-rdf-update/

-- Chime


Received on Friday, 20 August 2010 15:41:05 UTC