Fwd: SPARQL 1.1 Update

I added this comment to http://www.w3.org/2009/sparql/wiki/Comments.
Volunteers to pick it up are welcome!

Axel

Begin forwarded message:

> Resent-From: public-rdf-dawg-comments@w3.org
> From: "Richard Newman" <rnewman@twinql.com>
> Date: 9 January 2010 00:32:52 GMT
> To: <public-rdf-dawg-comments@w3.org>
> Subject: SPARQL 1.1 Update
> archived-at: <http://www.w3.org/mid/4DE4ACE6-BA40-4180-95F2-CFE6EBAB7175@twinql.com>
> 
> Hi folks,
> 
> A few questions/comments on the Update portion of the 1.1 draft:
> 
> * DELETE/MODIFY/INSERT are described in terms of CONSTRUCT templates. 
> CONSTRUCT templates allow blank nodes, which are generated as fresh 
> blank nodes for each input row. This makes sense for INSERT, but it 
> doesn't make sense for DELETE: the fresh blank node will never match a 
> triple in the store, and thus
> 
>    DELETE { ?s ?p [] } WHERE { ?s ?p ?o }
> 
> is a no-op by definition. It would be good for this issue to be 
> addressed in the spec, with one of the following possible resolutions:
> 
>    1. Forbid blank nodes in a DELETE template.
> 
>    2. Define those blank nodes as being null placeholders, such that
> 
>        DELETE { ?s _:x _:y } WHERE { ?s rdf:type rdfs:Class }
> 
>       would delete every triple whose subject is an rdfs:Class.
> 
>    3. Document that DELETE triple patterns containing blank nodes will 
> never match.
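To make the blank-node point concrete, here is a minimal Python sketch. It is not taken from the draft or from any real SPARQL engine: the `BNode` class, the `BLANK` marker, and both `delete_*` functions are hypothetical names invented for illustration. It contrasts the draft reading (fresh blank nodes per row, so nothing is deleted) with resolution 2 (blank nodes as placeholders that match any term).

```python
import itertools

_ids = itertools.count()

class BNode:
    """Blank node; equality is object identity, so a fresh node equals nothing else."""
    def __init__(self):
        self.label = f"b{next(_ids)}"
    def __repr__(self):
        return f"_:{self.label}"

BLANK = object()  # hypothetical marker for a blank-node position ([] or _:x) in a template

def instantiate(template, row):
    """Fill variables from the row and mint a fresh BNode for every BLANK position."""
    return tuple(BNode() if term is BLANK else row.get(term, term) for term in template)

def delete_fresh(store, template, rows):
    """Draft reading: remove the instantiated triples. Fresh nodes never match."""
    doomed = {instantiate(template, row) for row in rows}
    return store - doomed

def delete_wildcard(store, template, rows):
    """Resolution 2: treat BLANK as a placeholder that matches any term."""
    def matches(triple, row):
        return all(t is BLANK or row.get(t, t) == v for t, v in zip(template, triple))
    return {t for t in store if not any(matches(t, r) for r in rows)}

# Toy data: one ground triple and one triple with a blank-node object.
store = {("ex:a", "ex:p", "ex:o"), ("ex:a", "ex:q", BNode())}
rows = [{"?s": s, "?p": p, "?o": o} for (s, p, o) in store]  # WHERE { ?s ?p ?o }
template = ("?s", "?p", BLANK)                               # DELETE { ?s ?p [] }

unchanged = delete_fresh(store, template, rows)
emptied = delete_wildcard(store, template, rows)
```

Under these assumptions `unchanged` equals the original store, while `emptied` is the empty set, which is the gap between the draft's behavior and resolution 2.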
> 
> * INSERT et al. permit multiple "INTO" URIs:
> 
>    INSERT [ INTO <uri> ]* { template } [ WHERE { pattern } ]
> 
> but the text discusses the graph in the singular ("The graph URI, if 
> present, must be a valid named graph..."). Is it intended that '*' 
> actually be '?'?
> 
> If not, the text should be changed, and text added to describe how an 
> implementation should process multiple graphs: e.g., should they run 
> DELETE then INSERT on each graph in turn, or should all DELETEs be 
> batched together prior to the INSERTs?
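The ordering question can be observable, as the following Python sketch suggests. It rests on assumptions the draft does not state: that the WHERE pattern is evaluated over the union of the target graphs, and that a per-graph strategy re-evaluates it before each graph. The function names and the `ex:status` data are hypothetical.

```python
def evaluate(dataset, graphs):
    """Hypothetical WHERE { ?s ex:status "old" } over the union of the target graphs."""
    union = set().union(*(dataset[g] for g in graphs))
    return sorted(s for (s, p, o) in union if p == "ex:status" and o == "old")

def copy_dataset(dataset):
    return {g: set(triples) for g, triples in dataset.items()}

def per_graph(dataset, graphs):
    """DELETE then INSERT on each graph in turn, re-evaluating the pattern each time."""
    ds = copy_dataset(dataset)
    for g in graphs:
        subjects = evaluate(ds, graphs)  # sees the effects of earlier graphs
        ds[g] -= {(s, "ex:status", "old") for s in subjects}
        ds[g] |= {(s, "ex:status", "new") for s in subjects}
    return ds

def batched(dataset, graphs):
    """Evaluate once, run all DELETEs, then run all INSERTs."""
    ds = copy_dataset(dataset)
    subjects = evaluate(ds, graphs)
    for g in graphs:
        ds[g] -= {(s, "ex:status", "old") for s in subjects}
    for g in graphs:
        ds[g] |= {(s, "ex:status", "new") for s in subjects}
    return ds

dataset = {"g1": {("ex:a", "ex:status", "old")}, "g2": set()}
targets = ["g1", "g2"]

in_turn = per_graph(dataset, targets)
all_at_once = batched(dataset, targets)
```

With this toy data, `in_turn` leaves `g2` empty because the pattern no longer matches by the time `g2` is processed, whereas `all_at_once` inserts the new triple into both graphs. Other evaluation models may not diverge at all, which is why the text would need to pin one down.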
> 
> * Re atomicity: it would seem that, for systems which will allow 
> multiple SPARQL/Update requests within a single transaction, the 
> requirement that "Each request should be treated atomically by a 
> SPARQL-Update service" is onerous. I don't know of too many systems 
> that support sub-transactions, and thus implementations will be forced 
> to take one of two routes:
> 
>    1. Violating the spec: "sorry pal, that doesn't apply: our 
> transactions have multi-request scope"
>    2. Annoying users: "sorry pal, we aborted your transaction because 
> SPARQL 1.1 says we have to, even though you wanted to handle it 
> yourself".
> 
> Neither choice is beneficial to users (the former because it reduces 
> their ability to rely on the spec). I'd suggest changing the language 
> to require that implementations provide "some method of atomically 
> executing the entire contents of a SPARQL/Update request", which 
> allows for the execution of a request within an existing transaction, 
> as well as for approaches that execute requests within their own new 
> transaction.
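A rough Python sketch of what the suggested wording would permit follows. The `Store` class and `execute_request` function are hypothetical and stand in for no particular implementation: a request either joins a transaction the caller already opened, or runs in its own new transaction that is rolled back on failure.

```python
class Store:
    """Toy triple store with snapshot-based transactions (hypothetical API)."""
    def __init__(self):
        self.triples = set()
        self._snapshot = None

    @property
    def in_transaction(self):
        return self._snapshot is not None

    def begin(self):
        self._snapshot = set(self.triples)

    def commit(self):
        self._snapshot = None

    def rollback(self):
        self.triples = self._snapshot
        self._snapshot = None

def execute_request(store, operations):
    """Run every operation of one update request.

    Inside a caller-owned transaction, errors propagate and the caller
    decides whether to roll back. Otherwise the request gets its own
    transaction and is rolled back as a unit on failure.
    """
    if store.in_transaction:
        for op in operations:
            op(store)
        return
    store.begin()
    try:
        for op in operations:
            op(store)
    except Exception:
        store.rollback()
        raise
    store.commit()

def insert(triple):
    return lambda store: store.triples.add(triple)

def failing_load(store):
    raise RuntimeError("LOAD failed")

# Standalone request: the failure undoes the insert.
standalone = Store()
try:
    execute_request(standalone, [insert(("a", "b", "c")), failing_load])
except RuntimeError:
    pass

# Multi-request transaction: the caller keeps control after the failure.
shared = Store()
shared.begin()
execute_request(shared, [insert(("x", "y", "z"))])
try:
    execute_request(shared, [insert(("a", "b", "c")), failing_load])
except RuntimeError:
    pass  # the caller could call shared.rollback() here instead
still_open = shared.in_transaction
```

In the second case the transaction stays open and the partial work remains visible inside it, so atomicity is available through the enclosing transaction and is not imposed per request.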
> 
> * There doesn't seem to be any mention at all of responses in the 
> draft. Is that intentional?
> 
> * Re LOAD: if we can have CREATE SILENT, how about LOAD SILENT, which 
> doesn't fail (and abort your transaction!) if the LOAD fails?
> 
> * I'd like to throw my 2¢ in for Issue 20.
> 
> It strikes me as a little short-sighted to assume that every store 
> operates with first-class graph objects, such that they can be created 
> and deleted in a closed-world fashion: not only does this conflict 
> with some implementations (e.g., those which use quad stores to 
> efficiently implement named graphs, and those which dynamically load 
> data from a graph on an ad hoc basis), but it also is dissonant with 
> the "triple stores are caches of the semantic web" open-world view.
> 
> I see in emails text like "We have agreed on the need to support a 
> graph that exists and is empty"[1]. I would like to see strong 
> supporting evidence for this in the spec (or some other persistent and 
> accessible place) before resolving this issue. I personally don't see 
> any need to distinguish an empty graph (after all, it's easy to add an 
> all-bnodes triple to it to make it non-empty but without excess 
> meaning).
> 
> I note that there is no proposal for CREATE SUBJECT (or PREDICATE or 
> OBJECT), nor CREATE LANGTAG. I see little point in unnecessarily 
> special-casing one value space to reduce its dynamism.
> 
> From interactions with users, I expect that "oh, you mean I have to 
> CREATE a graph before I can use it in an INSERT query?" will be a 
> common question, and "always preface your query with CREATE SILENT..." 
> the pervasive response. Seems like a waste of time to me.
> 
> (Regardless of the official outcome of the issue, my implementation is 
> unlikely to strictly follow the CREATE/DROP behavior, because it would 
> be inefficient to track graphs for the sole purpose of throwing errors 
> in edge cases. CREATE will be a no-op, and DROP will be identical to 
> CLEAR.)
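The behavior described in this parenthetical can be sketched in Python as below. The `QuadStore` class is a hypothetical illustration, not the author's implementation: a graph "exists" only while at least one quad names it, so CREATE has nothing to record and DROP is the same operation as CLEAR.

```python
class QuadStore:
    """Toy quad store in which graphs exist only implicitly, via their quads."""
    def __init__(self):
        self.quads = set()

    def insert(self, graph, s, p, o):
        self.quads.add((graph, s, p, o))

    def graphs(self):
        """Graph names currently in use; an empty graph is indistinguishable from none."""
        return {q[0] for q in self.quads}

    def create(self, graph):
        """No-op: there is no graph registry to update."""

    def clear(self, graph):
        self.quads = {q for q in self.quads if q[0] != graph}

    drop = clear  # DROP behaves identically to CLEAR

qs = QuadStore()
qs.create("ex:g")                    # leaves no trace
after_create = qs.graphs()
qs.insert("ex:g", "ex:s", "ex:p", "ex:o")
after_insert = qs.graphs()
qs.drop("ex:g")
after_drop = qs.graphs()
```

No CREATE is needed before the insert, and after the drop the store has no memory that `ex:g` was ever mentioned.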
> 
> Thanks for your time.
> 
> -Richard Newman
> 
> [1] <http://lists.w3.org/Archives/Public/public-rdf-dawg/2010JanMar/0070.html>
> 

Received on Saturday, 9 January 2010 01:27:07 UTC