Review of SPARQL Update submission (ACTION-53) from Simon K Johnston on 2009-07-21 (public-rdf-dawg@w3.org from July to September 2009)

From: Simon K Johnston <skjohn@us.ibm.com>
Date: Mon, 20 Jul 2009 21:33:48 -0400
To: public-rdf-dawg@w3.org
Message-ID: <OFF46555B3.A34FDAF9-ON852575FA.00020120-852575FA.0008967A@us.ibm.com>
Below are my comments on the UPDATE submission
(http://www.w3.org/Submission/2008/SUBM-SPARQL-Update-20080715/).

In general I think the submission is in good shape, and I did review some 
of the recent comments (from Axel, Andy and Luke) and feel that the 
discussion around the submission is also heading in the right direction. 
Specific comments follow.

1.1 Scope and Limitations

I would like to see something specifically added on the transaction 
discussion, to the effect that the language does not prescribe any 
transaction model and no guarantees are provided or implied by this 
specification (allowing implementers to add transactions to their products 
as a value-add if they wish).

2 Examples

It feels as if an example showing LOAD would be valuable. As noted in some 
of the comments, we expect LOAD to be a key feature in bulk insert 
scenarios, and introducing it early makes it feel a peer to the other more 
common actions. My fear is that leaving it out of this introductory 
section makes it feel second-class.

As per the comment from Axel in his review I think the addition of a MOVE 
would be valuable. For people used to SQL the notion of doing an INSERT 
and DELETE feels natural, but the main reason for that is that they can 
wrap the two actions in a single transaction. As noted above we've taken 
transactions away from the language and so providing an action that an 
implementer can optimize for atomicity seems valuable. The provision of 
MOVE would, for example, allow a store backed by an RDBMS to wrap the 
underlying SQL actions in a transaction so that the MOVE is atomic, 
whereas an INSERT followed by DELETE could not guarantee an atomic 
operation.
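
To make the atomicity point concrete, here is a minimal sketch of how an 
RDBMS-backed store might implement a MOVE. Everything in it is my own 
assumption for illustration: the single "quads" table, the move_triples 
helper and the fail_midway switch are hypothetical and do not come from 
the submission.

```python
import sqlite3


def make_store():
    # Hypothetical layout: one row per triple, tagged with its graph URI.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE quads (g TEXT, s TEXT, p TEXT, o TEXT)")
    conn.commit()
    return conn


def move_triples(conn, src, dst, fail_midway=False):
    """Copy all triples from graph src to graph dst, then delete them from src.

    Both statements run inside one transaction: "with conn" commits if the
    block succeeds and rolls back if it raises.
    """
    with conn:
        conn.execute(
            "INSERT INTO quads (g, s, p, o) "
            "SELECT ?, s, p, o FROM quads WHERE g = ?",
            (dst, src),
        )
        if fail_midway:
            # Simulates a crash between the INSERT and the DELETE.
            raise RuntimeError("simulated failure between INSERT and DELETE")
        conn.execute("DELETE FROM quads WHERE g = ?", (src,))
```

If the failure is triggered, the rollback leaves the triples only in the 
source graph. A client issuing a separate INSERT and then a DELETE through 
a language with no transaction model has no way to ask for that behaviour.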

4.1 Graph Update

I was at first tempted to agree with Axel on the separation of INSERT DATA 
and INSERT INTO, but Andy's comments on the ability for processors to deal 
with INSERT DATA in a streaming fashion seemed a reasonable argument. I 
certainly don't think the requirement to support the two forms is an 
onerous one for implementers, and if it allows them to perform additional 
optimization it seems an acceptable "overhead".

On graph existence, we have different wording in different subsections:

4.1.3 - "If the operation is on a graph that does not exist, an error is 
generated"
4.1.4 - "The graph URI, if present, must be a valid named graph in the 
Graph Store"
4.1.5 - "The graph URI, if present, must be a valid named graph in the 
Graph Store"

and also:

4.2.1 - "service generates an error if the graph referred by the URI already 
exists"
4.2.2 - "service, by default, is expected to generate an error if the 
specified named graph does not exist"

4.1.1, 4.1.2, 4.1.6 and 4.1.7 say nothing about this.

It seems to me that either a paragraph should be added in 4.1/4.2 to cover 
all cases, or consistent wording should be used in all the subsections.
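
As a rough sketch of what "one rule for all cases" could look like in an 
implementation, the toy store below routes every operation through a 
single existence check. The class, method and exception names are 
hypothetical, and the policy chosen (always raise an error for a missing 
graph) is just one option picked for illustration, not what the 
submission mandates.

```python
class GraphDoesNotExist(Exception):
    pass


class GraphAlreadyExists(Exception):
    pass


class GraphStore:
    """Toy in-memory store: graph URI -> set of (s, p, o) triples."""

    def __init__(self):
        self.graphs = {}

    def _require(self, uri):
        # The single place where the "graph must exist" rule is enforced.
        if uri not in self.graphs:
            raise GraphDoesNotExist(uri)
        return self.graphs[uri]

    def create(self, uri):
        if uri in self.graphs:
            raise GraphAlreadyExists(uri)
        self.graphs[uri] = set()

    def drop(self, uri):
        self._require(uri)
        del self.graphs[uri]

    def insert_data(self, uri, triples):
        self._require(uri).update(triples)

    def delete_data(self, uri, triples):
        self._require(uri).difference_update(triples)

    def clear(self, uri):
        self._require(uri).clear()
```

With the check in one helper, every update operation reports a missing 
graph in the same way, which is the consistency I am asking for in the 
text.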

4.1.7 CLEAR

In regard to the statement:

        "This operation does not remove the graph from the Graph Store."

I would like to see some clarification on the meaning here: how would the 
fact that the graph still exists affect other operations, queries, etc.?
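
To show the kind of observable difference I have in mind, here is a small 
sketch under one possible reading, in which a cleared graph stays 
registered as an empty graph. The dictionary-based store and the four 
helper functions are hypothetical and only serve to illustrate the 
question.

```python
# Hypothetical in-memory graph store: graph URI -> set of (s, p, o) triples.

def clear(store, uri):
    """Remove every triple but keep the (now empty) graph registered."""
    store[uri].clear()


def drop(store, uri):
    """Remove the graph itself from the store."""
    del store[uri]


def create(store, uri):
    """Register a new empty graph; fail if the URI is already taken."""
    if uri in store:
        raise ValueError(f"graph already exists: {uri}")
    store[uri] = set()


def named_graphs(store):
    """List the graph URIs the store currently knows about."""
    return sorted(store)
```

Under this reading, a graph that has been cleared is still returned by 
named_graphs and a subsequent create on the same URI fails, whereas after 
drop the URI is gone and create succeeds. It would help if the document 
said whether these are the intended consequences.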

5 Security Issues

It seems that this is really out-of-band for this specification; again if 
we look at SQL or XQuery, an implementer may have a store that has 
security constraints and the failure cases are signalled by the API as 
error conditions. In the update protocol we can be clear about the error 
reporting cases, but in this specification we are not defining the API and 
so it feels out of scope. Also, any attempt to get into details may end up 
prescribing the security model itself, which is definitely out of scope.

Thanks,

Simon K. Johnston (skjohn@us.ibm.com) - STSM, Jazz Foundation Services
Mobile: +1 (919) 200-9973
Office: +1 (919) 595-0786 (tie-line: 268-6838)
Blog: http://www.ibm.com/developerworks/blogs/page/johnston
Received on Tuesday, 21 July 2009 01:34:34 UTC