Re: SPARQL 1.1 Update Review (part 1)

Hi all,

As I did some edits on Update, and thought along the way I coult also go through Andy's review (part1),
I addressed many of his comments (took me more time than I expected :-) )

Find below, what's covered by my changes and what probably still needs attention...
Those not yet addressed are marked with 
'* Open'

I will turn to part 2 of the review tomorrow hopefully. 

best,
Axel

On 9 Mar 2011, at 13:48, Andy Seaborne wrote:

> ==== Review: SPARQL 1.1 Update (part 1)
> Does not cover "4 SPARQL Update Formal Model" and beyond.
> 
> 
> [**] Major
> [*]  Minor
> []   Editorial
> 
> == 1 Introduction
> 
> [] Sentences 1 and 2 say much the same thing.
> 

done. I agree, and merged them.

> [*] Standard list of documents.
> 

done. added in Status section.

> [] Remove "rationale" para. (it's not a REC, it is a time-sensitive document)

done.

> 
> [*] Mention and link being related to SPARQL 1.1 Query

done. 

> 
> [] s/SPARQL 1.1 Uniform HTTP Protocol for Managing RDF graphs/SPARQL 1.1 RDF Dataset HTTP Protocol/

done. marked as "@@final name might still change"

> 
> [*] Be careful - links go to editors working drafts, not published versions
> 

done (as far as possible)

> 
> == 1.1.1 Language Form
> 
> [] s/This operations/The Operations/
> 

done.

> [*] "informative excerpts from the formal grammar"
> 
> They aren't excepts of the formal grammar - they use their own notation and terminology.
> Better: "illustrative forms" or some such.
> 

done.

> [**] The SPARQL 1.1 Query grammar link goes to SPARQL 1.0 REC grammar.

done.

> [**] Language form shown:
> 
> WHERE is not optional with INSERT nowadays.
> 

done. changed that to the language form for DELETE/INSERT

> [**] First example is bad syntax.
> 
> INSERT ==> INSERT DATA

done. made it 

 INSERT { <http://example/egbook3> dc:title  "This is an example title" } WHERE {}

 to keep it an instance of the language form (strictly speaking it still isn't, due to the prefix, but well.

> 
> 
> [] "Here, [] indicates an optional part of the syntax"
> As ()* and ()+ is used, isn't it clearer to use ()?
> 

* Open: I didn' change that yet, since it'd need a lot of changing from [] to ? (although I agree...)


> == 1.1.2 Terminology
> 
> [**] <b>must</b>
> 
> Only the formal section needs the "must" etc language now.  All mentions in the explanatory parts can be made normal language.
> 
> There is use of <b>MUST</b>.
> 

Do you mean in section 3? I didn't touch those yet, since somethings, such as e.g. sequence of update requests, 
are not yet mentioned/handled in Section 4.


> [*] Operation: single activity.
> 
> Because a request should be atomic, I found that describing this as a "single activity" confusing".
> I was looking for a definition of 'activity'
> Suggest: "Operation - an operation is the building block for request and is a single change unit e.g. ...."
> Or just say "an operation is one of ..."
> "action" also seemed like a good word.  "Activity" feels more ongoing.
> 
done. tried the following:

" 
 <li>Operation - An action to be performed that results in the modification of data in a <a href="#graphStore">graph store</a> expressible as a single command e.g. <a href="#insertData">INSERT</a> or <a href="#deleteData">DELETE</a>.</li>
"

> [*] Request is one or more operations.
> 
> Should we allow zero operations? The empty string? Why not?
> 

* Open: *shrug* no preference from my side, left that 'as is'.

> [*] Also used terminology:
> <uri>
> mean an absolute IRI according to RFC 3987.
> Not a relative URI with string "uri".

* Open: Well, anything against changing <uri> to IRIref linking to the SPARQL grammar here?

> == 2 Graph Store
> 
> [] "repository".  In the definition it's a "mutable container".  Should the same language be used?
> 

done.


> [*] "unnamed graph"
> Elsewhere it is "default". Need to be consist. "default" is better because the syntax uses "DEFAULT".
> 
done. changed to "one (unnamed) <em>default graph</em>"

> [*] "Like an RDF dataset"
> 
> A Graph Store is a collection of slots holding graphs.  An RDF dataset is a set of one graph and some pairs (<u>, G) so it's not a container. An RDF dataset is the value of a graph store at a moment in time (between requests).

done. Changed to:

Similar to an 
<a href="http://www.w3.org/TR/sparql11-query/#sparqlDataset">RDF Dataset</a> 
operated on by the <a href ="http://www.w3.org/TR/sparql11-query/">SPARQL 1.1 Query Language</a>, a Graph Store contains one (unnamed) slot holding a <em>default graph</em> and zero or more named slots holding <em>named graphs</em>.

> 
> [**] Unnamed graph as union.
> 
> I suggest removing all discussion of union because the doc does not define the actions on a UNION.  What happens when the union is updated? Where do the changes go? Only the union? Which named graph?

* Open: I lef the sentence "Depending on implementation, the unnamed graph may refer to a separate graph, or it could be a representation of a union of other graphs." untouched so far. I think it is important to say something along these lines in the document, although I am not sure whether here and in that way is the best way to to so... If someone has concrete rewording proposals or the majority agrees with Andy's suggestion to drop this, then also the later Note referring to deletion of the default graph in implementations doing that should go.

> [*] "same pay level domain"
> What does this mean?

* done. removed ", i.e. the graph URIs do not need to be in the same pay level domain as the endpoint." ... the next sentence explains anyways what is meant.

> [**] Simple case -> graph update language.
> 
> Not true - an operation can add a graph, either via INSERT...GRAPH or LOAD INTO.
> Suggestion:
> "SPARQL 1.1 Update" can be used as a graph update language."
> 
done. changed sentence to
"
<p>In the simple case, where there is one 
unnamed graph and no named graphs, SPARQL 1.1 Update can be used as a graph update language (as opposed to a graph store update language).</p>
"

> == 2.1
> 
> [] service != endpoint.
> A service may have several endpoints.

done. s/(often referred to as an endpoint)/(often referred to by an endpoint)/

> [*] Papa1: s/update operations/update requests/
> 

done.

> [] s/MAY/<b>may</b>/


done.

> 
> == 2.3 Entailment and Consistency
> 
> [*]  "An update is completed": request or operation?
> Only requests matter - an impl could (carefully) do operations in a different order if the effect is the same as the textual order but inconsistency checking is different.
> 
> If it's request is completed, where does the exception go?
> 

* Open: I am unclear about what to do with this, particularly, what is meant with "inconsistency" here exactly. Logical inconsinstency in terms of e.g. OWL? 

> [**] Does this all tie to the SPARQL protocol?
> 

* Open: not sure what to do here

> == 3 SPARQL 1.1 Update Language
> 
> [**] Graph Management: add text to cover ADD, MOVE, COPY

done. added: "as well as convenient shortcuts for graph uptate operations often used during graph management (to add, move, and copy graphs)

> [] "Multiple operations must be separated"
> "Multiple operations are separated" -- it's a statement of fact about the grammar.
> 

done.

> [**] Operations are executed in lexical order.
> See above.
> The effects are the same as being executed in lexical order.
> I can see
> LOAD <x1> INTO GRAPH <u1>
> LOAD <x2> INTO GRAPH <u2>
> LOAD <x3> INTO GRAPH <u3>
> being cleverly done.
> 
> [*] Operations are executed in lexical order.
> They aren't :-)
> 
> Requests are (logically)
>  INSERT ... WHERE ...
> and it does the WHERE first but it's later lexically.

* Open: not sure how to address that, any rewording proposals?

> 
> == 3.1 Graph Update
> [**] Graph update operations do not explicitly delete or create them.
> 
> What about this:
> DROP GRAPH <g> ;
> INSERT DATA { GRAPH <g> { <s> <p> <o> . }
> 
> is pretty explicit about a creation by context.
> 
> The text (bullet one) does immediately say this so the intro para is confusing.

done. Reworded to:

Graph update operations change existing graphs in the Graph Store but do not explicitly delete nor create them. Non-empty inserts into non-existing graphs will <em>implicitly</em> create those graphs, i.e., an implementation must create graphs that do not exist before triples were inserted into them, and may remove graphs that are left empty after triples are removed from them.


> 
> [**] The fundamental pattern-based operations ... are INSERT and DELETE.
> 
> The fundamental operation is the INSERT..DELETE..WHERE operation.
> Either s/operations/actions/ so INSERT and DELETE are actions (but as action is not defined this is not a nice change), or reword with INSERT..DELETE..WHERE being fundamental.
> 
> Given the activity comment above, defining "action" as an inserting or deleting and using that as the building block should work document-wide.
> 

* Open (somewhat): I did the change s/operations/actions/, but it is local to that paragraph. 

> [**] "The CLEAR operation removes all the triples in a graph."
> 
> CLEAR ALL, CLEAR NAMED work on more than one graph.
> 

done. changed to :
   "The <code>CLEAR</code> operation removes all the triples in (one or more) graphs."

> == 3.1.1 INSERT DATA
> [] s/Insert data into a graph/Insert triples into graphs/
> 
> It can work on more than one graph in one INSERT DATA.

done.

> 
> [*] "graph_triples are defined".
> 
> Later, sec 3.1.2 they are "described".
> Is this a definitive definition?
>  If yes, then put in formal section and link.
>  If no, don't say "defined"
> 

done. Changed to the actual grammar graph_triples no longer used.


> [] Provide an example using GRAPH.
> Don't rely on fwd reference to 3.1.2
> 

* Open

> [**] "Default graph of the graph store"

* Open? not sure... if that was only about capitalization, then I have changed it.

> 
> == 3.1.2 DELETE DATA
> 
> [] s/Delete data from a graph/Delete triples from graphs/
> 

done.

> [**] "The graph_triples need to be matched to existing triples precisely"
> graph_triples as a whole does not need to be matched.  It isn't "matched".  The formal section does not cover DELETE DATA but for DELETE it uses a set diff, so the triple does not need to exist.  It just isn't there afterwards.

done. reworded to:
"
<p>The <code><em>QuadPattern</em></code> denotes existing triples to be removed. As with INSERT DATA, DELETE DATA is meant for deletion of ground triples data; <code><em>QuadPattern</em></code> that contain variables will not match/delete any triples in DELETE DATA requests. Since blank node labels are only unique within each specific context, blank nodes in the <code><em>QuadPattern</em></code> will not match existing data either in DELETE DATA requests. The <a href="#deleteInsert">DELETE/INSERT</a> and <a href="#delete">DELETE</a> operations must be used to remove triples containing blank nodes.</p>
"


> I think it is really saying no variables, no patterns, but this is not clear.

The grammar doesn't forbid variables at this point, does it?
I don't think we have to forbid them, triples with variables can simply be ignored.
The formal part now should reflect this.

> 
> [*] "processed" -- what does this mean?  Suggest removing the sentence.
> 

done. I was hesitant to remove the sentence alltogether, but changed it to:
<p>Note that the deletion of non-existing triples has no effect, i.e. triples in the <code><em>QuadPattern</em></code> that did not exist in the graph store are ignored.</p>


> Example 2:
> "This .. expands ..." confusing - it uses different example data and patterns.
> 
done. reworded.

> == 3.1.3 DELETE/INSERT
> 
> Suggestion: Make DELETE and INSERT are 3.1.3.1 and 3.1.3.2 respectively.
> Suggestion: 3.1.3.1. and 3.1.3.2 are examples, and omit the descriptive text that repeats what went before.
> 

* Open: I tried to make them <div4>, but there's a problem with the numbering then... Does that need to be fixed in the template?

> [*] "defines a specific graph" -- confusing, it allows a variable so it (syntax) isn't a specific graph.
> 

I think this is void with my last changes.

> [**] General comment:
> "This operation identifies data binding values to a set of variables."
> This can be read as meaning one match, not a sequence of matches.  This language is also used elsewhere and reads as if there is one match only.  Should talk about sequences of solutions.
> 

done. reworded:
 "<p>This operation identifies data with the <code>WHERE</code> clause, which will be used to compute solution sequences of bindings for a set of variables. The bindings for each solution are then substituted into the <code>DELETE</code> template to remove triples, and then in the <code>INSERT</code> template to create new triples.</p>"

 
> [*] Give a simple example of DELETE/INSERT before getting to discussing WITH, USING.
> Overall, there is a lot of use of WITH/USING and I was left with the feeling this is preferred style or necessary style, but it isn't. Suggest removing from the majority of examples, where it isn't necessary for the point being made, would be clearer.

* Open
> 
> [**] "Any remaining portions of the GroupGraphPattern which are not assigned a dataset "
> 
> Assigned a graph?  Also, it isn't assignment.  Maybe "matched against".

done. reworded to:

  "Any remaining portions of the <code><em>GroupGraphPattern</em></code> which are not in the scope of a <code>GRAPH</code> clause will be matched against the graph specified in the <code>WITH</code> clause, if present, or the default graph of the graph store otherwise.</p>"

> 
> [] s/inserted or deleted/deleted or inserted/ as that is action-order.
> 

done.


> [*] "When WITH is not present, that fallback will be the default graph."
> 

* done. Changed: 

s/
<code>WITH</code> provides a fallback to specify the graph to use when one is not explicitly stipulated via <code>GRAPH</code> or <code>USING</code>. When <code>WITH</code> is not present, that fallback will be the default graph.
/
<code>WITH</code> provides a fallback to specify a graph (different from the default graph) to use when one is not explicitly stipulated via <code>GRAPH</code> or <code>USING</code>.
/ 

?

> This over-emphasises the use/need for WITH.  It's not a fallback - it's the normal usage IMHO.
> 
> [*] "If an operation intends to create a graph"
> 
> Users can "intend", operations can't!
> 
done.

> Seems a long an complicated way of saying
> 
> INSERT ...{ GRAPH <g> {} } ...
> does not create <g>.
> 
> See also INSERT DATA.

Added some explanations on that also in INSERT DATA.

> 
> [*] Example:
> 
> Immediately uses WITH.

* Open

> 
> [*] Show an example where the WHERE has multiple matches.
> 

* Open

> == 3.1.4.
> 
> Structural comment above.
> 
> [] s/SPARQL - Query/SPARQL 1.1 Query/

done.
> 
> [*] "The resulting variable bindings" =>
> "The resulting a sequence of solutions to the WHERE clause"
> 

done.

> Example 2:
> 
> [**] Bad syntax
> s/USING/WITH/ ??
> 

done.


> [**] Bad action:
> Unbound variables in the DELETE clause ==> no effect.
> 
> To be honest, I'm tempted to suggest that DELETE templates allow and match with unbound variables.  The case of
> 
> DELETE { ?x foaf:name ?name } WHERE { ?x foaf:mbox_sha1 ?foo }
> for
> DELETE { ?x foaf:name ?name } WHERE { ?x foaf:mbox_sha1 ?foo . ?x foaf:name ?name }
> is reasonably compelling.

* Open: I fixed the example to work (i think) as intended, but changing unbound variables to behave universal seems to bring us pretty much to a similar
  discuss as we had for bnodes in DELETE... I am unsure whether we want to go down that road at this stage.


> == 3.1.5
> 
> Structural comment above.
> 
> <http://example/bookx> dc:date "2002-01-01T00:00:00-2:00" .
> [**]  No ^^xsd:dateTime

done. Added the missing ^^xsd:dateTime (at several places)

> The WHERE clause does not match.
> 
> The book has date in 2002? 1948 surely!
> 
> And it's one "P" in "David Coperfield" by Edmund Wells.
> http://www.inprint.co.uk/thebookguide/bookshop-skit.htm
> 
> This example is trying too hard.

* Open: I didn't check the rest of the example...

> == 3.1.6 DELETE WHERE
> 
> [] "Any templates" -- confusing as we are discussing a modify_template
> 

done. That has been reworded.

> [] Example 2.  Remove DOT after } -- correct but very unusual.
> 

done.

> == 3.1.7 LOAD
> 
> [*] s/protocol/URI scheme/
> 

done.

> [] Don't mention file: -- it's rather weakly defined and used in practice in ways outside the strict standard. Mentioning it is just trouble.
> 

* Open: didn't want to touch that... Alex/Paul some particulr reason you wanted to have that in?

> [**] "This URI could be a reference to a query"
> 
> It includes the query, it's not a reference.
> 
> Just remove discussion of using CONSTRUCT in a document URI.

* done. removed "This URI could be a reference to a query sent to a SPARQL endpoint, as soon as it provides a set of statements that can be parsed and loaded into the store (e.g. CONSTRUCT)."
> 
> [**] "In case no data can be retrieved."
> What about an empty graph?

* Open: good point! I did the following for the moment, let's see whether that flies with everybody:

<p>In case no RDF data can be retrieved (as opposed to the empty graph being retrieved) from <code>documentURI</code>, the SPARQL 1.1 Update service is expected to return failure. In any other case, it will always return success. If <code>SILENT</code> is present, the result of the operation will always be success. In case the graph named &lt;<strong>uri</strong>&gt; does not exist in the graph store, and the reteived graph is non-empty, it will be created; an empty graph being retrieved will not create a new graph in the graph store.</p>

BTW, I assumed that LOAD creates a graph when it doesn't exist beforehand.

> 
> == 3.1.8
> 
> [] "all triples of all named"
> 
> Either use "triples in" or "triples of" but be consistent.  It was "of" in the first part of the sentence.  To me, "in" is better.

done (where I found it)
> 
> [*] Example:
> If giving defining forms, please use one without short forms.
> 
> DELETE { GRAPH <uri> { ?s ?p ?o } } WHERE { GRAPH <uri> { ?s ?p ?o } }
> 

done.

> [**] "far reaching implications"
> Which are what?
> 
> Just don't discuss it - the spec is not defining the effect.
> 

* Open: I just didn't want to remove it yet, but I am fine to remove it (along with the comment further above)

> == 3.2 Graph Management
> 
> [**] "Graph management operations create and destroy named graphs in the Graph Store. "
> 
> Revise to include effects of ADD, MOVE, COPY.

done. reworded that all a bit:
"<p>Graph management operations allow to create, destroy, move and copy named graphs in the Graph Store, or add the contents of one graph to another. Operations for creation and destruction are not required to result in any actions, since graph stores are not required to support persistent empty named graphs.</p>
"

> 
> DROP works on the default graph.
> 

yes, but has  the same effect as clear (see formal semantics)

> [] Format similarly to 3.1 with a list mentioning each operation in this section.
> 

* Open

> == 3.2.1 CREATE
> 
> "Stores that do not record empty graphs will always return success."
> 
> So if the store has graph <g> with triple <s> <p> <o> and the request is
> 
> CREATE GRAPH <g>
> 
> what has empty graphs got to do with it?  This sentence is out of place but I don't know what it's place should be.
> 
* Open: for now made a guess and reworded to

"
<p>Stores that do not record empty named graphs will always return <em>success</em> on creation of a non-existing graph.</p>
"

which I think was the intention?

> == 3.2.2 DROP
> 
> [*] "removes the specified graph" => "removed the specified graphs"
> 
> DROP ALL acts on multiple graphs.
> 

done.

> This has a MUST in upper case.  See earlier and the termininolgy says "must".
> 

* Open 
> [**] Editors note in document must be removed before LC.
> 

* Open 

> [*] Note that DROP DEFAULT is a clear.  This is only there in a confusing way.
> 

* done. that's now said explicitly.
> == 3.2.3 COPY
> 
> Now, the notation is using (|) for alternation but this is not in 1.1.2
> 

done. I added that in section 1.1.1

> "Is equivalent to" example
> includes "GRAPH DEFAULT" in the INSERT which is illegal in a template or pattern.
> 

* Open? ... admittedly still handwaving, but changed this to :

  DROP SILENT (GRAPH <toURI> | DEFAULT);
  INSERT { ( GRAPH <toURI> | '') { ?s ?p ?o } } WHERE { ( GRAPH <fromURI> | '') { ?s ?p ?o } }

 better?

> [**] "If the destination does not exist, it will be created"
> Didn't we decide that some stores may require explicit CREATE, and that's why we have SILENT?

* Open: Hmmm, we said for INSERT before that it will do implicit CREATE, or no? I am not sure whether I misread
    "If an operation tries to insert into a graph that does not exist, then that graph must be created."
  but I read it as implicit create... Honestly, if we make a distinction here, and don't do implicit creation, I am worried the whole formal semantics definition will become trickier, I wouldn't opt for going down that road at this stage.

> [*] Put more information into initial destination graph to make it clear it's lost by doing a COPY.

* done.

> === 3.2.4 MOVE
> 
> [] Inconsistent bold.  Now "[" and "]" are bold face.
> 

* done.

> [**] GRAPH DEFAULT again.


* done.
> 
> == 3.2.5 ADD
> 
> [**] GRAPH DEFAULT again.
> 


* done.

Received on Monday, 28 March 2011 18:06:53 UTC