Re: Review of "SPARQL 1.1 Update"

To pull out one significant outcome of the discussion below:

A shortform pattern delete, using the syntax that requires ";" separators:

   DELETE { :x :p 123 . :x :q 456 }

is confusingly like

   DELETE DATA { :x :p 123 . :x :q 456 }

but does different things.  Discussion below.

	Andy


On 11/01/2010 4:56 PM, Axel Polleres wrote:
> Thanks Andy for clarification, right, what I had in mind with ADD was indeed what we already have with INSERT DATA ...
> was just puzzled by the "asymmetry" of shortcuts without having really thought it through.
>
> So, trying to summarise, the options on the table are the following:
>
> (DELETE {P1})?
> (INSERT {P2})?
> WHERE {P3}
>
> where at least one of the INSERT or DELETE parts needs to be present; do I understand that correctly as the common understanding of the long form?

Yes - full form modify - it would be two grammar rules:

DELETE {T1} (INSERT {T2} )? WHERE {P}

INSERT {T1} WHERE {P}

where T* is a template (inc GRAPH) and P is a SPARQL query pattern.

(Aside: it's easiest if "DELETE WHERE" (with variable whitespace) is a 
token in its own right, so there is no confusion with DELETE above. It 
does not have to be done like that - it might be seen as clearer to the 
parser writer if it is.)

> Shortcut options
>
> OPTION1:
>   * DELETE {P} for DELETE P WHERE P, but enforcing separators ";" for avoiding ambiguity

Yes - presumably ";" everywhere as a separator (trailing one can be 
missed out)

LOAD <myData.ttl> ;
INSERT DATA { <#me> foaf:name "Me" } ;
INSERT { <#me> foaf:knows ?x } WHERE { <#otherMe> foaf:known ?x } ;
DELETE { <#otherMe> foaf:known ?x } ;
LOAD <someMoreData.ttl> INTO <foo>

> OPTION2:
>   * new keyword REMOVE, i.e. REMOVE {P} for "DELETE P WHERE {P}"

Yes - and {P} here is a template-like pattern not a full pattern.

(not a syntax issue)
A template-like pattern is a triples pattern + GRAPH and where bNodes 
are variables, not ground terms, unlike a CONSTRUCT template.

> OPTION3:
>   * Make  '{P1}' optional but require 'WHERE', i.e.
>    DELETE WHERE {P1} for DELETE {P} WHERE {P}

Again, "DELETE WHERE" takes a template-like pattern, not a full pattern.

DELETE WHERE { <#alice> foaf:knows ?y }

> Anything I forgot?
>
> ============================================================================
> More remarks:

(non-syntax issues)

> * some potential other issue for the next round:
>   "INSERT DATA {P}" could in fact be viewed as a "shortcut" for "INSERT {P} WHERE {}", yes?

It could be viewed as such (DELETE is different) but there is a 
restriction on {P} to be triples with no variables.  Making DELETE DATA 
and INSERT DATA a pair seems clearer to me.

INSERT { ?x :p ?q } WHERE {}

is legal, if pointless.  c.f. CONSTRUCT.

INSERT DATA { ?x :p ?q } is a parser error.

From an implementation point of view, knowing what follows is data 
makes a big difference.

At the point where the parser reaches "INSERT DATA {" it knows that only 
ground triples (quads) follow until the closing }.  These might be sent 
straight to the graph store, or at least streamed to a temporary 
location which only has to be read back once.  So even a very large 
DATA {} block can be handled.

If it's a pattern, with variables, it's harder - the pattern may need to 
be run against a query pattern WHERE {...} so it can't be streamed to 
the store.
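
As a rough illustration of the streaming point (a sketch only, not taken from any implementation): once the parser knows it is inside a DATA block, each ground triple can be handed to the store as it is read, and a variable can be rejected on sight. The function names and the tuple-of-strings representation of a triple are assumptions made for this example.

```python
from typing import Iterable, Iterator, Set, Tuple

Triple = Tuple[str, str, str]


def is_variable(term: str) -> bool:
    # Assumption for this sketch: variables are written ?name or $name.
    return term.startswith("?") or term.startswith("$")


def stream_insert_data(triples: Iterable[Triple], store: Set[Triple]) -> int:
    """Add ground triples to the store one at a time; return how many were read."""
    count = 0
    for triple in triples:
        if any(is_variable(term) for term in triple):
            # Mirrors "INSERT DATA { ?x :p ?q } is a parser error".
            raise ValueError(f"variable in DATA block: {triple}")
        store.add(triple)  # the whole block never has to be held in memory
        count += 1
    return count


def generated_block(n: int) -> Iterator[Triple]:
    # Stand-in for a very large DATA block arriving from a parser.
    for i in range(n):
        yield (":x", ":p", str(i))


store: Set[Triple] = set()
read = stream_insert_data(generated_block(1000), store)
```

A template with variables cannot be treated this way, because it has to be instantiated against the solutions of the WHERE pattern before anything can be written.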

Despite having HTTP POST/PUT, I think we still need a scalable data load 
operation in the SPARQL 1.1 Update.

> This raises the question whether DELETE DATA {P} isn't actually to be viewed as a restricted version of
> the SHORTCUT we discuss above where P is ground, isn't it?

Good point - it's not a further restriction of DELETE WHERE {T}

(It's closer to DELETE {Ti} WHERE {Ti} for each Ti a triple or DELETE 
WHERE {Ti} - bnodes are still different as to whether they are variables 
or ground terms (and can't be in the data c.f. <_:label>).)

> That is, the question is whether DELETE DATA P
> is the same as writing "DELETE WHERE {P}".

Same issues of data vs pattern.  An implementation can "stream delete".

DELETE DATA removes the triples even if others in the block don't exist.

> Of course this very much depends on how the semantics of DELETE DATA is to be defined, i.e. is DELETE DATA
> also successful, if only a part of it matches?

It does not "match" as a whole - it results in the graph not containing 
those triples afterwards regardless of whether they were there before.

For consistent interpretation, it is better to think of INSERT DATA and 
DELETE DATA as a pair of opposites.  INSERT is adding triples, DELETE 
removing them; no "matching" involved.
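
One way to picture the "pair of opposites" reading is as plain set operations on the graph. This is a minimal sketch under that assumption, with hypothetical function names and triples written as tuples of strings; it is not the specification's formal definition.

```python
from typing import FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def insert_data(graph: Graph, triples: Iterable[Triple]) -> Graph:
    # Afterwards the graph contains these triples, whether or not it did before.
    return graph | frozenset(triples)


def delete_data(graph: Graph, triples: Iterable[Triple]) -> Graph:
    # Afterwards the graph does not contain these triples, whether or not it did before.
    return graph - frozenset(triples)


t1 = (":x", ":p", "123")
t2 = (":x", ":q", "456")

graph: Graph = frozenset({t1})
after_delete = delete_data(graph, [t1, t2])       # t2 was never present; that is not a failure
deleted_twice = delete_data(after_delete, [t1, t2])
restored = insert_data(after_delete, [t1])
```

Neither operation can "fail to match": repeating either one leaves the graph unchanged.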

Using ADD/REMOVE for these was in a very early (pre-submission) text - 
the preference was for DELETE DATA and INSERT DATA.

> Assume: the default graph (store) has only one triple  t1 and I call:
>    DELETE {t1 t2}
> Also here  there are two options for dealing with this situation:
>   i) the DELETE succeeds and deletes the 1st triple, or succeeds with a warning (this behavior would indeed be different from DELETE WHERE {P} )
>   ii) the DELETE fails (however, actually this behavior would make DELETE DATA
>       redundant wrt. the DELETE P WHERE P shortcut, however it looks)
>
> So, if i) was intended, at least I haven't yet seen this aspect discussed/mentioned in the current draft,
> (hope I haven't overlooked this having been discussed already again...looking at the current Web version update).

Not quite sure which DELETE you mean: DELETE DATA or DELETE shortcut 
with pattern (DELETE WHERE)


The cases are:
1/  DELETE WHERE {t1 t2}
2/  DELETE DATA {t1 t2}
3/  DELETE { ?s ?p ?o } WHERE { ?s ?p ?o FILTER (?s ... ?p ... ?o ) }
4/  DELETE {t1 t2} WHERE {}

1 => does not match as a BGP - no effect
2 => deletes t1, results in empty graph.
3 => very messy filter: it is written with either || or &&, and the two behave differently.
4 => deletes t1, results in empty graph.
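
Cases 1, 2 and 4 can be checked with a small model. This is only a sketch under stated assumptions: the graph is a Python set of string tuples, variables are terms starting with "?", and the function names are invented for the example. Case 3 is left out because its outcome depends on how the filter is written.

```python
def match_bgp(graph, patterns, binding=None):
    """Yield every variable binding under which all patterns are in the graph."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        new = dict(binding)
        ok = True
        for p_term, g_term in zip(first, triple):
            if p_term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(p_term, g_term) != g_term:
                    ok = False
                    break
            elif p_term != g_term:
                ok = False
                break
        if ok:
            yield from match_bgp(graph, rest, new)


def instantiate(template, binding):
    # Terms left unbound stay as "?name" and so never equal a triple in the graph.
    return {tuple(binding.get(t, t) for t in triple) for triple in template}


def delete_where(graph, patterns):
    # Case 1: the block is both the pattern and the template.
    to_delete = set()
    for b in match_bgp(graph, patterns):
        to_delete |= instantiate(patterns, b)
    return graph - to_delete


def delete_data(graph, triples):
    # Case 2: plain removal, no matching.
    return graph - set(triples)


def delete_with_empty_where(graph, template):
    # Case 4: WHERE {} has exactly one solution, with no bindings.
    return graph - instantiate(template, {})


t1 = (":x", ":p", "123")
t2 = (":x", ":q", "456")
graph = {t1}

case1 = delete_where(graph, [t1, t2])
case2 = delete_data(graph, [t1, t2])
case4 = delete_with_empty_where(graph, [t1, t2])
```

In this model case1 leaves the graph as it was, because the two-triple block does not match as a whole, while case2 and case4 both leave it empty.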

Incidentally, seeing that, I now dislike a shortform pattern delete of

DELETE {t1 t2}

because of the similarity with deleting data.

> * Whatever way we choose, we have other open issues on what such request should return, but this is then more an issue for protocol,
> e.g. the number of added/deleted triples could be informative.

Firstly, it's hard to return anything because there may be multiple 
operations in one request, each with numbers to return.

Secondly, some systems might not necessarily know how many are "new" 
triples as opposed to replacing triples, when doing bulk inserts.

	Andy

>
> Axel
>

Received on Tuesday, 12 January 2010 10:56:36 UTC