Re: SPARQL 1.1 Update Review (part 2) from Axel Polleres on 2011-03-16 (public-rdf-dawg@w3.org from January to March 2011)

From: Axel Polleres <axel.polleres@deri.org>
Date: Wed, 16 Mar 2011 15:39:20 +0000
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: "SPARQL Working Group" <public-rdf-dawg@w3.org>
Message-Id: <584F0D66-9856-48AC-9FA1-8E2FDF4D3819@deri.org>
Hi Andy, all,

> [*] An RDF dataset is a set { DG, (<u_i>, G_i)} -- write it same as
> query has it, not "DG' union {(iri'j, G'j) | 1 <= j <= m})"


Indeed, an RDF dataset is a set:

            { G, (<u1>, G1), (<u2>, G2), ... (<un>, Gn) }

which is just the same as writing

           { G } union  {(iri'j, G'j) | 1 <= j <= n }

so in the union form G would at least need to be wrapped in braces, I see.

BTW, I think we should probably just unify the definitions of Graph Store and Dataset.


Next, I was thinking a bit about the following:

>> Dataset(modify_template, P) = {  instantiate(modify_template)  |  μ a solution of P }
>> 
>> instantiate(modify_template) = ....
> 

I couldn't really come up with a definition of instantiate(),
but, inspired by your suggestion, I think something like the following would work:

----------------------------------------------------------------------

=======================================
Auxiliary Definition: Dataset(modify_template, μ)

Let μ be a solution mapping.

* For a modify_template of the form '{ TriplesBlock }',
  Dataset(modify_template, μ)
  is the Dataset consisting of only a default graph composed of
  all valid RDF triples obtained by substituting the variables in
  TriplesBlock according to μ and combining the triples
  into a single RDF graph by set union.

* For a modify_template of the form 'GRAPH VarOrIRIref { TriplesBlock }',
  Dataset(modify_template, μ)
  is the Dataset consisting of the empty default graph and a named
  graph μ(VarOrIRIref) composed of all valid RDF triples obtained by substituting
  the variables in TriplesBlock according to μ and combining the triples
  into a single named RDF graph by set union.

* For a complex modify_template of the form '{ modify_template1 modify_template2 }',
  Dataset(modify_template, μ) = Dataset-UNION( Dataset(modify_template1, μ), Dataset(modify_template2, μ) )
=======================================
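
To make the three cases above concrete, here is a rough Python sketch. It is only an illustration, not proposed spec text. The representation is my own assumption: terms are plain strings, variables start with "?", a dataset is a dict from graph name to a set of triples (None keys the default graph), and a template is a list of (graph, triples_block) pairs. "Valid RDF triples" is simplified here to "no unbound variable left", and a GRAPH clause whose variable is unbound is skipped.

```python
from typing import Dict, List, Optional, Set, Tuple

Term = str                                   # "?x" marks a variable (assumption)
Triple = Tuple[Term, Term, Term]
Dataset = Dict[Optional[str], Set[Triple]]   # None keys the default graph

def substitute(term: Term, mu: Dict[str, Term]) -> Optional[Term]:
    """Apply the solution mapping mu to one term; None if the variable is unbound."""
    return mu.get(term) if term.startswith("?") else term

def instantiate_block(block: List[Triple], mu: Dict[str, Term]) -> Set[Triple]:
    """Substitute variables in a TriplesBlock, keeping only fully bound triples."""
    graph = set()
    for pattern in block:
        triple = tuple(substitute(t, mu) for t in pattern)
        if None not in triple:
            graph.add(triple)
    return graph

def dataset_union(*datasets: Dataset) -> Dataset:
    """Graph-wise set union; blank nodes are NOT renamed apart."""
    result: Dataset = {None: set()}
    for ds in datasets:
        for name, graph in ds.items():
            result.setdefault(name, set()).update(graph)
    return result

def dataset_for(template, mu: Dict[str, Term]) -> Dataset:
    """Dataset(modify_template, mu) for a list of (graph, triples_block) pairs."""
    parts = []
    for graph, block in template:
        name = None if graph is None else substitute(graph, mu)
        if graph is not None and name is None:
            continue                         # unbound graph variable: skipped (assumption)
        parts.append({None: set(), name: instantiate_block(block, mu)})
    return dataset_union(*parts)

mu = {"?s": "ex:a", "?g": "ex:g1"}
template = [
    (None, [("?s", "ex:p", "ex:o"), ("?s", "ex:q", "?unbound")]),
    ("?g", [("?s", "ex:p", "ex:o2")]),
]
ds = dataset_for(template, mu)
```

The triple containing ?unbound is dropped, and the GRAPH ?g part ends up in the named graph ex:g1.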


=======================================
Definition: Dataset(modify_template, P, GS )

Let sk() be a bijection that replaces every bnode identifier in the graph store GS with a unique fresh constant,
and let sk^-1() be the inverse mapping of sk(), reintroducing the original bnode labels.

 Dataset(modify_template, P, GS ) = 
     Dataset-UNION( sk^-1( Dataset(modify_template, μ) ) ) over all μ such that μ is a solution of P over Dataset sk(GS) 

=======================================

Here, applying sk() prior to query evaluation guarantees that co-referent bnode identifiers in GS are 
not "lost" during pattern evaluation, cf. Section 17.3.2 (Treatment of Blank Nodes) of SPARQL 1.1 Query.

----------------------------------------------------------------------

The functions sk and sk^-1 are needed to address the problem we discussed in [1].

I can also attempt to put that into the XML form needed for the Update document.
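
To show where sk and sk^-1 sit in the pipeline, here is a small Python sketch. Everything in it is an assumption made for illustration: bnode labels are strings starting with "_:", the fresh constants are "urn:sk:N" strings, P is a single triple pattern evaluated over the default graph only, and the template only targets the default graph. A toy matcher like this one would not lose bnode labels by itself; the point is only the order of the steps (skolemize, evaluate, instantiate, de-skolemize).

```python
from itertools import count

def make_sk(graph_store):
    """Return (sk, sk_inv) for all bnode labels ("_:...") found in the store."""
    fresh = count()
    forward = {}
    for graph in graph_store.values():
        for triple in graph:
            for term in triple:
                if term.startswith("_:") and term not in forward:
                    forward[term] = f"urn:sk:{next(fresh)}"   # unique fresh constant
    backward = {const: bnode for bnode, const in forward.items()}

    def rename(dataset, table):
        return {name: {tuple(table.get(t, t) for t in triple) for triple in graph}
                for name, graph in dataset.items()}

    return (lambda ds: rename(ds, forward)), (lambda ds: rename(ds, backward))

def match(pattern, graph):
    """Solutions of one triple pattern over one graph (variables start with '?')."""
    for triple in graph:
        mu = {}
        if all(mu.setdefault(p, t) == t if p.startswith("?") else p == t
               for p, t in zip(pattern, triple)):
            yield mu

def dataset_for_pattern(template, pattern, graph_store):
    """Sketch of Dataset(modify_template, P, GS), default graph only."""
    sk, sk_inv = make_sk(graph_store)
    skolemized = sk(graph_store)
    result = {None: set()}
    for mu in match(pattern, skolemized[None]):
        for tpl in template:
            triple = tuple(mu.get(t, t) for t in tpl)
            if not any(t.startswith("?") for t in triple):    # drop unbound triples
                result[None].add(triple)
    return sk_inv(result)                                     # restore bnode labels

gs = {None: {("_:b0", "ex:p", "ex:o")}}
out = dataset_for_pattern([("?x", "ex:r", "ex:new")], ("?x", "ex:p", "ex:o"), gs)
```

The resulting triple refers to the original label _:b0, so a subsequent Dataset-UNION with GS updates the same bnode rather than a renamed copy.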

best,
Axel


[1] http://lists.w3.org/Archives/Public/public-rdf-dawg/2011JanMar/0328.html

On 10 Mar 2011, at 16:48, Andy Seaborne wrote:

> ==== SPARQL Update (part 2)
> This completes my review.
> 
> Covers section 4 onwards but also ..
> 
> === 3.1.1 INSERT DATA
> 
> [**]
> """
> INSERT DATA { graph_triples }
> 
> Graph triples are defined as:
> 
>    graph_triples ::= TriplesBlock | GRAPH <uri> { TriplesBlock }
> """
> 
> This disallows:
> 
> INSERT DATA { :s :p :o . GRAPH :g { :s1 :p1 :o } }
> INSERT DATA { GRAPH :g2 {:s :p :o } . GRAPH :g { :s1 :p1 :o } }
> 
> Is there a reason for this?
> The grammar allows it.
> 
> It seems unnecessary to force the application to separate out the triples.
> 
> This is repeated:
> = 3.1.2 DELETE DATA
> = 3.1.3 DELETE/INSERT
> modify_template ::= ConstructTriples | graph_template
> = 3.1.4 DELETE
> = 3.1.5 INSERT
> 
> 
> == Section 4:
> 
> [**]
> I suggest a section on how certain forms map to other forms, then must
> define the fundamental forms.
> 
> Rewrites for ADD, COPY, MOVE (some text exists elsewhere but should be
> in the formal section)
> DELETE WHERE, DELETE {} WHERE, INSERT {} WHERE
> 
> Maybe CLEAR as well.
> 
> then define DELETE{}INSERT{}WHERE{}, LOAD, CREATE, DROP, INSERT DATA,
> DELETE DATA.
> 
> Something on WITH and USING to formalise them as syntactic features.
> There is material elsewhere but I feel the formal section should be
> self-contained able to cover all SPARQL Update.
> 
> 
> [**]
> Need an account of how the syntax maps to the operations.  It's fairly
> obvious but probably should be said.
> 
> == 4.1.1 Graph Store
> 
> []  s/associated to/associated with/
> 
> [*] Say the IRIi are distinct.
> 
> [] It says: "1 <= i <= n" but nothing about n
> 
> == 4.1.2 Update Operation
> 
> The "t+1" notation isn't used anywhere.
> 
> As the state of a store only depends on the previous state and the
> operation and not t-2, it's not necessary.
> 
> Is this definition used anywhere?  I could immediately see that it's
> needed and wondered if it is historical now.
> 
> == 4.2 Auxiliary Definitions
> == 4.2.1 Dataset-UNION
> 
> [*] An RDF dataset is a set { DG, (<u_i>, G_i)} -- write it same as
> query has it, not "DG' union {(iri'j, G'j) | 1 <= j <= m})"
> 
> [**] Not merge - this must be a union. not rename blank nodes apart.
> Otherwise one operation followed by another will not update the same
> bNode.  And  dataset-diff is not going to work.
> 
> == 4.2.2 Dataset-DIFF
> 
> [*] dataset comment as dataset-union.
> [**] It says "merge" (bullet 3). Should be set-difference or minus.
> [] G_j should be G sub j.
> 
> == 4.3.1 Insert Data Operation
> 
> """
> graph_triples, i.e. either a dataset consisting of a single named graph
> and an empty default graph
> """
> [**] As we have defined dataset-union, I think this should be dataset
> union, not limited to one graph.  See also the graph_triples issue above.
> 
> == 4.3.2 Delete Data Operation
> 
> [**] graph_triples
> 
> == 4.3.3 Delete Insert Operation
> 
> """
> Triples are identified as they match a particular Group Graph Pattern P.
> """
> [**] The triples here are the ones to be deleted or inserted - they are
> not identified by matching - there is a template stage in between.
> 
> [**] Define modify_template sub DEL  and  modify_template sub INS
> 
> [**]  Dataset(modify_template, P)
> 
> Write this out formally:
> 
> Dataset(modify_template, P) = {  instantiate(modify_template)  |  μ a
> solution of P }
> 
> instantiate(modify_template) = ....
> 
> 
> These are superseded if there is an abbreviated forms section:
>    == 4.3.4 Delete Operation
>    == 4.3.5 Insert Operation
>    == 4.3.6 Delete Where Operation
> 
> 
> == 4.3.7 Insert Where Operation
> [**] What's this used with?
> "Insert Where ... are *deleted* from the Graph Store"
> 
> == 4.4.1 Create Operation
> 
> [*] Either something on what happens about empty graphs or, in the
> section intro, say the definitions assume we can have empty graphs.  the
> latter is probably better.
> 
> 
> == 5 Conformance
> [*] remove / update name to "RDF Dataset HTTP Protocol"
> 
>
Received on Wednesday, 16 March 2011 16:01:12 UTC