- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Mon, 18 Apr 2011 14:57:09 +0100
- To: public-rdf-dawg@w3.org
On 18/04/11 00:13, Axel Polleres wrote:
> Hi all,
>
> trying to catch up with my actions and especially with Update
> semantics... taking some closer look at Dataset-UNION vs.
> Dataset-MERGE, since we now have a definition of Dataset-MERGE in
> query...

Non-technical:

It would be better to have a definition of dataset-union that is specifically useful for SPARQL Update.

1/ It isolates us (SPARQL-WG) from decisions of RDF-WG around datasets.
2/ There are specific issues to do with subgraphs where we want very precise handling of bnodes.

By the way: the current defn of Dataset-MERGE is wrong (as Peter PS has pointed out) and needs fixing.

> My overall impression is that we actually may want to switch to
> Dataset-MERGE for all our definitions in SPARQL update as well...
> explained in the following:
>
> 1) For the Insert Data Operation
> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_insertdataoperation
> , it seems to me that we don't really want to be reusing bnode labels
> (assuming that an agent inserting Data is not aware of the bnode
> lables in the graph store anyways.
>
> i.e.
>
> INSERT DATA { _:a :p :o }
>
> should IMO insert a new bnode label, rather than using _:a as the
> label to be inserted.

Labels are a different matter.

If you have two parsings (same bytes sent in two requests):

INSERT DATA { _:a :p :o }

you get two bNodes.

bNode label is scoped to a file bNode

The general text in "12.3.2 Treatment of Blank Nodes" (SPARQL 1.0)
http://www.w3.org/TR/rdf-sparql-query/#BGPsparqlBNodes
talks about
"""
The scoping graph is purely a theoretical construct; in practice, the effect is obtained simply by the document scope conventions for blank node identifiers.
"""
or to put it another way, bNodes have global identity but that identity is not the same as label or document identifier.

And in particular, a bNode can be in two graphs. One graph is known to be a subgraph of the other.
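A minimal sketch of the distinction drawn above between labels and identity, assuming a toy triple syntax ("s p o ." with whitespace-separated tokens) and a hypothetical BNode class and parse() function that are not part of any SPARQL specification or library. Labels live in a table that exists only for one parse; identity is carried by the node object itself.

```python
class BNode:
    """Blank node: identity is the Python object itself; it carries no label."""


def parse(document):
    """Parse toy 's p o .' statements into a set of triples.

    The label table exists only for this call, so labels are scoped to one
    parse (one file or one request) and never become names in the store.
    """
    scope = {}  # label -> BNode, discarded when the parse ends
    triples = set()
    current = []
    for token in document.split():
        if token == ".":
            if len(current) != 3:
                raise ValueError(f"expected 3 terms, got {current!r}")
            triples.add(tuple(current))
            current = []
        elif token.startswith("_:"):
            if token not in scope:
                scope[token] = BNode()
            current.append(scope[token])
        else:
            current.append(token)
    if current:
        raise ValueError(f"unterminated statement: {current!r}")
    return triples


# Same bytes parsed twice: same label, two distinct bNodes.
request = "_:a :p :o ."
(first,) = parse(request)
(second,) = parse(request)
print(first[0] is second[0])  # False

# One bNode in two graphs: the subgraph shares the node, not a label.
graph = parse("_:x :key 57 . _:x :label Hello .")
subgraph = {t for t in graph if t[1] == ":key"}
shared = {t[0] for t in graph} & {t[0] for t in subgraph}
print(len(shared))  # 1
```

Nothing outside parse() ever sees "_:a" or "_:x", which is the sense in which labels are a syntax-only feature here.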
If we copy over some triples from one graph to another, then find the bnode again:

INSERT { GRAPH <G> { ?s :label "Hello" . } }
WHERE { ?s :key 57 . # Finds a bNode.
}

.. later .. same request or different request ...

DELETE { GRAPH <G> { ?s :label "Hello" . } }
INSERT { GRAPH <G> { ?s :label "Hello2" . } }
WHERE { ?s :key 57 . # Finds a bNode.
}

should find the same bNode (or at least that should be a legal implementation of SPARQL Update).

It's the round-trip problem for SPARQL results, made to exist solely inside one store, without serializing/deserializing via the result set format.

>
> 2) The Delete Insert Operation
> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_deleteinsertoperation
>
> anyways relies on the Dataset() function (cf. http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_datasetPattern)
> which "skolemises" bnodes away.

before ... and which does not need to do that because bNodes have identity; it's just not their label.

Even if it's done by sk(), the definition of that needs tightening up to make it stable across requests. That "fresh constant" is no more than the bNode identity. Because sk-1() exists, it is a name for the bNode (not the thing denoted by the bNode, which is what skolemization does).

4.2.4 Dataset(QuadPattern, P, GS)
"the original bnode labels."
The labels are a syntax-only feature.

Another fix needed for sk() is that the "fresh constant" must not collide with any term in a request, nor any future request. Making it something other than an IRI or literal (or bNode!) does this. But then it's exactly treating bNodes as having identity, so just use the bNode itself.

Easiest fix seems to be to just put "labels" as a syntax feature and explain that labels in syntax are not global or graph store-wide names for bNodes.

Editorial/major:

We also need to decide what happens when the same label is used multiple times in one request. There is text for this but it's buried.
3.1.1 has:
"""
Blank node labels in QuadDatas are assumed to be disjoint from the blank nodes in the Graph Store and will be inserted as new blank nodes.
"""
but 3.1.2 has:
"""
Since blank node labels are only unique within each specific context
"""
What exactly is a 'context'?

Discussion is only for INSERT DATA and DELETE DATA, not the pattern operations.

>
> 3) ... I think the Dataset() function of
> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_datasetPattern
> can be changed actually from
>
> Dataset(QuadPattern, P, GS ) = Dataset-UNION( sk-1(
> Dataset(QuadPattern, μ) ) | μ in eval(sk(GS)(sk(DG)),P) )
>
> to
>
> Dataset(QuadPattern, P, GS ) = sk-1 (Dataset-MERGE(
> Dataset(QuadPattern, μ) ) | μ in eval(sk(GS)(sk(DG)),P) )
>
> without changing of meaning... (again, since bnodes have been
> skolemised away and are only re-introduced via the final sk-1)
>
> 4) In
> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_loadoperation
> ... it is actually ok to use Dataset-MERGE, since you don't want to
> reuse bnode-labels coming from an external Graph.
>
> 5) The use of Dataset-UNION() in
> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_clearoperation
> can be changed to Dataset-MERGE without altering semantics.

Just on this, "merge" does not give a stable bNode identification anyway. Merge (graph, dataset) can completely replace every bNode by another (DS has same meaning, but is not the same sets of triples).

>
> 6) The use of Dataset-UNION() in
> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_createoperation
> can be changed to Dataset-MERGE without altering semantics.

BTW strictly, skolemization does alter semantics.
http://www.w3.org/TR/rdf-mt/#prf
"""
a graph should not be thought of as being equivalent to its Skolemization
"""

> these seem to be all uses of Dataset-UNION(), please let me know if I
> am missing something.
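The remark above that "merge" does not give a stable bNode identification can be illustrated with a small sketch. It assumes graphs are plain Python sets of triples, and that merge means "standardize the blank nodes of one operand apart, then take the union"; BNode, graph_union() and graph_merge() are hypothetical names for illustration, not the definitions in the Update or Query documents.

```python
class BNode:
    """Blank node: identity is the Python object itself; it carries no label."""


def graph_union(g1, g2):
    """Plain set union: a bNode shared by g1 and g2 stays one node."""
    return g1 | g2


def graph_merge(g1, g2):
    """Union after replacing every bNode of g2 with a fresh one."""
    fresh = {}  # old bNode -> replacement bNode, consistent within g2

    def rename(term):
        if isinstance(term, BNode):
            if term not in fresh:
                fresh[term] = BNode()
            return fresh[term]
        return term

    return g1 | {tuple(rename(term) for term in triple) for triple in g2}


# A graph and one of its subgraphs, sharing the same bNode.
b = BNode()
graph = {(b, ":key", 57), (b, ":label", "Hello")}
subgraph = {(b, ":key", 57)}

united = graph_union(graph, subgraph)
merged = graph_merge(graph, subgraph)

print(united == graph)                # True: the subgraph adds nothing
print(len(merged))                    # 3: the :key triple now appears twice
print(len({s for s, _, _ in merged})) # 2: a second bNode was introduced
```

Under these assumptions the union of a graph with its own subgraph is the graph itself, while the merge yields a different set of triples with an extra bNode, so a later pattern match on :key 57 would find two subjects.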
I prefer the use of bNode identity (at least across the graph store) so that issues of bNodes in two graphs are clear.

>
> best, Axel
>

Andy
Received on Monday, 18 April 2011 13:57:48 UTC