From: Axel Polleres <axel.polleres@deri.org>
Date: Wed, 20 Apr 2011 15:20:10 +0100
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-dawg@w3.org
On 18 Apr 2011, at 14:57, Andy Seaborne wrote:

> On 18/04/11 00:13, Axel Polleres wrote:
>> Hi all,
>>
>> trying to catch up with my actions and especially with Update
>> semantics... taking some closer look at Dataset-UNION vs.
>> Dataset-MERGE, since we now have a definition of Dataset-MERGE in
>> query...
>
> Non-technical:
>
> It would be better to have a definition of dataset-union that is
> specifically useful for SPARQL Update.
>
> 1/ It isolates us (SPARQL-WG) from decisions of the RDF-WG around datasets.
>
> 2/ There are specific issues to do with subgraphs where we want very
> precise handling of bnodes.
>
> By the way: the current defn of Dataset-MERGE is wrong (as Peter PS has
> pointed out) and needs fixing.

Would a modified version of the definition we have for Dataset-UNION

  http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_datasetUnion

along the lines of

  s/where union between graphs is defined as set-union of triples in those graphs./
    where union between graphs is defined as RDF merge of those graphs./

do better? (In spirit, I mean... probably I would word it differently.)
I don't think that this definition suffers from the ambiguities claimed by Peter.

I didn't mean to argue for bringing us into a position where we depend on
RDF WG decisions, but rather wanted to argue that it seems to me that, for
our purposes, Dataset-MERGE instead of Dataset-UNION could be useful.
I will try to find time to check the further implications in the coming
few days; I just wanted to ask for the moment.

best,
Axel

>
>> My overall impression is that we actually may want to switch to
>> Dataset-MERGE for all our definitions in SPARQL Update as well...
>> explained in the following:
>>
>> 1) For the Insert Data Operation
>> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_insertdataoperation
>> it seems to me that we don't really want to be reusing bnode labels
>> (assuming that an agent inserting data is not aware of the bnode
>> labels in the graph store anyway).
>>
>> i.e.
>>
>>   INSERT DATA { _:a :p :o }
>>
>> should IMO insert a new bnode label, rather than using _:a as the
>> label to be inserted.
>
> Labels are a different matter.
>
> If you have two parsings (the same bytes sent in two requests):
>
>   INSERT DATA { _:a :p :o }
>
> you get two bNodes.
>
> A bNode label is scoped to a file.
>
> The general text in "12.3.2 Treatment of Blank Nodes" (SPARQL 1.0)
>
>   http://www.w3.org/TR/rdf-sparql-query/#BGPsparqlBNodes
>
> talks about
>
> """
> The scoping graph is purely a theoretical construct; in practice, the
> effect is obtained simply by the document scope conventions for blank
> node identifiers.
> """
>
> Or, to put it another way, bNodes have global identity, but that identity
> is not the same as a label or document identifier.
>
> And in particular, a bNode can be in two graphs.
> One graph is known to be a subgraph of the other.
>
> If we copy over some triples from one graph to another, then find the
> bnode again:
>
>   INSERT { GRAPH <G> { ?s :label "Hello" . } }
>   WHERE
>     { ?s :key 57 .   # Finds a bNode.
>     }
>
> ... later ... same request or a different request ...
>
>   DELETE { GRAPH <G> { ?s :label "Hello" . } }
>   INSERT { GRAPH <G> { ?s :label "Hello2" . } }
>   WHERE
>     { ?s :key 57 .   # Finds a bNode.
>     }
>
> should find the same bNode (or at least that should be a legal
> implementation of SPARQL Update).
>
> It's the round-trip problem for SPARQL results, made to exist solely
> inside one store, without serializing/deserializing via the result set
> format.
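(Spelling out the "two parsings" point above, purely as an illustration --
the labels _:b1 and _:b2 below stand for whatever fresh blank nodes a store
would allocate, they are not prescribed anywhere: after sending the same
bytes twice,

  # request 1
  INSERT DATA { _:a :p :o }

  # request 2 -- same bytes, sent separately
  INSERT DATA { _:a :p :o }

the target graph ends up containing something like

  _:b1 :p :o .
  _:b2 :p :o .

i.e. two distinct blank nodes, even though both requests used the label
_:a; the label is syntax only and is not itself part of the stored graph.)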
>
>> 2) The Delete Insert Operation
>> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_deleteinsertoperation
>> anyway relies on the Dataset() function (cf.
>> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_datasetPattern)
>> which "skolemises" bnodes away before ...
>
> ... and which does not need to do that, because bNodes have identity;
> it's just not their label.
>
> Even if it's done by sk(), the definition of that needs tightening up to
> make it stable across requests.
>
> That "fresh constant" is no more than the bNode identity. Because sk-1()
> exists, it is a name for the bNode (not the thing denoted by the bNode,
> which is what skolemization does).
>
> 4.2.4 Dataset(QuadPattern, P, GS) has "the original bnode labels."
>
> The labels are a syntax-only feature.
>
> Another fix needed for sk() is that the "fresh constant" must not collide
> with any term in the request, nor in any future request. Making it
> something other than an IRI or literal (or bNode!) does this. But then it
> is exactly treating bNodes as having identity, so just use the bNode
> itself.
>
> The easiest fix seems to be to treat "labels" as a syntax feature and
> explain that labels in syntax are not global or graph-store-wide names
> for bNodes.
>
> Editorial/major:
>
> We also need to decide what happens when the same label is used multiple
> times in one request. There is text for this but it's buried.
>
> 3.1.1 has:
> """
> Blank node labels in QuadDatas are assumed to be disjoint from the blank
> nodes in the Graph Store and will be inserted as new blank nodes.
> """
>
> but 3.1.2 has:
> """
> Since blank node labels are only unique within each specific context
> """
>
> What exactly is a 'context'?
>
> The discussion is only for INSERT DATA and DELETE DATA, not the pattern
> operations.
>
>> 3) ... I think the Dataset() function of
>> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_datasetPattern
>> can actually be changed from
>>
>>   Dataset(QuadPattern, P, GS) = Dataset-UNION( sk-1( Dataset(QuadPattern, μ) ) | μ in eval(sk(GS)(sk(DG)), P) )
>>
>> to
>>
>>   Dataset(QuadPattern, P, GS) = sk-1( Dataset-MERGE( Dataset(QuadPattern, μ) ) | μ in eval(sk(GS)(sk(DG)), P) )
>>
>> without changing the meaning... (again, since bnodes have been
>> skolemised away and are only re-introduced via the final sk-1)
>
>> 4) In
>> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_loadoperation
>> it is actually ok to use Dataset-MERGE, since you don't want to reuse
>> bnode labels coming from an external graph.
>>
>> 5) The use of Dataset-UNION() in
>> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_clearoperation
>> can be changed to Dataset-MERGE without altering semantics.
>
> Just on this: "merge" does not give a stable bNode identification anyway.
> Merging (a graph, a dataset) can completely replace every bNode by
> another one (the dataset has the same meaning, but is not the same set of
> triples).
>
>> 6) The use of Dataset-UNION() in
>> http://www.w3.org/2009/sparql/docs/update-1.1/Overview.xml#def_createoperation
>> can be changed to Dataset-MERGE without altering semantics.
>
> BTW, strictly, skolemization does alter semantics.
>
>   http://www.w3.org/TR/rdf-mt/#prf
>
> """
> a graph should not be thought of as being equivalent to its Skolemization
> """
>
>> These seem to be all uses of Dataset-UNION(); please let me know if I
>> am missing something.
>
> I prefer the use of bNode identity (at least across the graph store) so
> that issues of bNodes in two graphs are clear.
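(To make the "same label in one request" question concrete, here is the
kind of request where the answer matters -- a sketch only; the prefix and
the graph names G1 and G2 are invented for the example:

  PREFIX : <http://example/>
  INSERT DATA { GRAPH <http://example/G1> { _:b :p 1 } } ;
  INSERT DATA { GRAPH <http://example/G2> { _:b :p 2 } }

Whether _:b denotes one bNode shared by both operations, or a fresh bNode
per operation, depends on what the 'context' of 3.1.2 is, and the current
text does not say; the same question arises for the pattern operations,
i.e. DELETE/INSERT ... WHERE.)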
>
>> best,
>> Axel
>
>       Andy
Received on Wednesday, 20 April 2011 14:20:41 UTC