Questions on grammar restrictions on Blank Node reuse across patterns and a more fundamental question on Update semantics for confirmation (was: Re:Draft response KK-15)

Now for the question that came to my mind in relation to 
the response to comment KK-15. Sorry it's a bit lengthy...
 
In SPARQL 1.1 Query, we say in several places that blank node labels 
may only be used in a single graph pattern in the query pattern, e.g.
http://www.w3.org/TR/sparql11-query/#bgpBNodeLabels
http://www.w3.org/TR/sparql11-query/#grammarBNodeLabels
 
The situation is less clear to me for QuadPatterns in update constructs.
My understanding is that the restriction on reusing blank node labels
should also hold across different QuadPatterns, as it does across
basic graph patterns.
 
For instance,
 
  INSERT DATA { GRAPH <g1> { _:b1 :p :o} GRAPH <g2> { _:b2 :p :o} } 

would be ok, but 
 
  INSERT DATA { GRAPH <g1> { _:b1 :p :o} GRAPH <g2> { _:b1 :p :o} } 

wouldn't, but I am not entirely sure.
 Do we *need* to restrict this? 
 Do we *want* to restrict this?
 
My gut feeling is that
 
  INSERT DATA { GRAPH <g1> { _:b1 :p :o} GRAPH <g2> { _:b1 :p :o} }

looks weird and shouldn't be allowed. On the other hand, it doesn't seem to do any harm, since there is seemingly no correlation between blank nodes in different graphs, even when they carry the same label; i.e. it wouldn't be the "same" blank node. Still, even with this understanding in mind, forbidding it would seem more clear-cut and less confusing for users.
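
To make the question concrete, here is a minimal Python sketch of the check a parser would need if we decided to forbid such reuse. It is not taken from either spec: the function name shared_bnode_labels and the representation of an INSERT DATA body as a list of (graph, triples) blocks are assumptions made purely for illustration.

```python
from collections import defaultdict


def shared_bnode_labels(quad_blocks):
    """Return the blank node labels used in more than one GRAPH block.

    quad_blocks is a list of (graph_name, triples) pairs; each triple is a
    tuple of term strings, and blank nodes are written as "_:label".
    """
    blocks_using = defaultdict(set)  # label -> indexes of the blocks using it
    for index, (_graph, triples) in enumerate(quad_blocks):
        for triple in triples:
            for term in triple:
                if term.startswith("_:"):
                    blocks_using[term].add(index)
    return {label for label, blocks in blocks_using.items() if len(blocks) > 1}


# INSERT DATA { GRAPH <g1> { _:b1 :p :o} GRAPH <g2> { _:b2 :p :o} }
ok = [("<g1>", [("_:b1", ":p", ":o")]), ("<g2>", [("_:b2", ":p", ":o")])]

# INSERT DATA { GRAPH <g1> { _:b1 :p :o} GRAPH <g2> { _:b1 :p :o} }
reused = [("<g1>", [("_:b1", ":p", ":o")]), ("<g2>", [("_:b1", ":p", ":o")])]

print(shared_bnode_labels(ok))      # set()
print(shared_bnode_labels(reused))  # {'_:b1'}
```

Whether a non-empty result should be a syntax error is exactly the open question; the sketch only shows that the check itself would be cheap.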


However, even if we disallow that, I am afraid it doesn't end here, so please *read on*:

In our current semantics, the statement that "there is no correlation between even same-labelled blank nodes across graphs" is, as I understand it, not quite true:
 
The following example shows what makes my head spin now.
 
Let the dataset consist of an empty default graph and two named graphs g1, g2 as follows:
GS = {}
     <g1> { _:b :p :o }
     <g2> {}
 
Now, let's write:
 
  INSERT { GRAPH <g2> { ?S ?P ?O } }
  WHERE { GRAPH <g1> {?S ?P ?O } }
 
which would result in the following:

GS' = {}
     <g1> { _:b :p :o }
     <g2> { _:b :p :o }

Now, however, what does a subsequent repetition of 

  INSERT { GRAPH <g2> { ?S ?P ?O } }
  WHERE { GRAPH <g1> {?S ?P ?O } }

result in?

According to the current semantics definition, my understanding is that 
this still results in:

GS'' = {}
     <g1> { _:b :p :o }
     <g2> { _:b :p :o }

Now, while this goes against the intuition mentioned above that 
there is no correlation between even same-labelled blank nodes across 
graphs, it confirms the intuition that copying the same graph to itself 
is an idempotent operation, which, within a graph store, would IMO make 
sense. What do the current implementations do with this?

The alternative would be that the result of two subsequent applications 
of this update would be:

GS''' = {}
     <g1> { _:b :p :o }
     <g2> { _:b1 :p :o , _:b2 :p :o }

But note that this would, I am afraid, require a significant change to 
the Update document. I do *not* intend to make any changes to 
the semantics at this point, but would prefer to make clear/confirm 
that the intended outcome is GS''.
In particular, as said above, I think that the semantics should yield 
GS'' in order to make copying a graph onto itself twice idempotent.
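
To illustrate the two candidate outcomes, here is a toy Python model of the graph store. It is only a sketch under my own assumptions, not the formal definition from the Update document: graphs are plain sets of triples, and copy_graph, initial_store and the fresh_labels parameter are hypothetical names. It also assumes the fresh labels do not clash with labels already in the store.

```python
import itertools


def copy_graph(store, src, dst, fresh_labels=None):
    """Toy model of INSERT { GRAPH dst {?S ?P ?O} } WHERE { GRAPH src {?S ?P ?O} }.

    With fresh_labels=None, blank nodes keep their identity (outcome GS'').
    With an iterator of labels, every application renames the blank nodes
    it copies (outcome GS''').
    """
    renaming = {}  # per-application map: original bnode label -> fresh label

    def instantiate(term):
        if fresh_labels is None or not term.startswith("_:"):
            return term
        if term not in renaming:
            renaming[term] = next(fresh_labels)
        return renaming[term]

    # Evaluate the WHERE part first, then add the instantiated triples.
    # A graph is a set, so re-adding an identical triple changes nothing.
    matched = [tuple(instantiate(t) for t in triple)
               for triple in sorted(store[src])]
    store[dst].update(matched)


def initial_store():
    return {"default": set(), "g1": {("_:b", ":p", ":o")}, "g2": set()}


# Reading 1: blank node identity is preserved within the store.
same = initial_store()
for _ in range(2):
    copy_graph(same, "g1", "g2")
print(sorted(same["g2"]))     # [('_:b', ':p', ':o')], i.e. GS''

# Reading 2: each application mints fresh blank nodes.
renamed = initial_store()
labels = (f"_:b{i}" for i in itertools.count(1))
for _ in range(2):
    copy_graph(renamed, "g1", "g2", fresh_labels=labels)
print(sorted(renamed["g2"]))  # [('_:b1', ':p', ':o'), ('_:b2', ':p', ':o')], i.e. GS'''
```

Under the first reading the second application is a no-op, which is the idempotence argued for above; under the second reading the target graph grows with every application.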

Still, I want to ask the group for clarification on the following two things:

1) Do we want to forbid sharing bnodes across QuadPatterns in INSERT and INSERT 
   DATA?

2) Do we agree that the semantics of 

  update1 =  INSERT { GRAPH <g2> { ?S ?P ?O } }
             WHERE { GRAPH <g1> {?S ?P ?O } } ;
             INSERT { GRAPH <g2> { ?S ?P ?O } }
             WHERE { GRAPH <g1> {?S ?P ?O } }

  should be GS'' (and not GS''')?

If we agree, I suggest the following points to proceed:

* Depending on 1) add
   INSERT DATA { GRAPH <g1> { _:b1 :p :o} GRAPH <g2> { _:b1 :p :o} }
  as either a negative or positive syntax test, plus add some clarifying
  remark on reuse of bnodes across QuadPatterns in the Update doc.
* Add the restriction on blank node label usages across BGPs
  (and QuadPatterns? Depending on 1)) also to the grammar restrictions
  in query, i.e. at http://www.w3.org/2009/sparql/docs/query-1.1/rq25.xml#sparqlGrammar
* Add the example of update1 from above to the update test cases, with outcome GS''.

(Note that I still need to re-check whether the current test cases already cover something similar; if someone can point me to one, thanks!)


Best,
Axel

--
Dr. Axel Polleres
Siemens AG Österreich
Corporate Technology Central Eastern Europe Research & Technologies CT T CEE 
 
Tel.: +43 (0) 51707-36983
Mobile: +43 (0) 664 88550859
Fax: +43 (0) 51707-56682
mailto:axel.polleres@siemens.com

Received on Thursday, 24 May 2012 12:21:00 UTC