Re: Questions on grammar restrictions on Blank Node reuse across patterns and a more fundamental question on Update semantics for confirmation (was: Re:Draft response KK-15)

On 24/05/12 13:20, Polleres, Axel wrote:
> Now for the question that came to my mind in relation to
> the response to comment KK-15. Sorry it's a bit lengthy...
>
> In SPARQL 1.1 Query, we say in several places that blank node labels
> may only be used in a single graph pattern in the query pattern, e.g.
> http://www.w3.org/TR/sparql11-query/#bgpBNodeLabels
> http://www.w3.org/TR/sparql11-query/#grammarBNodeLabels
>
> The situation is not so clear for me in QuadPatterns in constructs.
>   My understanding would be, that the restriction on not allowing reuse
> of blank node labels should also hold across different QuadPatterns
> as it is across basic graph patterns.
>
> i.e. for instance
>
>    INSERT DATA { GRAPH<g1>  { _:b1 :p :o} GRAPH<g2>  { _:b2 :p :o} }
>
> would be ok, but
>
>    INSERT DATA { GRAPH<g1>  { _:b1 :p :o} GRAPH<g2>  { _:b1 :p :o} }
>
> wouldn't, but I am not entirely sure.
>   Do we *need* to restrict this?
>   Do we *want* to restrict this?

No and no. Where "we" = "I"

This should be legal - it is inserting a shared bNode, which we allow.

INSERT DATA does not take a quad *pattern* -- it's quad data.
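
A minimal sketch of that reading, in Python rather than SPARQL so it can be run. It assumes that blank node labels are scoped to a single request and that each label is mapped to one internal node; `insert_data`, the `bnode#N` identifiers and the dict-of-sets store are hypothetical names for illustration, not anything taken from the Update document or from an implementation.

```python
import itertools

_fresh = itertools.count()  # hypothetical source of internal identifiers


def insert_data(store, quads):
    """Model one INSERT DATA request; labels are scoped to this call."""
    label_map = {}

    def resolve(term):
        # Non-bnode terms pass through unchanged.
        if not term.startswith("_:"):
            return term
        # First use of a label in this request mints an internal node;
        # later uses of the same label reuse it.
        if term not in label_map:
            label_map[term] = f"bnode#{next(_fresh)}"
        return label_map[term]

    for graph, triple in quads:
        store.setdefault(graph, set()).add(tuple(resolve(t) for t in triple))


store = {}
# INSERT DATA { GRAPH <g1> { _:b1 :p :o } GRAPH <g2> { _:b1 :p :o } }
insert_data(store, [("g1", ("_:b1", ":p", ":o")), ("g2", ("_:b1", ":p", ":o"))])
shared = store["g1"] == store["g2"]  # both graphs hold the same internal node
```

Under these assumptions the label _:b1 is only a handle within the request; what ends up in g1 and g2 is one shared node.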

> My gut feeling is that
>
>    INSERT DATA { GRAPH<g1>  { _:b1 :p :o} GRAPH<g2>  { _:b1 :p :o} }
>
> looks weird and shouldn't be allowed. On the other hand, it doesn't seem to do harm either, since there is seemingly no correlation between even same-labelled blank nodes across graphs, i.e. it wouldn't be the "same" blank node. Still, it would seem more clear-cut to forbid it, even with this understanding in mind, to avoid confusing users.

Allow.

> However, even if we disallow that, I am afraid it doesn't end here, so please *read on*:
>
> In our current semantics, the statement  that "there is no correlation between even same-labelled blank nodes across graphs" is - as I understand it - not quite true, see next:
>

Where is that quoted text?

I can't find "correlated" in SPARQL Update or SPARQL Query.


> The following example explains what makes my head spin now...
> Take:
>
> Let your dataset consist of an empty default graph and two named graphs g1, g2 as follows:
> GS = {}
>       <g1>  { _:b :p :o }
>       <g2>  {}
>
> Now, let's write:
>
>    INSERT { GRAPH<g2>  { ?S ?P ?O } }
>    WHERE { GRAPH<g1>  {?S ?P ?O } }
>
> which would result in the following:
>
> GS' = {}
>       <g1>  { _:b :p :o }
>       <g2>  { _:b :p :o }
>
> Now, however, what does a subsequent second repetition of
>
>    INSERT { GRAPH<g2>  { ?S ?P ?O } }
>    WHERE { GRAPH<g1>  {?S ?P ?O } }
>
> result in?

No change - the real and actual bnode is the same.  These are shared
bNodes (which are essential for subgraphs).

> According to the current semantics definition, my understanding is that
> this still results in:
>
> GS'' = {}
>       <g1>  { _:b :p :o }
>       <g2>  { _:b :p :o }

Good.

Bnodes are real objects you can manipulate.  They have an internal 
unique identifier.  All this _:a business is merely a label in a file 
and systems produce an internal identifier for them (even the ones that 
claim otherwise have a pointer to a datastructure so that is the 
internal name).
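
To make that concrete, here is a small Python sketch (an illustration under stated assumptions, not how any particular store works). It assumes a graph is a set of triples and that a bnode is an object whose identity is the object itself; `BNode`, `copy_graph` and the dict-based store are hypothetical names.

```python
class BNode:
    """Blank node: equality and hashing fall back to object identity."""
    __slots__ = ()


def copy_graph(store, src, dst):
    # Models: INSERT { GRAPH <dst> { ?S ?P ?O } } WHERE { GRAPH <src> { ?S ?P ?O } }
    # The bindings carry the bnode itself, not a label, so set union
    # cannot create a second copy of the same triple.
    store[dst] |= set(store[src])


b = BNode()
store = {"g1": {(b, ":p", ":o")}, "g2": set()}

copy_graph(store, "g1", "g2")
once = set(store["g2"])

copy_graph(store, "g1", "g2")
twice = set(store["g2"])
```

In this model `once` and `twice` are the same one-triple set, and the subject in g2 is the very same object as the subject in g1, so running the update again changes nothing.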

> Now, while this is against the possible intuition mentioned above that
> there is no correlation between even same-labelled blank nodes across
> graphs, it confirms the intuition that copying the same graph to itself
> is an idempotent operation, which, within a graph store, would IMO make
> sense. What do the current implementations do with this?
>
> The alternative would be that the result of two subsequent applications
> of this update would be:
>
> GS''' = {}
>       <g1>  { _:b :p :o }
>       <g2>  { _:b1 :p :o , _:b2 :p :o }

-1

>
> But note that this would need, I am afraid, a significant change in
> the update document. I do rather *not* intend to make any changes in
> the semantics at this point, but would prefer to make clear/confirm
> that the intended outcome is GS''.
> Particularly, as said above, I think that the semantics should be
> GS'' in order to make copying a graph on itself twice idempotent.

Yes (although the reason is deeper than just idempotency).

>
> Still, I want to ask for clarification to the group on the following two things:
>
> 1) Do we want to forbid "sharing bnodes" across QuadPatterns in INSERT and INSERT
>     DATA?

I don't want to forbid it.

>
> 2) Do we agree that the semantics of
>
>    update1 =  INSERT { GRAPH<g2>  { ?S ?P ?O } }
>               WHERE { GRAPH<g1>  {?S ?P ?O } } ;
>               INSERT { GRAPH<g2>  { ?S ?P ?O } }
>               WHERE { GRAPH<g1>  {?S ?P ?O } }
>
>    should be GS'' (and not GS''')?

Yes

>
> If we agree, I suggest to add the following points to proceed:
>
> * Depending on 1) add
>     INSERT DATA { GRAPH<g1>  { _:b1 :p :o} GRAPH<g2>  { _:b1 :p :o} }
>    as either a negative or positive syntax test, plus add some clarifying
>    remark on reuse of bnodes across QuadPatterns in the Update doc.

Positive test.

> * Add the restriction on blank node label usages across BGPs
>    (and QuadPatterns? Depending on 1)) also to the grammar restrictions
>    in query, i.e. at http://www.w3.org/2009/sparql/docs/query-1.1/rq25.xml#sparqlGrammar

No.

> * Add the example of update1 from above to the update test cases, with outcome GS''.

Not necessary, but if you want to, fine.  I think examples of corner cases
are unhelpful - concentrate on the important parts, examples do not 
cover everything.

> (Note that I still need to re-check the current test cases to see whether something similar is already covered; if someone can point me to it, thanks!)
>
>
> Best,
> Axel
>
> --
> Dr. Axel Polleres
> Siemens AG Österreich
> Corporate Technology Central Eastern Europe Research&  Technologies CT T CEE
>
> Tel.: +43 (0) 51707-36983
> Mobile: +43 (0) 664 88550859
> Fax: +43 (0) 51707-56682 mailto:axel.polleres@siemens.com

Received on Thursday, 24 May 2012 13:05:45 UTC