Scope of blank nodes in SPARQL? from Alex Hall on 2011-10-18 (public-rdf-wg@w3.org from October 2011)

From: Alex Hall <alexhall@revelytix.com>
Date: Tue, 18 Oct 2011 11:20:50 -0400
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-wg@w3.org
Message-ID: <CAFq2biwKeuL0ozOGTSUJtx-UYpoh4s3cbtyAqxN03m0L8esXVA@mail.gmail.com>
On Fri, Oct 14, 2011 at 2:41 PM, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:

>
>
> On 14/10/11 16:37, Alex Hall wrote:
> ...
>
>  Question for the SPARQL folks: given the following sequence of operations:
>>
>> INSERT DATA { GRAPH <g1> { _:s <p> <o> } }
>>
>
> That "_:s" is handled by parsing to be fresh bNode.
>
>
>  INSERT { GRAPH <g2> { ?s <p> <o> } } WHERE { GRAPH <g2> { ?s <p> <o> } }
>>
>                                       WHERE { GRAPH <g1> { ....


You're right -- typo on my part.  Good catch.


>
> >
>
>> Does SPARQL take any position on whether the blank node inserted into g2
>> is the same as or different from the one from g1?
>>
>
> Alex - you can check my reply with an editor of the SPARQL Update document
> (say, @quoll for example...)
>
> Rather important question so I went and had a look at the details ...
>
> INSERT [1] is based on Dataset-UNION [2] of a Dataset(QuadPattern) [3]
>
> It's all done with "union" (not "merge") of triples
>
> For each solution to the WHERE clause:
>
> skμ(Template) is the function that names apart the bNodes actually
> mentioned in the pattern, then template is instantiated for the solution and
> union is used to make the graph store changes.
>
> tl;dr: yes, it's the same blank node
>

This analysis looks right.  For me, the clincher is the part of [4]
(defining the dataset constructed from a graph pattern and an RDF Dataset)
which reads, "here the scoping graph SG used for BGP matching is equal to
the active graph, i.e., blank nodes from the active graph are preserved in
solutions."
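
To illustrate that reading, here is a toy model in Python. It is not the normative algebra and not any store's code. Graphs are plain sets of triples, `BNode` and `Var` are hypothetical classes invented for this sketch, and `insert_where` is a made-up helper that matches one triple pattern in the source graph, instantiates the template once per solution, and adds the result by set union without renaming anything.

```python
class BNode:
    """Hypothetical blank node: equal only to itself (default identity equality)."""
    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"_:{self.label}"


class Var:
    """Hypothetical query variable such as ?s."""
    def __init__(self, name):
        self.name = name


def match(pattern, graph):
    """Yield one binding dict per triple in `graph` that matches `pattern`."""
    for triple in graph:
        binding = {}
        for p, t in zip(pattern, triple):
            if isinstance(p, Var):
                if p.name in binding and binding[p.name] != t:
                    break
                binding[p.name] = t
            elif p != t:
                break
        else:
            yield binding


def insert_where(store, target, template, source, pattern):
    """Toy INSERT { GRAPH target { template } } WHERE { GRAPH source { pattern } }."""
    # Materialize the solutions first so that source == target is safe.
    for binding in list(match(pattern, store.get(source, set()))):
        triple = tuple(binding[x.name] if isinstance(x, Var) else x
                       for x in template)
        store.setdefault(target, set()).add(triple)  # union, no renaming


# INSERT DATA { GRAPH <g1> { _:s <p> <o> } }
s = BNode("s")
store = {"g1": {(s, "p", "o")}}

# INSERT { GRAPH <g2> { ?s <p> <o> } } WHERE { GRAPH <g1> { ?s <p> <o> } }
insert_where(store, "g2", (Var("s"), "p", "o"), "g1", (Var("s"), "p", "o"))

g1_subject = next(iter(store["g1"]))[0]
g2_subject = next(iter(store["g2"]))[0]
print(g1_subject is g2_subject)  # True
```

In this model the subject in g2 is the very same node object as the one in g1, which is the behaviour the quoted sentence from [4] describes.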

Now, a follow-up question:

Given a store containing two graphs with the following statements:

<g1> = { _:s <p1> "foo". }
<g2> = { _:s <p2> "bar". }

Assume that _:s here denotes the same blank node shared between the graphs
(e.g. was inserted into one graph using an INSERT operation as illustrated
above).  This is a common situation in the case where <g2> is an inference
graph that holds entailed statements computed by applying forward-chaining
rules to <g1>.  How can I query the union of those two graphs in a way that
a variable can match the blank node in both graphs?

In other words, I'd like to say something like:

SELECT ?o1 ?o2
FROM <g1>
FROM <g2>
WHERE { ?s <p1> ?o1 . ?s <p2> ?o2 }

and find a single solution, { ?o1="foo", ?o2="bar" }.  I suspect that many
(most?) stores will give the result that I'm looking for in this situation
-- I know Mulgara will.  But strictly speaking, the default graph for this
query is found by taking the merge of all graphs mentioned in a FROM clause,
which implies renaming of shared blank nodes.  In this case, I want the
union of those graphs, not the merge; is there any way of getting that
without relying on store-specific implementation details?
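
To make the difference concrete, here is a minimal sketch in Python. It is a toy model under stated assumptions, not how Mulgara or any other store works. Graphs are sets of triples, `BNode` is a hypothetical class whose instances compare by identity, and `union`, `merge`, and `query` are made-up helpers. `merge` standardizes apart every blank node of each input graph, which is one valid way to form an RDF merge.

```python
class BNode:
    """Hypothetical blank node: equal only to itself (default identity equality)."""
    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"_:{self.label}"


def union(*graphs):
    """Set union: blank nodes shared between graphs stay shared."""
    return set().union(*graphs)


def merge(*graphs):
    """RDF merge: blank nodes of each input graph are renamed apart first."""
    merged = set()
    for i, graph in enumerate(graphs):
        fresh = {}  # old blank node -> fresh blank node, per input graph

        def rename(term):
            if not isinstance(term, BNode):
                return term
            if term not in fresh:
                fresh[term] = BNode(f"{term.label}_{i}")
            return fresh[term]

        for triple in graph:
            merged.add(tuple(rename(t) for t in triple))
    return merged


def query(graph):
    """Toy evaluation of { ?s <p1> ?o1 . ?s <p2> ?o2 }, returning (?o1, ?o2)."""
    return [(o1, o2)
            for (s1, p1, o1) in graph if p1 == "p1"
            for (s2, p2, o2) in graph if p2 == "p2" and s1 == s2]


s = BNode("s")
g1 = {(s, "p1", "foo")}
g2 = {(s, "p2", "bar")}

print(query(union(g1, g2)))  # [('foo', 'bar')]
print(query(merge(g1, g2)))  # []
```

Under the union the join on ?s succeeds and yields the single solution; under the merge the two occurrences of _:s become distinct nodes and the join finds nothing.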

I imagine that there are historical reasons why merge is specified here and
not union, but it would be really nice if stores had license to do a union
in the case where they have specific knowledge that a blank node identifier
shared between the graphs does in fact denote a common resource.

-Alex



>
>        Andy
>
> [1] http://www.w3.org/TR/sparql11-update/#def_deleteinsertoperation
> [2] http://www.w3.org/TR/sparql11-update/#def_datasetUnion
> [3] http://www.w3.org/TR/sparql11-update/#def_datasetQuadPattern


[4] http://www.w3.org/TR/sparql11-update/#def_datasetPattern
Received on Tuesday, 18 October 2011 15:21:21 UTC