Re: shared bnodes (Re: RDF dataset semantics again) from Andy Seaborne on 2012-08-29 (public-rdf-wg@w3.org from August 2012)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Wed, 29 Aug 2012 10:39:08 +0100
To: Steve Harris <steve.harris@garlik.com>
CC: public-rdf-wg@w3.org
Message-ID: <503DE33C.9000807@epimorphics.com>

On 29/08/12 10:08, Steve Harris wrote:
> On 2012-08-28, at 17:55, Andy Seaborne wrote:
>> On 28/08/12 16:49, Steve Harris wrote:
>>>> If you don't have a syntax or protocol (such as an RDF API) for constructing graphs with shared bnodes, then, yes, you need to indicate that some kind of unification is desired/appropriate.
>>>>>
>>>>> It seems to me the simple, obvious, and appropriate way to handle this for most use cases is to allow blank node labels to be shared between different parts of a multi-graph document.
>>> It's very easy in the case where you want to indicate that the bNodes are shared - but there is some cost to it - when you want to produce the multi-graph document you need to ensure that the labels for distinct bNodes are kept distinct.
>>>
>>> Consequently you can't do tricks like:
>>>
>>> ( for i in *.ttl; do echo "<$i> {" ; cat $i ; echo "}" ; done ) > foo.trig
>>>
>>> I've never done anything exactly like that, and I have no feel for
>>> how common a use case it is, but it's worth noting that in RDF-2004 it would
>>> be "safe", and in RDF 1.1 it might result in shared bNodes, depending on
>>> how lucky you were.
>>
>> @prefixes ?
>
> Sure, you'd need to do something tricksier if your data used a
> fuller range of Turtle syntax; you could add some grep and grep -v into the mix,
> but I'm not going to advocate that people parse Turtle with text
> processing tools.
>
> Perhaps
>     ( for i in *.nt; do echo "<$i> {" ; cat $i ; echo "}" ; done ) > foo.trig
> would be a better example?

Yes - although the relative URI for the <$i> is not portable.

N-Triples can be turned into N-Quads quite easily ... with the same
issues of shared bNode labels.
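
As a rough illustration of how labels could be kept distinct when
merging N-Triples files into N-Quads, here is a minimal sketch. It is
not taken from any existing tool: the function names, the "g<n>x" label
prefix and the example IRIs are all assumptions made for illustration,
and it assumes one triple per line with no trailing comments.

```python
import re

# Matches an IRI, a quoted literal, or the "_:" that starts a blank node
# label. IRIs and literals are matched only so that any "_:" inside them
# is skipped over and left untouched.
_TOKEN = re.compile(r'<[^>]*>|"(?:[^"\\]|\\.)*"|_:')


def rename_bnodes(line, prefix):
    """Insert `prefix` at the start of every blank node label in `line`."""
    def fix(match):
        token = match.group(0)
        return "_:" + prefix if token == "_:" else token
    return _TOKEN.sub(fix, line)


def ntriples_to_nquads(docs):
    """Turn (graph_iri, ntriples_text) pairs into a list of N-Quads lines.

    Each input document gets its own label prefix (hypothetical scheme:
    "g0x", "g1x", ...), so equal labels in different files stay distinct.
    """
    quads = []
    for n, (graph_iri, text) in enumerate(docs):
        prefix = "g%dx" % n
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and whole-line comments
            if not line.endswith("."):
                raise ValueError("expected a triple ending in '.': %r" % line)
            body = rename_bnodes(line, prefix)[:-1].rstrip()
            quads.append("%s <%s> ." % (body, graph_iri))
    return quads


if __name__ == "__main__":
    example = [
        ("http://example/a", "_:b <http://example/p> _:b ."),
        ("http://example/b", "_:b <http://example/p> _:b ."),
    ]
    for quad in ntriples_to_nquads(example):
        print(quad)
```

Dropping the prefix step gives the plain concatenation, where equal
labels in different files end up as the same label in the output.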

> Historically (and currently) what did/does Jena do with bNode labels
> shared between graphs in TriG? Have users ever commented one way or the
> other? I can't remember anyone asking about this for 4store, but the user
> base is a lot smaller.

Graphs can share bNodes.  This can happen because:

1/ One graph can be a subgraph of another
2/ One graph can be the base data for another graph which has inference.
3/ The app copied a statement from one model to another
    e.g. generalization of graph being related

Because of this, when working with the data, the app may switch which
graph it looks in for the same bNode, so we do not rename bNodes apart
when they move between graphs.

Putting the base data and the inference graph in a SPARQL dataset is 
something that happens.

So shared bNodes matter in Jena.  The TriG parser scopes bNode labels to
the document (although this is policy-driven; other policies exist
but are not exposed in the API).
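
To make the difference between label-scoping policies concrete, here is
a minimal sketch. It is not Jena's API or implementation: the class
name, the policy names "document" and "graph", and the generated
identifiers are all hypothetical.

```python
import itertools


class LabelScope:
    """Maps bNode labels seen by a parser to internal bNode identifiers.

    scope="document": one label denotes one bNode across the whole
                      document, so graphs can share bNodes.
    scope="graph":    the same label in two graphs denotes two bNodes.
    """

    def __init__(self, scope="document"):
        if scope not in ("document", "graph"):
            raise ValueError("unknown scope policy: %r" % scope)
        self.scope = scope
        self._ids = {}
        self._counter = itertools.count()

    def bnode(self, graph_name, label):
        # The lookup key decides the scope: the label alone for document
        # scope, or the (graph, label) pair for graph scope.
        key = label if self.scope == "document" else (graph_name, label)
        if key not in self._ids:
            self._ids[key] = "bnode%d" % next(self._counter)
        return self._ids[key]


if __name__ == "__main__":
    per_document = LabelScope("document")
    per_graph = LabelScope("graph")
    # Same label, two different graphs, under each policy.
    print(per_document.bnode("g1", "x") == per_document.bnode("g2", "x"))
    print(per_graph.bnode("g1", "x") == per_graph.bnode("g2", "x"))
```

Under the document policy the two lookups return the same bNode; under
the graph policy they return different ones.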

	Andy

>
> - Steve
>

Received on Wednesday, 29 August 2012 09:39:42 UTC