Re: shared bnodes from Sandro Hawke on 2012-08-24 (public-rdf-wg@w3.org from August 2012)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 24 Aug 2012 11:52:13 -0400
To: Andy Seaborne <andy.seaborne@epimorphics.com>
CC: public-rdf-wg@w3.org
Message-ID: <5037A32D.8000606@w3.org>

On 08/24/2012 07:15 AM, Andy Seaborne wrote:
>
>
> On 24/08/12 11:39, Richard Cyganiak wrote:
>> On 24 Aug 2012, at 11:19, Antoine Zimmermann wrote:
>>> So, I'm saying that you can just forget about trying to identify
>>> equal bnodes across graphs, and simply rely on a local identifier in
>>> one graph, a local identifier in another graph, and tell with a
>>> dedicated mechanism that these two locally identified bnodes are
>>> assumed to be the same.
>>
>> I think I'm +1 to this sentiment.
>>
>> So far, the only argument in favour of allowing shared bnodes that I
>> can recall was to manage inferred triples in a separate graph.
>

I think almost any kind of from-RDF and to-RDF data processing is likely
to need this, if it's working with data that includes blank nodes.
You might call all of that "inference", but if so, that's a very broad
sense of the word.
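
A minimal sketch of that situation, assuming a toy in-memory model rather
than any real RDF library: blank nodes are plain Python objects, a dataset
is a dict of named triple sets, and the `ex:` terms and the
`infer_superclass` helper are hypothetical names chosen for illustration.

```python
class BNode:
    """Blank node whose identity is the Python object itself (no global name)."""


def infer_superclass(graph, subclass, superclass):
    """Return inferred triples typing every instance of subclass as superclass."""
    return {
        (s, "rdf:type", superclass)
        for (s, p, o) in graph
        if p == "rdf:type" and o == subclass
    }


def bnodes(graph):
    """Collect the blank nodes used as subject or object in a graph."""
    return {t for (s, _, o) in graph for t in (s, o) if isinstance(t, BNode)}


someone = BNode()
dataset = {
    "g1": {(someone, "rdf:type", "ex:Student"), (someone, "ex:name", "Alice")},
}
# Keep the derived triples apart from the asserted ones, in their own graph.
dataset["g1-inferred"] = infer_superclass(dataset["g1"], "ex:Student", "ex:Person")

# The inferred triple is only useful because it is about the same blank node.
shared = bnodes(dataset["g1"]) & bnodes(dataset["g1-inferred"])
```

Here `shared` holds the one blank node that both graphs talk about; if the
output graph had to mint its own node, the derived type would no longer be
attached to the thing named "Alice".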

> And sub/union graphs in general.
>
> Union graphs for those systems that already make one graph the union 
> of all others.  Whether we like it or not, those systems are common, 
> maybe even the majority, and have been for several years.
>
> It is the compromise of the context point-of-view and the 
> multiple-graphs point-of-view.  In the context POV,
>
> (this is not advocacy, more like 'history')
>

agreed.

To put that slightly differently: shared bnodes are also required for 
the SPARQL dump & restore use case.
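
To illustrate why, here is a rough sketch of a dump and restore round trip
over a toy dataset. Everything in it is an assumption made for the example:
the `dump`/`restore` helpers, the row format, and the simplification that
every term other than a blank node is a plain string. The only point is the
scope of the blank node labels on restore.

```python
class BNode:
    """Blank node identified only by object identity."""


def dump(dataset):
    """Flatten a dataset to (graph, s, p, o) rows of strings, labelling bnodes."""
    labels = {}  # one label table for the whole dataset
    rows = []
    for name in sorted(dataset):
        for triple in dataset[name]:
            terms = []
            for term in triple:
                if isinstance(term, BNode):
                    term = labels.setdefault(term, f"_:b{len(labels)}")
                terms.append(term)
            rows.append((name, *terms))
    return rows


def restore(rows, dataset_wide_labels=True):
    """Rebuild a dataset; label scope is the whole dump or a single graph."""
    scopes = {}
    dataset = {}
    for name, *terms in rows:
        scope = scopes.setdefault(None if dataset_wide_labels else name, {})
        triple = tuple(
            scope.setdefault(t, BNode()) if t.startswith("_:") else t
            for t in terms
        )
        dataset.setdefault(name, set()).add(triple)
    return dataset


def bnodes(graph):
    """Collect every blank node appearing anywhere in a graph."""
    return {t for triple in graph for t in triple if isinstance(t, BNode)}


b = BNode()
original = {"g1": {(b, "ex:name", "Alice")}, "g2": {(b, "ex:age", "30")}}
rows = dump(original)

same = restore(rows, dataset_wide_labels=True)
split = restore(rows, dataset_wide_labels=False)
shared_after_same = bnodes(same["g1"]) & bnodes(same["g2"])
shared_after_split = bnodes(split["g1"]) & bnodes(split["g2"])
```

With dataset-wide labels the two graphs still share their node after the
round trip; with per-graph labels `shared_after_split` is empty, so the
restored store is not the one that was dumped.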

>>
>> This is a valid and compelling use case. But an equally valid use
>> case would be to manage all the inferred triples in another
>> *dataset*. So, if I can infer some triples from graph g1 in dataset
>> DS, then I can just store these triples in a graph named g1 in
>> dataset DS_inference.
>
> The use case includes being able to have one "thing" which has both
> base and inferred information in it.
>
>>
>> Can datasets share bnodes?
>>
>> The whole bnode sharing thing brings a lot of complexity with it. In
>> a complete graph store management language, I need operations for
>> "copy graph with bnodes intact", "copy graph with fresh bnodes", and
>> so on. Most users won't understand the difference and it will just
>> add to the general sense of bewilderment that surrounds bnodes.
>>
>> I say let's simplify things for once and disallow bnode sharing
>> between graphs. The use case above can still be addressed via skolem
>> IRIs.
>
> Shared bnodes are already a feature "out there".
>
> (can't say I ever see the "copy graph with fresh bnodes" arise)
>

I remember struggling with this when writing my first RDF database in 2001,
trying to decide whether to allow blank nodes to be shared between
g-boxes. I haven't really kept track, but my impression is that nearly all
the APIs I've seen since then have allowed it.

So I don't think forbidding it in the model would be okay, in terms of 
respecting existing deployments.   Forbidding it in TriG is probably 
okay on that front, but would be a hassle for the inference use case, 
and ... I think ... would fail the SPARQL dump/restore use case.
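
For comparison, the skolem IRI route Richard mentions above might look
roughly like the sketch below. It is an illustration under assumptions, not
anyone's actual implementation: the `skolemize` helper and the
`http://example.org` authority are made up, and the `/.well-known/genid/`
path follows the convention suggested in the RDF Concepts drafts. It shows
the hassle as well as the workaround, since the nodes stop being blank.

```python
import uuid


class BNode:
    """Blank node identified only by object identity."""


def skolemize(dataset, authority="http://example.org"):
    """Replace each bnode with one fresh IRI, reused in every graph it appears in."""
    iris = {}

    def swap(term):
        if isinstance(term, BNode):
            if term not in iris:
                iris[term] = f"{authority}/.well-known/genid/{uuid.uuid4().hex}"
            return iris[term]
        return term

    return {
        name: {tuple(swap(t) for t in triple) for triple in graph}
        for name, graph in dataset.items()
    }


b = BNode()
dataset = {
    "g1": {(b, "ex:name", "Alice")},
    "g1-inferred": {(b, "rdf:type", "ex:Person")},
}
skolemized = skolemize(dataset)
# Both graphs now name the node with the same IRI, so no label scoping is needed.
subjects = {s for graph in skolemized.values() for (s, _, _) in graph}
```

After this step a format with per-graph bnode scope can carry the data,
but a consumer that wants blank nodes back has to recognise the genid IRIs
and undo the substitution.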

         -- Sandro
>>
>> Best, Richard
>>
>
>     Andy
>
>
Received on Friday, 24 August 2012 15:52:31 UTC