Re: RDF-ISSUE-17 (graph merge): How are RDF datasets to be merged? [RDF Graphs] from Andy Seaborne on 2011-03-29 (public-rdf-dawg@w3.org from January to March 2011)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Tue, 29 Mar 2011 16:03:45 +0100
To: Steve Harris <steve.harris@garlik.com>
CC: Axel Polleres <axel.polleres@deri.org>, SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4D91F4D1.3060904@epimorphics.com>

On 29/03/11 14:59, Steve Harris wrote:
> On 2011-03-29, at 14:35, Andy Seaborne wrote:
>>>   b) The second example that one might think would make sense would be to have ADD not preserving bnodes... what is worrying me here a bit is the fact that graphs in different named graphs may have overlapping bnode labels, and that an ADD (likewise any INSERT that transfers data between graphs in the graph store) may result in unexpected new co-references... example.
>>>
>>>
>>>   graph<a>     _:b1 :p _:b2 .
>>>   graph<b>     _:b2 :p _:b1 .
>>>
>>> Now note that
>>>     ADD<a>   TO<b>
>>> will result in:
>>>
>>>   graph<a>     _:b1 :p _:b2 .
>>>   graph<b>     _:b2 :p _:b1 . _:b1 :p _:b2 .
>>>
>>> that is, bnode labels matter...  since now we have created a coreference in graph<b>   which wouldn't have happened if ADD relied on MERGE, i.e. where the result would be something like:
>>>
>>>   graph<a>     _:b1 :p _:b2 .
>>>   graph<b>     _:b3 :p _:b4 . _:b5 :p _:b6 .
>>>
>>> Opinions?
>>
>> Operations within the store should not name apart.  Either they already are apart, in which case there is no problem, or they are not, in which case something intentional was done to make it so.
>
> Well... intentional comes by degree.
>
> As a user I'm not sure that I would expect either
>
> ADD<a>  TO<b>
>
> or
>
> INSERT {
>    GRAPH<G2>  { ?x ?y ?z }
> }
> WHERE {
>    GRAPH<G1>  { ?x ?y ?z }
> }
>
> to necessarily copy bnodes across as-is.
>
> It allows you to easily get into a situation where bNodes are shared between >1 graph, which wasn't previously possible with standards, I think.
>
> It's not necessarily a bad thing, but it's also not necessarily expected.
>
> As an implementor I see the internal logic, but I'm not sure to what degree users see a difference between
>
> LOAD<G1>  INTO<G2>

read from web

>
> ADD<G1>  INTO<G2>

read from graph store

>
>> Whenever a file is read, the bNodes don't clash - it takes active measures to have them be the same and that means application intent.
>
> Except if someone uses ADD, or INSERT?

These are active measures - operations within a graph store.  The store 
keeps them apart unless an operation combines them in some way.
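
A minimal sketch of the two behaviours under discussion, assuming a toy in-memory store (a dict from graph name to a set of triples) in which a blank node is identified by object identity and not by its label. The names BNode, add_preserving, add_renaming_apart and example_store are hypothetical and only illustrate the idea; they are not taken from any SPARQL implementation.

```python
from itertools import count

_ids = count(1)  # source of display labels only; identity is the object itself


class BNode:
    """Hypothetical blank node: two BNodes are the same node only if they
    are the same Python object, regardless of label."""

    def __init__(self):
        self.label = f"b{next(_ids)}"

    def __repr__(self):
        return f"_:{self.label}"


def bnodes(graph):
    """Return the set of blank nodes mentioned in a graph (a set of triples)."""
    return {term for triple in graph for term in triple if isinstance(term, BNode)}


def add_preserving(store, src, dst):
    """ADD within the store, copying triples as-is: blank nodes stay shared."""
    store[dst] = store[dst] | store[src]


def add_renaming_apart(store, src, dst):
    """ADD with merge-like behaviour: each source blank node is replaced
    by a fresh one before the triples are added to the destination."""
    fresh = {}

    def rename(term):
        if not isinstance(term, BNode):
            return term
        if term not in fresh:
            fresh[term] = BNode()
        return fresh[term]

    renamed = {tuple(rename(term) for term in triple) for triple in store[src]}
    store[dst] = store[dst] | renamed


def example_store():
    """The two-graph example from the quoted message, assuming the two
    graphs really do share the same two blank nodes."""
    b1, b2 = BNode(), BNode()
    return {"a": {(b1, ":p", b2)}, "b": {(b2, ":p", b1)}}
```

With add_preserving, graph <b> ends up with two triples over the same two blank nodes (the co-reference from the example); with add_renaming_apart it ends up with two triples over four distinct blank nodes.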

	Andy

>
> - Steve
>

Received on Tuesday, 29 March 2011 15:04:25 UTC