Re: Scope of blank nodes in TriG? from Andy Seaborne on 2011-10-14 (public-rdf-wg@w3.org from October 2011)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Fri, 14 Oct 2011 12:55:00 +0100
To: public-rdf-wg@w3.org
Message-ID: <4E982314.5070907@epimorphics.com>
On 14/10/11 12:35, Ian Davis wrote:
> On Fri, Oct 14, 2011 at 12:17 PM, Steve Harris <steve.harris@garlik.com> wrote:
>>
>> On 2011-10-14, at 12:12, William Waites wrote:
>>
>>> On Fri, 14 Oct 2011 11:57:43 +0100, Ian Davis <ian.davis@talis.com> said:
>>>
>>>     iand>  I think it does matter. What was the N-Quads document that
>>>     iand>  was the result of the conversion?
>>>
>>> I just looked up the mime type for trig, and did it directly with the
>>> same results. The reason I thought it wouldn't matter is because I
>>> suspected (correctly) that both documents would get parsed into the
>>> same internal representation by raptor which is what actually gets
>>> put into the store.
>>
>> Somewhat off topic, but from memory I don't think that's correct.
>>
>> I think TriG generates graph change callbacks, and N-Quads emits quads.
>>
>> 4store (apparently) treats them the same, but it doesn't have to.
>>
>> - Steve
>
>
> I think the TriG spec could be updated by the authors so blank nodes
> are scoped to the document rather than the graph. It just seems like a
> bug in the spec to me.

Agreed.  Having blank node labels scoped to the document is fairly
important to us.  We have graphs calculated from other graphs, e.g.
subgraphs and rule results / inference results, so the process assumes
more knowledge.  Union default graphs, and using named graphs for data
management, also work better with document scoping.

Of the alternatives, banning label reuse across graphs (SPARQL does
this across BGPs) seems merely a block on reasonable uses, and
graph-scoped label reuse seems at risk of confusion.

The default behaviour in Jena/RIOT is document-scoped labels.  It takes 
some low-level reconfiguration to change and it isn't documented for users.
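
As a rough illustration of the two policies (a sketch only, not how
Jena/RIOT is implemented; LabelMap and parse are names made up for this
example), label scoping comes down to the key under which a parser
remembers the blank nodes it has already allocated:

```python
import itertools


class LabelMap:
    """Maps blank node labels to fresh internal nodes (hypothetical helper)."""

    def __init__(self, scope):
        assert scope in ("document", "graph")
        self.scope = scope
        self._nodes = {}
        self._ids = itertools.count()

    def node_for(self, label, graph):
        # Document scope: the label alone identifies the node.
        # Graph scope: the same label in another graph is a different node.
        key = label if self.scope == "document" else (graph, label)
        if key not in self._nodes:
            self._nodes[key] = "b%d" % next(self._ids)
        return self._nodes[key]


def parse(quads, scope):
    """quads: iterable of (subject_label, predicate, object, graph) tuples."""
    labels = LabelMap(scope)
    return [(labels.node_for(s, g), p, o, g) for (s, p, o, g) in quads]


doc = [
    ("_:bnode", "http://example.com/p", "foo", "http://example.com/g1"),
    ("_:bnode", "http://example.com/p", "bar", "http://example.com/g2"),
]

doc_scoped = parse(doc, "document")
graph_scoped = parse(doc, "graph")

# Document scope: one node shared by g1 and g2.
print(doc_scoped[0][0] == doc_scoped[1][0])      # True
# Graph scope: two unrelated nodes.
print(graph_scoped[0][0] == graph_scoped[1][0])  # False
```

With document scoping, a graph calculated from another graph in the same
document can refer to the same node just by reusing the label.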

> This WG probably ought to decide on the semantics of blank nodes
> appearing in multiple graphs in a dataset.

Agreed as well - while related, let's keep the syntax issues in TriG 
separate from the general semantics of bNodes across graphs in a dataset.

Having TriG and N-Quads follow the same scoping rules should be the
default choice - it needs a good reason to change it - because it's nice
to have TriG understandable as TriG -> N-Quads -> storage.

> There is a material difference between the behaviour of parsing the
> single nquads file into a dataset:
>
> _:bnode <http://example.com/p> "foo" <http://example.com/g1> .
> _:bnode <http://example.com/p> "bar" <http://example.com/g2> .
>
> and parsing two separate nquads files:
>
>
> file1.nq:
>
> _:bnode <http://example.com/p> "foo" <http://example.com/g1> .
>
>
> file2.nq:
>
> _:bnode <http://example.com/p> "bar" <http://example.com/g2> .
>
>
> There is also the case to consider where the following ntriples file
> is found at <http://example.com/g1>
>
> _:bnode <http://example.com/p> "foo" .
>
> and the following is found at <http://example.com/g2>
>
> _:bnode <http://example.com/p> "bar" .
>
>
>
> Ian
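
The one-file versus two-files difference can be sketched as follows,
assuming labels are scoped to a single parse run. This is a toy loader,
not any real parser; load and the whitespace-splitting of lines are
simplifications made up for the example:

```python
import itertools

_fresh = itertools.count()  # store-wide supply of internal blank node ids


def load(lines):
    """Parse one N-Quads-like document (toy: whitespace-separated terms).

    The label table lives only for this call, so labels are scoped
    to the document being loaded.
    """
    table = {}
    quads = []
    for line in lines:
        s, p, o, g, _dot = line.split()
        if s.startswith("_:"):
            if s not in table:
                table[s] = "b%d" % next(_fresh)
            s = table[s]
        quads.append((s, p, o, g))
    return quads


q1 = '_:bnode <http://example.com/p> "foo" <http://example.com/g1> .'
q2 = '_:bnode <http://example.com/p> "bar" <http://example.com/g2> .'

one_file = load([q1, q2])             # single document
two_files = load([q1]) + load([q2])   # file1.nq, then file2.nq

print(one_file[0][0] == one_file[1][0])    # True: same node in g1 and g2
print(two_files[0][0] == two_files[1][0])  # False: two distinct nodes
```

The N-Triples case would behave like the two-file case here, assuming
each retrieved document is parsed on its own.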

	Andy
Received on Friday, 14 October 2011 11:55:37 UTC