
Re: Labelled graphs

From: Steve Harris <steve.harris@garlik.com>
Date: Wed, 25 Apr 2012 20:05:11 +0100
Cc: public-rdf-wg WG <public-rdf-wg@w3.org>
Message-Id: <F14553F9-C14B-4243-9035-23D9C53B5FDB@garlik.com>
To: Sandro Hawke <sandro@w3.org>

On 25 Apr 2012, at 11:51, Sandro Hawke wrote:

> On Wed, 2012-04-25 at 11:08 +0100, Steve Harris wrote:
>> On 24 Apr 2012, at 13:04, Sandro Hawke wrote:
>>> 
>>>>> * When the same label is used multiple times in the same dataset, the
>>>>> graph is
>>>>> assumed to be the union of the graphs labeled with it
>>> 
>>> This is the "partial-graph semantics" view, which I can live with, but
>>> some people have expressed opposition.  We should probably try some
>>> straw polling on it.
>> 
>> The choice here needs to be made carefully, to avoid unintended consequences for implementations and data generation processes.
>> 
>> The corner cases are around bNodes (aren't they always), e.g.
> 
> I believe you're talking about a different issue here.  The question of
> the scope of bNode labels comes up whether we have partial- or
> complete-graph semantics.

I only care about the scope of the labels.

You can "copy" bNodes from graph to graph using SPARQL Update (as I understand it), but it's the labels that are the issue.

> 6.1 says the scope of bNode labels is the document (or dataset, I
> suppose).  I know that gives you a memory cost, but it's important for
> several use cases, such as Keeping Inferred Triples Separate.

That seems counter-intuitive to me. I see that the first test at http://www.w3.org/2011/rdf-wg/wiki/Graphs_Design_6.1#Blank_Nodes illustrates this, but it provides no rationale.

The downside is that you have to be more careful when constructing your TriG files, and the parse process becomes more expensive. It's probably not a huge issue for us, as we don't use TriG for bulk transfer anyway.
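
A minimal Python sketch of that parse cost, not taken from any real parser. It assumes a hypothetical pre-tokenised input (a list of graph blocks) and a hypothetical parse() helper, and reuses the three-block <G1>/<G1>/<G2> example quoted further down this thread. It counts how many distinct bNodes each label-scoping rule yields and how many label-map entries have to be held at once:

```python
from itertools import count

# Hypothetical pre-tokenised TriG: one (graph label, triples) pair per "{ ... }" block.
BLOCKS = [
    ("G1", [("<a>", "<b>", "_:b1")]),
    ("G1", [("<c>", "<d>", "_:b1")]),
    ("G2", [("<c>", "<d>", "_:b1")]),
]


def parse(blocks, scope):
    """Return (distinct bNodes minted, peak number of label-map entries held).

    scope is "document" (one map for the whole file), "graph" (one map per
    graph label, kept in case the label reappears) or "block" (one map per
    "{ ... }" block, discarded at the closing brace).
    """
    fresh = count()
    maps = {}        # scope key -> {bNode label -> internal skolem constant}
    minted = set()
    peak = 0
    for index, (graph, triples) in enumerate(blocks):
        if scope == "document":
            key = None
        elif scope == "graph":
            key = graph
        elif scope == "block":
            key = index
        else:
            raise ValueError(scope)
        table = maps.setdefault(key, {})
        for triple in triples:
            for term in triple:
                if term.startswith("_:"):
                    if term not in table:
                        table[term] = f"skolem{next(fresh)}"
                    minted.add(table[term])
        peak = max(peak, sum(len(m) for m in maps.values()))
        if scope == "block":
            del maps[key]  # nothing later can refer to this map
    return len(minted), peak


for scope in ("document", "graph", "block"):
    print(scope, parse(BLOCKS, scope))
```

On this toy input the document scope gives one bNode, the per-graph-label scope gives two, and the per-block scope gives three. Only the per-block rule lets the map be dropped at each "}"; with the other two the maps have to be kept until the end of the file, so they grow with the size of the document.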

> There are some SPARQL test cases for this here:
>        http://www.w3.org/2011/rdf-wg/wiki/Graphs_Design_6.1#Blank_Nodes
> 
> I don't think you can test for it with trig entailment unless you have a
> way to get at the triples inside the named graphs and exposing them to
> RDF semantics.  Folks have been proposing doing that by flagging the
> dataset as a default-is-union dataset; if you can do that, then you
> could ask:
> 
>        Does
>                @default-is-union
>                <u1> { _:x <b> <c> }
>                <u2> { _:x <b> <d> }
>        entail
>                { _:y <b> <c>,<d> }
> 
> I claim the answer should be "yes".

That example could be true for so many reasons that it's hard to answer meaningfully.

- Steve

>> <G1> {
>>  <a> <b> _:b1 .
>> }
>> ...
>> <G1> {
>>  <c> <d> _:b1 .
>> }
>> ...
>> <G2> {
>>  <c> <d> _:b1 .
>> }
>> 
>> Is that one, two, or three bNodes, an error, undefined, or...?
> 
> Under 6.1 with complete- or partial- graph semantics, it's one bNode.
> Under complete-graph semantics it's an inconsistent dataset.
> 
>    -- Sandro
> 
>> If it's an RDF union across all graphs then there's one bNode; if the union is only between graphs with the same label, then there are two; if it's a merge, then there are three (I believe).
>> 
>> Internally our systems maintain a map from bNode labels to internal skolem constants when parsing (noting that not all systems do this, but many do), and it would be good to be able to discard that map when we hit a "}" token.
>> 
>> If we have either kind of union semantics, that map can get extremely large when parsing a large TriG file, and more to the point you have to maintain a set of maps for all graphs in the document, just in case the graph is mentioned again further down the document.
>> 
>> - Steve
>> 
>>>>> This appears to be in line with the 6.1 design, with some
>>>>> modifications/specializations.
>>> 
>>> I wonder if we can't adopt something close to 6.1, close pretty much all
>>> the open GRAPHS issues, then open a few new ones, like
>>> partial-vs-complete-graph semantics and whether/how to define
>>> GraphContainer.
>>> 
>>>   -- Sandro
>>> 
>>>>> Guus
>>>> 
>>>> (sorry for the delay - was not at work)
>>>> 
>>>> Guus - nice summary.
>>>> 
>>>> 	Andy

-- 
Steve Harris, CTO
Garlik, a part of Experian 
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, NG2 Business Park, Nottingham, Nottinghamshire, England NG80 1ZZ
Received on Wednesday, 25 April 2012 19:05:50 UTC
