Re: Union or not union for the default graph...

From: Steve Harris <steve.harris@garlik.com>
Date: Fri, 13 Apr 2012 15:11:28 +0100
Cc: Andy Seaborne <andy.seaborne@epimorphics.com>, public-rdf-wg@w3.org
Message-Id: <ED63FF24-B3F2-42AE-B72F-2AD6856890AD@garlik.com>
To: Ivan Herman <ivan@w3.org>

On 2012-04-13, at 12:08, Ivan Herman wrote:

> Andy,
> 
> On Apr 13, 2012, at 12:43 , Andy Seaborne wrote:
> 
>> 
>> 
>> On 13/04/12 08:02, Ivan Herman wrote:
>>> My problem with this is that it becomes a closed possibility provided
>>> by the store, and not the choice of the dataset provider. I mean: the
>>> SPARQL service description tells me about the default dataset at the
>>> SPARQL endpoint. Is there a way to tell the SPARQL engine to use or
>>> not to use the union of the graphs that are in a specific SPARQL
>>> query?
>>> 
>>> What I am looking for is a way to tell the system: this is what I want.
>>> Do I want a quoting or a union semantics for my particular dataset?
>> 
>> Aside from the efficiency, put all the triples from the named graphs into the default graph.  That tells the system :-)  Seriously, this feels like a system config issue, not an architectural issue.
>> 
>> David wrote:
>>>> Perhaps we should provide a standard way for an RDF system
>>>> to advertise how the default graph.
>> 
>> and that is what SPARQL SD does for SPARQL endpoints.
>> 
>> 
>> So if the store does not do what the dataset provider wants, they ought to go to a different store. Many systems offer one mode or the other, not a choice, and it goes deep, right down to the bytes-on-disk.
>> 
>> So the data publisher makes the data user an offer via the endpoint and service description.
>> 
>> If the data user wants something different, don't take the offer made by the publisher.  The data user is now taking the responsibility for the meaning of the default graph.  So it might have to pull the named graph data into a local system that does what it wants.
>> 
>> Some systems allow you to "name" the union-of-named-graphs, as a virtual graph (without recursion!).  That's not standard; it's a bit of a hack of naming.
>> 
>> In practice, there can be more choices: pick a few graphs out of a large collection and query over the union of those, or pick all but certain graphs, or intersections, or ...  Once you formalise the union-default-graph, it seems a small step to allowing other combinations as well.
>> 
>> 
>> There is an efficiency thing: the data publisher might wish to say "and by the way, make the default graph the union" in a TriG file simply to avoid repeating all the triples if written out long form.
> 
> Is there some accepted syntax for something like that? Or do we have to come up with this?

Except in the special case where a SPARQL store contains one-and-only-one TriG file, I don't see how that would be implementable. What if you have one TriG file in there which specifies that the default graph contains the union of the named graphs, and another which explicitly specifies a default graph?

If you ran SELECT * WHERE { ?s ?p ?o } - what would you get back? I don't believe any real SPARQL systems actually implement union-of-named-graphs by storing two copies of each triple.
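
To make the conflict concrete, here is a minimal sketch in plain Python. It is not any real store's API: `Dataset`, `load_trig`, `select_all`, and the per-file `union_default` flag are hypothetical names invented for illustration, and the union view is assumed to cover the named graphs only. It loads two files with different policies and shows that the answer to the query above depends on which policy the store decides to honour.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]


@dataclass
class Dataset:
    """Toy RDF dataset: one explicit default graph plus named graphs."""
    default: set[Triple] = field(default_factory=set)
    named: dict[str, set[Triple]] = field(default_factory=dict)
    # Policy requested by each loaded file (hypothetical per-file flag).
    policies: list[bool] = field(default_factory=list)

    def load_trig(self, default, named, union_default):
        """Merge one (already parsed) TriG file into the store."""
        self.default |= set(default)
        for name, triples in named.items():
            self.named.setdefault(name, set()).update(triples)
        self.policies.append(union_default)

    def select_all(self, union_default):
        """Answer SELECT * WHERE { ?s ?p ?o } against the default graph."""
        if union_default:
            # Union view computed at query time; no triple is stored twice.
            return set().union(*self.named.values())
        return set(self.default)

    def policies_conflict(self):
        """True if the loaded files asked for different policies."""
        return len(set(self.policies)) > 1


store = Dataset()
# File 1 asks for "default graph = union of named graphs".
store.load_trig(default=[], named={"g1": [("a", "p", "b")]}, union_default=True)
# File 2 ships its own explicit default graph.
store.load_trig(default=[("c", "p", "d")],
                named={"g2": [("e", "p", "f")]},
                union_default=False)

union_answer = store.select_all(union_default=True)
explicit_answer = store.select_all(union_default=False)
```

The two answers share no triples, so the store has to pick one policy for the whole dataset (or reject the second import); it cannot satisfy both files at once.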

I guess you could give an error if someone tried to import two files with different policies, but like Andy says, it seems like attacking the problem from the wrong end.

- Steve

-- 
Steve Harris, CTO
Garlik, a part of Experian
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ
Received on Friday, 13 April 2012 14:12:10 UTC
