Re: Graphs Design 6.2

On 25 Apr 2012, at 13:57, Sandro Hawke wrote:

> On Wed, 2012-04-25 at 14:45 +0200, Ivan Herman wrote:
>> Hey Sandro,
>> 
>> On Apr 25, 2012, at 13:44 , Sandro Hawke wrote:
>> 
>>> Here's a sketch of 6.2, which is similar to 6.1, but differs in the
>>> areas where people have made me think they didn't like it.  I have not
>>> put it on a wiki page or given it test cases yet.
>>> 
>>> The differences are:
>>> 
>>> * Partial-graph semantics, instead of complete-graph semantics.  This
>>> is more quad-like, and may be seen as more in keeping with RDF's usual
>>> style of working with partial knowledge.   It makes it harder to reason
>>> about what's unsaid, but few people are doing that anyway.  
>>> 
>>> * A keyword "@union" may be given instead of the default graph,
>>> indicating the default graph is the union of all the named graphs.  This
>>> means everything in those graphs is asserted.    (Alternatively, we
>>> could have "@asserted", perhaps parameterized by "all" or the names of
>>> those graphs which are considered asserted.)
>> 
>> 
>> I am not sure I fully understand it. Is it so that I can use @union as part of the declarations meaning that the default graph is everything that is explicitly noted as default graph plus the content of the named graphs? You say 'instead', which seems to suggest that 
>> 
>> @union
>> 
>> { a b c }
>> 
>> is not legal...
> 
> The idea here is that EITHER @union OR the default graph could be given
> in a trig document.   This corresponds to SPARQL's
> http://www.w3.org/TR/sparql11-service-description/#sd-uniondefaultgraph
> and matches how some SPARQL engines (eg 4store) work.

Not really. The union default graph semantics (used by many SPARQL systems) are global to the whole system, not local to a particular TriG file. In fact, quad systems have no persistent notion of a TriG file once it has been parsed; the file is just a transport mechanism for quads.

In these systems the default graph normally contains the RDF union of the named graphs, but you can change that per query by using FROM or similar directives.
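
To make the distinction concrete, here is a minimal Python sketch of a quad store in which the union default graph is an engine-wide setting and a FROM-style override applies to a single query. It is a toy under stated assumptions, not how 4store or any other real engine is implemented: the names QuadStore, union_default and from_graphs are hypothetical, terms are plain strings, and bNode handling is ignored.

```python
from typing import Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, str]  # (graph, subject, predicate, object)


class QuadStore:
    """Toy quad store. union_default is an engine setting, not a per-file one."""

    def __init__(self, union_default: bool = True) -> None:
        self.union_default = union_default
        self.quads: Set[Quad] = set()
        # Only consulted when the engine runs in explicit default graph mode.
        self.explicit_default: Set[Triple] = set()

    def add_quads(self, quads: Iterable[Quad]) -> None:
        # Once parsed, nothing records which file a quad came from.
        self.quads.update(quads)

    def default_graph(self, from_graphs: Optional[Iterable[str]] = None) -> Set[Triple]:
        """Return the triples visible in the default graph for one query."""
        if from_graphs is not None:
            # FROM-style override: only the listed graphs, for this query only.
            wanted = set(from_graphs)
            return {(s, p, o) for g, s, p, o in self.quads if g in wanted}
        if self.union_default:
            # Engine-wide behaviour: union of every named graph in the store.
            return {(s, p, o) for _, s, p, o in self.quads}
        return set(self.explicit_default)


store = QuadStore(union_default=True)
store.add_quads([("G1", "a", "b", "c"), ("G2", "b", "d", "d")])

union_view = store.default_graph()                  # both triples
from_view = store.default_graph(from_graphs=["G1"])  # only the G1 triple
```

Note that nothing in add_quads can influence union_default: the choice lives in the engine configuration and in each query, which is the point being made above.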

Having the default graph be the union of the named graphs turns out to be very convenient for dealing with provenance information (otherwise you end up with very complex queries), and it's used in many other situations too. Sometimes, though, it's lexically convenient to use FROM instead and give up that behaviour.

Allowing the data format to specify the default graph semantics of the query engine would cause great problems.

> If you want to add a few more triples to the default graph, when using a
> union dataset like this, you put them in another named graph.

It's not a feature of the dataset, it's a feature of the query engine.

Suppose I have two TriG2 files, A.trig2 and B.trig2:

A.trig2:

@union
<G1> { <a> <b> <c> }
<G2> { <b> <d> <d> }

B.trig2:

<G3> { <e> <f> <g> }

I also have a query engine that's configured to treat the default graph as the union of the named graphs. So I import A.trig2 and everything's fine; I end up with the following quads:

<G1> <a> <b> <c> .
<G2> <b> <d> <d> .

Now, I go to import B.trig2, but it has no @union directive, so I can either:

1) throw an error, or
2) dump the entire contents of the current store into the default graph, silently change the behaviour of the query engine to explicit default graph mode, and import B.trig2 as named graphs

Either of those options is going to make users unhappy.
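
The two options can be sketched in Python as follows. This is only an illustration under assumptions: Engine, import_file, on_mismatch and UnionMismatch are hypothetical names, the declares_union flag stands in for a parser having seen (or not seen) an @union directive, and no real TriG parsing is done.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, str]  # (graph, subject, predicate, object)


class UnionMismatch(Exception):
    """Raised when a file lacks @union but the engine is in union mode."""


class Engine:
    def __init__(self, union_default: bool, on_mismatch: str) -> None:
        self.union_default = union_default
        self.on_mismatch = on_mismatch  # "error" (option 1) or "switch" (option 2)
        self.quads: Set[Quad] = set()
        self.explicit_default: Set[Triple] = set()

    def import_file(self, declares_union: bool, quads: List[Quad]) -> None:
        if self.union_default and not declares_union:
            if self.on_mismatch == "error":
                # Option 1: refuse the file.
                raise UnionMismatch("file has no @union but engine is in union mode")
            # Option 2: copy everything stored so far into an explicit default
            # graph and silently leave union mode.
            self.explicit_default |= {(s, p, o) for _, s, p, o in self.quads}
            self.union_default = False
        self.quads.update(quads)


a_trig2 = [("G1", "a", "b", "c"), ("G2", "b", "d", "d")]  # has @union
b_trig2 = [("G3", "e", "f", "g")]                          # no @union

strict = Engine(union_default=True, on_mismatch="error")
strict.import_file(True, a_trig2)
try:
    strict.import_file(False, b_trig2)
    strict_result = "imported"
except UnionMismatch:
    strict_result = "error"

lenient = Engine(union_default=True, on_mismatch="switch")
lenient.import_file(True, a_trig2)
lenient.import_file(False, b_trig2)
```

With the "switch" policy, the engine ends up in explicit default graph mode and the triple from <G3> is absent from the default graph, so queries that worked before the second import now behave differently without any warning.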

- Steve

>>> 
>>> * A class rdf:GraphAssociate containing all the things denoted by RDF
>>> terms used as labels in datasets.     The label is an IRI or bNode, the
>>> "associate" is the thing that IRI or bNode denotes.   The associate is
>>> associated with the given graph.  This is a superclass of rdf:Graph,
>>> because graphs have themselves as associates.   (I wouldn't mind a
>>> better word, but haven't thought of one.)  
>>> 
>>> * A class rdf:GraphContainer, a subclass of rdf:GraphAssociate.  A
>>> GraphContainer differs from a Graph in that conceptually it can change
>>> over time.   [We don't say anything about how to deal with it changing
>>> over time, because (so far) RDF never talks about change-over-time.  If
>>> it did (such as with rdf:starting and rdf:ending predicates) then that
>>> solution would apply here as well.]   The trig document "{ <u> a
>>> rdf:GraphContainer} <u> { <a> <b> <c> }" is true at exactly those times
>>> that the Graph Container identified by "u" contains the triple expressed
>>> as "<a> <b> <c>".    [Note well: I did not say "contains ONLY" that
>>> triple.  Because of partial-graph semantics, the document is also true
>>> if <u> also contains some other triples.]
>> 
>> Does it also mean that dereferencing <u> through HTTP would return a serialization of a graph containing (<a> <b> <c>)? (At the moment, there is nothing about that for rdf:Graph.)
> 
> That's how I would read Web Architecture, yes.   I'm not sure if we want
> to go there explicitly or not.   I'm not sure we can forbid people using
> rdf:Graph this way. 
> 
>>> The rest of 6.1 remains the same, including global-scope bNode labels,
>>> bNodes allowed as graph labels, rdf:Graph, and rdf:hasGraph.   (I have
>>> an idea for 6.3, but I don't have time to think it through before
>>> today's meeting.)
>> 
>> Let us keep to one number a week:-)
> 
> Yeah...    But 6.3 was going to address this Graph vs GraphContainer
> point.  :-)
> 
>     -- Sandro
> 
>> Ivan
>> 
>>> 
>>>   -- Sandro
>> 
>> 
>> ----
>> Ivan Herman, W3C Semantic Web Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> FOAF: http://www.ivan-herman.net/foaf.rdf
>> 

-- 
Steve Harris, CTO
Garlik, a part of Experian 
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, NG2 Business Park, Nottingham, Nottinghamshire, England NG80 1ZZ

Received on Wednesday, 25 April 2012 18:33:44 UTC