Re: Reification and Provenance modelling from Richard Cyganiak on 2011-09-20 (public-rdf-comments@w3.org from September 2011)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Tue, 20 Sep 2011 16:16:50 +0100
To: Bob Ferris <zazi@smiy.org>
Cc: public-rdf-comments@w3.org
Message-Id: <A1ECE63C-68C7-4B68-97A1-9138FDEEEA44@cyganiak.de>
On 20 Sep 2011, at 11:08, Bob Ferris wrote:
> Just imagine a triple store full of single-triple graphs. Querying this triple store might really getting complex, or?

You're talking about querying with SPARQL? This is a bit out of scope here, as we can't change SPARQL anyway, but I don't see any difference in query complexity between single-triple graphs and statement identifiers. I would assume that the default graph contains all triples regardless of their named graph. A statement identifier approach could then be queried like this:

SELECT * WHERE {
   TRIPLE ?t { ?s ?p ?o }
   ...
}

(You could perhaps tweak the syntax to shave off a few characters.)

And the single-triple graphs can be addressed like this:

SELECT * WHERE {
   GRAPH ?g { ?s ?p ?o }
   ...
}

Verbosity aside, I don't see a difference in complexity.

> I guess, nobody really wants to isolate single triples in separate graphs, or?

Well, apparently some people want triple-level metadata, and named graphs support triple-level metadata.

>>> The simple graph literals proposal [4] looks a bit more elegant, however, these graphs have still no identifier (from my POV).
>> 
>> Why is this a problem? Note that you can make statements about them.
> 
> To make statements about them somewhere else we usually need an identifier to refer to them, or?

No, because graphs are literals, so one can repeat the literal to make statements about it. Occurrences of the same literal in different graphs are semantically equivalent (unlike, say, blank node identifiers).
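
To illustrate the point about repeating a literal, here is a rough Python sketch. It is only a model, not the syntax or semantics of the proposal in [4]: it assumes a graph literal can be represented as an immutable set of triples compared by value, and the helper `graph_literal`, the `ex:` names and the `dc:` statements are made up for the example.

```python
# Hypothetical model: a graph literal is an immutable set of triples,
# compared by value rather than by identity.
def graph_literal(*triples):
    return frozenset(triples)

# The literal as written in one document ...
a = graph_literal(("ex:alice", "foaf:knows", "ex:bob"))
# ... and repeated independently in another document.
b = graph_literal(("ex:alice", "foaf:knows", "ex:bob"))

# Statements about the literal, made in two different places, end up
# attached to the same value because equal literals hash and compare equal.
statements = {}
statements.setdefault(a, []).append(("dc:creator", "ex:richard"))
statements.setdefault(b, []).append(("dc:date", "2011-09-20"))

# Contrast: blank nodes modelled as distinct objects scoped to their own
# document never compare equal, even if they carry the same label.
blank_in_doc1 = object()
blank_in_doc2 = object()
```

In this model `statements` has a single key holding both statements, whereas the two blank nodes remain distinct.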

>>> All these proposals cannot deal with the "Slicing datasets according to multiple dimensions" [5].
>> 
>> I don't think that's true. The same triple can exist in multiple graphs. Nothing stops a triple store from providing different views on the same set of triples.
> 
> Yes, of course. However, in the existing proposals we would simply duplicate the data

You keep saying that but I don't think it's true. RDF graphs and named graphs are abstract data models, and implementers are free to store them any way they want internally, including space-efficient storage. Quoting:

[[
This abstract syntax is the syntax over which the formal semantics are defined. Implementations are free to represent RDF graphs in any other equivalent form.
]] – http://www.w3.org/TR/2011/WD-rdf11-concepts-20110830/#section-Graph-syntax

So whether data is duplicated internally, or whether a storage scheme is used that internally uses triple identifiers and represents graphs as lists of those, is entirely up to the implementation.
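
A minimal sketch of such a storage scheme, in Python. It is an illustration under assumptions, not a description of any existing triple store: the `Dataset` class, its method names and the `ex:` data are all hypothetical.

```python
# Hypothetical sketch: each distinct triple is stored exactly once and
# named graphs hold only integer triple identifiers.
class Dataset:
    def __init__(self):
        self._ids = {}      # triple -> internal identifier
        self._triples = []  # internal identifier -> triple
        self._graphs = {}   # graph name -> set of internal identifiers

    def add(self, graph, triple):
        """Add a triple to a named graph, reusing its identifier if known."""
        tid = self._ids.get(triple)
        if tid is None:
            tid = len(self._triples)
            self._ids[triple] = tid
            self._triples.append(triple)
        self._graphs.setdefault(graph, set()).add(tid)
        return tid

    def graph(self, name):
        """Return the triples of a named graph as the abstract model sees them."""
        return {self._triples[tid] for tid in self._graphs.get(name, ())}

    def stored_triples(self):
        """Number of triples physically stored, regardless of graph membership."""
        return len(self._triples)


ds = Dataset()
t = ("ex:alice", "foaf:knows", "ex:bob")
ds.add("ex:g1", t)
ds.add("ex:g2", t)  # same triple, second graph: no second copy is stored
ds.add("ex:g2", ("ex:bob", "foaf:name", "Bob"))
```

Both `ex:g1` and `ex:g2` contain the triple `t` in the abstract model, while the store holds only two triples in total.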

(Think of SQL views. In the abstract relational model, they contain data – but that data is merely computed on demand from underlying base tables. Implementations *may* materialize the view or an index over the view to speed up queries, but that doesn't mean that the view model forces the duplication of data. Named graphs could be views on other graphs in the same dataset.)
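
The view analogy can be sketched the same way. The following Python fragment assumes a toy in-memory dataset; `base`, `knows_view` and the sample triples are hypothetical names and data chosen for the example.

```python
# Hypothetical sketch: a named graph defined as a view over other graphs.
# The view stores nothing itself; its triples are computed on demand.
base = {
    "ex:g1": {("ex:alice", "foaf:knows", "ex:bob")},
    "ex:g2": {("ex:bob", "foaf:knows", "ex:carol"),
              ("ex:bob", "foaf:name", "Bob")},
}

def knows_view():
    """Yield every foaf:knows triple found in any base graph."""
    for triples in base.values():
        for s, p, o in triples:
            if p == "foaf:knows":
                yield (s, p, o)

before = set(knows_view())
# Changing a base graph changes the view without copying any data into it.
base["ex:g1"].add(("ex:carol", "foaf:knows", "ex:alice"))
after = set(knows_view())
```

An implementation would be free to materialize `knows_view` for speed, but nothing in the model requires it.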

> Well, I guess that I outlined already the disadvantages of these proposals (at least from my POV),

You mentioned two things, as far as I can see:

1. named graphs don't deal well with single-triple graphs
2. having the same triple in multiple graphs is not space-efficient

Regarding #1, you haven't shown anything to back that up. I'm still trying to understand what the perceived problem with single-triple named graphs is. You have not explained the problem beyond saying that you don't like the approach, which, with all due respect, I do not find compelling as an argument.

Regarding #2, it's probably false, because the RDF abstract syntax does not constrain implementations, and I'm unconvinced that an optimized implementation of your scheme would actually be more space-efficient than an optimized implementation of named graphs.

>>> - statements can be utilised in multiple graphs
>> 
>> This is possible in [1].
> 
> Via importing single-triple graphs into other graphs? (This looks somehow artificial to me, sorry).

No, by having two graphs that contain the same statement.

> However, I believe that there is a strong antipathy for single-triple graphs.

This is not a technical argument.

Best,
Richard
Received on Tuesday, 20 September 2011 15:17:19 UTC