Re: SPARQL, named graphs and default graph from Chimezie Ogbuji on 2006-09-13 (public-sparql-dev@w3.org from July to September 2006)

From: Chimezie Ogbuji <ogbujic@bio.ri.ccf.org>
Date: Wed, 13 Sep 2006 12:44:08 -0400 (EDT)
To: Nuutti Kotivuori <naked@iki.fi>
cc: public-sparql-dev@w3.org, eikeon@eikeon.com
Message-ID: <Pine.GSO.4.60.0609131212540.4070@joplin.bio.ri.ccf.org>

On Wed, 13 Sep 2006, Nuutti Kotivuori wrote:

> I think that in librdf, there are statements explicitly without a
> context. In SPARQL queries, the default graph is the merge of all
> statements in the store, with or without a context. Queries which
> explicitly match the graph in a variable never match statements
> without a context. And so there is no easy way to match all the
> statements without a context only.
>
> I'd like to know at least how rdflib and Jena (with whatever extensions
> that this requires) solve this issue.

RDFLib has two APIs: a Store API and a Graph API.  Every Graph (there
are several kinds: QuotedGraphs, ConjunctiveGraphs, named graphs,
AggregateGraphs, ...) is associated with a Store instance and an
identifier.  The identifier is either a Blank Node or a URI.

All the Store API methods take a fourth parameter, which is the containing
Graph (even the __len__ method).  So, theoretically, the Store can choose
to persist RDF triples in a flat space (i.e., the vanilla RDF model) and
disregard the fourth parameter, or use the identifier of the containing
graph to partition its persistence space accordingly.  It can even choose
to partition formulae separately (to support N3 persistence) based on the
kind of Graph passed down to it (it will receive QuotedGraph instances as
the fourth parameter in this case).

The Store.triples method returns a generator of ((s, p, o), graphInst)
pairs, so each Store implementation is expected to be able to associate
each triple with a containing graph (or None if the Store chooses to
persist triples in a flat space).
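
To make the two persistence choices concrete, here is a minimal sketch in
plain Python.  It is not RDFLib's actual Store API: the class names, the
use of plain strings as graph identifiers (instead of Graph instances) and
the None wildcard are all assumptions made for illustration.

```python
from collections import defaultdict


def _matches(pattern, triple):
    # None in the pattern acts as a wildcard for that position.
    return all(p is None or p == t for p, t in zip(pattern, triple))


class ToyPartitionedStore:
    """Hypothetical store that partitions triples by containing graph."""

    def __init__(self):
        self._by_context = defaultdict(set)

    def add(self, triple, context):
        self._by_context[context].add(triple)

    def triples(self, pattern, context=None):
        # context=None means "search every partition".
        contexts = list(self._by_context) if context is None else [context]
        for ctx in contexts:
            for triple in sorted(self._by_context.get(ctx, ())):
                if _matches(pattern, triple):
                    yield triple, ctx

    def __len__(self, context=None):
        # Even the length takes the containing graph as an optional argument.
        if context is None:
            return sum(len(ts) for ts in self._by_context.values())
        return len(self._by_context.get(context, ()))


class ToyFlatStore:
    """Hypothetical store that keeps one flat set and ignores the context."""

    def __init__(self):
        self._triples = set()

    def add(self, triple, context=None):
        self._triples.add(triple)  # the fourth parameter is disregarded

    def triples(self, pattern, context=None):
        for triple in sorted(self._triples):
            if _matches(pattern, triple):
                yield triple, None  # no containing graph is recorded

    def __len__(self, context=None):
        return len(self._triples)
```

Both classes expose the same call shape, so a caller iterating over
triples() does not need to know which persistence strategy is in use; it
only sees None in place of the graph when the store is flat.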

The Graph APIs do most of the legwork of named graph aggregation.
ConjunctiveGraph is an (unnamed) aggregation of all the named graphs within
the Store.  It has a 'default' graph, whose name is associated with the
ConjunctiveGraph throughout its life.  All methods work against this
default graph.  Its constructor can take an identifier to use as the name
of this 'default' graph, or it will assign a BNode.  In practice (at least
how *I* use RDFLib), I instantiate a ConjunctiveGraph if I want to add
triples to the Store but don't care to mint a URI for the graph (the
scenario which triggered this thread).  These triples can still be
addressed.
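
A rough sketch of that behaviour, again in plain Python rather than
RDFLib itself.  ToyGraph, ToyConjunctiveGraph and new_bnode are
hypothetical names, the store is just a dict keyed by graph identifier,
and the split "writes go to the default graph, reads span every graph" is
my assumption about how the aggregation plays out.

```python
import itertools

_bnode_ids = itertools.count(1)


def new_bnode():
    # Hypothetical helper: mints a fresh blank-node label.
    return "_:b%d" % next(_bnode_ids)


class ToyGraph:
    """A graph is a store plus an identifier (URI string or blank node)."""

    def __init__(self, store, identifier=None):
        self.store = store  # dict: identifier -> set of triples
        self.identifier = identifier if identifier is not None else new_bnode()

    def add(self, triple):
        self.store.setdefault(self.identifier, set()).add(triple)

    def triples(self):
        return set(self.store.get(self.identifier, ()))


class ToyConjunctiveGraph(ToyGraph):
    """Adds land in its own 'default' graph; reads see every graph."""

    def triples(self):
        return set().union(*self.store.values())


store = {}
ToyGraph(store, "http://example.org/g1").add(("a", "b", "c"))

cg = ToyConjunctiveGraph(store)        # no URI minted, so a BNode is assigned
cg.add(("x", "y", "z"))

# The triples added through cg are still addressable via its identifier.
same_graph = ToyGraph(store, cg.identifier)
```

Here same_graph.triples() sees only the triple added through cg, while
cg.triples() sees the contents of both graphs.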

ReadOnlyGraphAggregate covers a subset of what the ConjunctiveGraph
covers: the names of the graphs it provides an aggregate view over are
passed to the constructor.  This is how a SPARQL query with multiple
FROM NAMED clauses is supported.
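
As an illustration only, an aggregate restricted to a list of names could
look like the sketch below.  ToyReadOnlyAggregate and the dict-based store
are hypothetical stand-ins, not the RDFLib class or its signature.

```python
class ToyReadOnlyAggregate:
    """Hypothetical read-only view over the graphs named at construction."""

    def __init__(self, store, names):
        self.store = store          # dict: graph name -> set of triples
        self.names = list(names)    # e.g. the IRIs from the FROM NAMED clauses

    def triples(self):
        # Only the listed graphs are visible, each triple tagged with its graph.
        for name in self.names:
            for triple in sorted(self.store.get(name, ())):
                yield triple, name

    def add(self, triple):
        raise TypeError("this aggregate is read-only")


store = {
    "http://example.org/g1": {("a", "p", "1")},
    "http://example.org/g2": {("b", "p", "2")},
    "http://example.org/g3": {("c", "p", "3")},
}
view = ToyReadOnlyAggregate(
    store, ["http://example.org/g1", "http://example.org/g3"]
)
visible = list(view.triples())
```

The graph g2 stays in the store but is invisible through the view, which
is the effect two FROM NAMED clauses naming g1 and g3 would need.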

QuotedGraphs are meant to implement Notation 3 formulae.  They are
associated with a required identifier that the N3 parser must provide in
order to maintain consistent formulae identification for scenarios such as
implication.

The default dataset for SPARQL queries is equivalent to the Graph instance
on which the query is dispatched.  If the .query method is called on a
ConjunctiveGraph, the default dataset is the entire Store; if it's called
on a named graph, it's that named graph.
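
The scoping rule can be sketched as follows.  This is a toy in plain
Python, not a SPARQL engine: default_dataset and select_subjects are
hypothetical helpers, and passing graph_id=None stands in for dispatching
the query on a ConjunctiveGraph.

```python
def default_dataset(store, graph_id=None):
    """Triples a query would see, depending on where it is dispatched."""
    if graph_id is None:
        # Dispatched on the conjunctive graph: the entire store.
        return set().union(*store.values())
    # Dispatched on a named graph: only that graph.
    return set(store.get(graph_id, ()))


def select_subjects(store, predicate, graph_id=None):
    # Roughly "SELECT ?s WHERE { ?s <predicate> ?o }" over the default dataset.
    return sorted(s for s, p, o in default_dataset(store, graph_id) if p == predicate)


store = {
    "http://example.org/g1": {("alice", "knows", "bob")},
    "http://example.org/g2": {("carol", "knows", "dave")},
}
everyone = select_subjects(store, "knows")
only_g1 = select_subjects(store, "knows", "http://example.org/g1")
```

The same pattern therefore returns subjects from both graphs when run
against the whole store, and only those from g1 when run against g1.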

This setup supports:

- Flat space of triples
- Named Graph partitioning
- Notation 3 persistence

Chimezie Ogbuji
Lead Systems Analyst
Thoracic and Cardiovascular Surgery
Cleveland Clinic Foundation
9500 Euclid Avenue/ W26
Cleveland, Ohio 44195
Office: (216)444-8593
ogbujic@ccf.org

Received on Wednesday, 13 September 2006 16:44:21 UTC