- From: Nathan <nathan@webr3.org>
- Date: Thu, 04 Nov 2010 16:50:57 +0000
- To: RDFA Working Group <public-rdfa-wg@w3.org>
- CC: Ivan Herman <ivan@w3.org>, Manu Sporny <msporny@digitalbazaar.com>, Mark Birbeck <mark.birbeck@webbackplane.com>
Hi All,

Today in the weekly telecon we discussed ISSUE-52, and specifically a general conceptual issue which needs to be addressed before we can proceed with the open issues on the RDFa API.

The conceptual issue is whether to make a distinction between "RDF Graph" and "Data Store", and then to determine which of them the RDFa API requires. I think we can safely consider this issue a blocker wrt the RDFa API, so the sooner we can agree the better.

To perhaps bring clarity to the issue, I'll assert that there are in fact three distinct concepts:

- RDF Graph
  An interface representing a "set" of RDF Triples, an RDF Graph as defined in the RDF Specifications.

- RDF Graph Store
  An interface which allows the storage and retrieval of several distinct RDF Graphs.

- RDF Triple Store
  An interface which allows the storage and retrieval of RDF Triples, where the notion of distinct graphs has been discarded (and generally where provenance information has been removed).

Any RDF Graph interface will be somewhat akin to an Array or Sequence of RDF Triples, and can roughly be specified as:

  interface RDFGraph = sequence<RDFTriple> (or RDFTriple[])

Any RDF Graph Store interface will be similar to a Dictionary or simple Key/Value store which stores "RDF Graphs" against a certain key. This is similar to QuadStores and the notion of Named Graphs; roughly this would be specified as:

  interface GraphStore {
    RDFGraph get (in DOMString key);
    void set (in DOMString key, in RDFGraph value);
  }

However, and I believe this is where any "grey" conceptual areas have arisen in the existing RDFa API, an "RDF Triple Store" is very like an "RDF Graph"; in fact it's almost identical other than the fact it's persistent in some way.

  interface TripleStore ~= RDFGraph

So to begin addressing this issue, I'll first assert that the concepts of "RDF Graph" and "RDF Graph Store" are clearly distinct from each other, and that any RDFa or RDF API *requires* the concept of an "RDF Graph".

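As a rough illustration of the first two concepts, here is a minimal JavaScript sketch. It assumes a graph can be modelled as a plain Array of triple objects and a graph store as a keyed map of such arrays. The `triple` helper, the field names, the example IRIs and the `"default"` / `"processor"` keys are all hypothetical and not taken from any RDFa API draft; only `get` and `set` mirror the rough IDL above.

```javascript
// Hypothetical shapes for illustration only; not part of any RDFa API draft.

// A triple is a plain object; the field names are an assumption.
function triple(subject, predicate, object) {
  return { subject, predicate, object };
}

// An RDF Graph modelled as an ordinary Array of triples.
const defaultGraph = [
  triple("http://example.org/#nathan", "http://xmlns.com/foaf/0.1/name", "Nathan"),
];
const processorGraph = [];

// A Graph Store keeps whole graphs distinct from one another under a string key,
// mirroring get(key) / set(key, value) from the rough IDL.
class GraphStore {
  constructor() {
    this.graphs = new Map();
  }
  get(key) {
    return this.graphs.get(key);
  }
  set(key, graph) {
    this.graphs.set(key, graph);
  }
}

const store = new GraphStore();
store.set("default", defaultGraph);
store.set("processor", processorGraph);
```

The point of the sketch is only that the store hands back whole graphs, never individual triples, so the two graphs stay distinct.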
Additionally, RDFa Core *requires* the RDFa API to /at least/ support two instances of RDF Graph, the "default graph" and the "processor graph", and to provide clear access to both.

However, the RDFa API does not require the concept of an "RDF Graph Store", although clearly such a thing exists. It is required if one is to store multiple RDF graphs while keeping them distinct from one another, and when using such features as the "FROM" clause in SPARQL, thus we may be wise to mention or define it in some way.

Similarly I think we can quickly assert that an "RDF Graph Store" is distinct from an "RDF Triple Store": one handles distinct sets of triples (graphs), the other handles a single set of triples (a single graph). Thus, from this point on I'll remove "RDF Graph Store" from this discussion.

What remains is the slightly more complex "RDF Graph" vs "RDF Triple Store" distinction, where the grey area currently exists in the API.

Whilst an RDF Graph and an RDF Triple Store share many common features (both "contain" sets of RDF Triples, and both need to provide access to the triples), I believe that there are key distinctions we can make between the two.

First, an RDF Triple Store is an interface to a Store. The store may hold triples in memory and in the same environment, and there may be multiple stores, but critically the stores may also be located in a different environment, on a different tier or on an entirely different machine altogether. An RDF Graph, by contrast, is an interface which simply represents a set of RDF Triples. It's the same distinction we make between an array and a database; they are quite different.

However, we could also assert that any RDF Triple Store we'd specify would be constrained to be in the same environment, in memory, and thus these distinguishing features would essentially be lost, so we have to look deeper.

Next, an RDF Graph is in many ways (almost) immutable: it is a set of triples to which you can add more triples, but from which you cannot remove triples. The key methods on an RDF Graph are also non-mutating: a filter() will return a new RDF Graph (which can be considered a subgraph of the first), and a "merge" method will be more like a concatenation of two (or more) graphs, returning a third new graph.

An RDF Triple Store, by contrast, has no immutable characteristics: it requires methods to both add and remove triples, a "filter" is more like a "select", and a "merge" method is more of an "import". The important thing to note here is that an RDF Triple Store requires that any merge/import method add new triples to the store, whereas any similar method on a graph could be defined either way.

So, the key distinctions we have are that conceptually a store is persistent and may contain triples from many graphs, whereas a graph is just a graph/set of triples; and a store has functionality to remove triples, whereas a graph does not.

The other thing to note is that if we ignore the concept of an RDF Graph and instead use a Triple Store (or as we termed it, Data Store), will somebody else have to define the interface for an RDF Graph at a later date? And will our definition of a Store (possibly with no remove methods!) suit common usage of Stores in the wild, or will somebody have to define a better-suited interface for a Store?

To summarise, we need to agree on the answers to the following questions:

- Are an RDF Graph and an RDF Triple Store distinct?
- Can we use an RDF Triple Store instead of an RDF Graph in the API?
- Should we use an RDF Triple Store instead of an RDF Graph in the API?
- Which of the three interfaces should we define as part of the RDFa API [ RDF Graph, RDF Triple Store, RDF Graph Store ]?
- Which of the three interfaces might we define as part of a note?

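To make the graph/store distinction above a little more concrete, here is a small JavaScript sketch of the two behaviours. Every name in it (`Graph`, `TripleStore`, `importGraph`, `select`, `sameTriple`, the `ex:` and `foaf:` shorthand strings) is hypothetical, chosen only for illustration. It also assumes merge behaves as a set union that drops duplicates, and it ignores blank nodes entirely.

```javascript
// Hypothetical names throughout; a sketch of the distinction, not a proposed API.

const sameTriple = (a, b) =>
  a.subject === b.subject && a.predicate === b.predicate && a.object === b.object;

// Graph: triples can be added but never removed,
// and filter() / merge() return new graphs instead of changing this one.
class Graph {
  constructor(triples = []) {
    this.triples = [...triples];
  }
  add(t) {
    if (!this.triples.some((x) => sameTriple(x, t))) this.triples.push(t);
    return this;
  }
  filter(test) {
    return new Graph(this.triples.filter(test)); // a subgraph of this graph
  }
  merge(other) {
    const merged = new Graph(this.triples); // copy, so neither input changes
    other.triples.forEach((t) => merged.add(t));
    return merged;
  }
}

// Triple Store: fully mutable; "filter" becomes a select, "merge" becomes an import.
class TripleStore {
  constructor() {
    this.triples = [];
  }
  add(t) {
    if (!this.triples.some((x) => sameTriple(x, t))) this.triples.push(t);
  }
  remove(t) {
    this.triples = this.triples.filter((x) => !sameTriple(x, t));
  }
  select(test) {
    return this.triples.filter(test);
  }
  importGraph(graph) {
    graph.triples.forEach((t) => this.add(t)); // adds to the store itself
  }
}

const name = { subject: "ex:nathan", predicate: "foaf:name", object: "Nathan" };
const knows = { subject: "ex:nathan", predicate: "foaf:knows", object: "ex:manu" };

const g1 = new Graph([name]);
const g2 = new Graph([knows]);
const g3 = g1.merge(g2); // g1 and g2 are left unchanged

const tripleStore = new TripleStore();
tripleStore.importGraph(g3); // the store now holds both triples
tripleStore.remove(knows); // only the store can drop a triple
```

Note that once g3 has been imported, the store no longer knows which graph each triple came from, which is the loss of provenance mentioned earlier.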
And after that rather lengthy email, here's my personal opinion:

- Define an "RDF Graph" interface (aligned with Array in JavaScript).
- Expose a property or method on the DataParser interface which gives access to the "processor graph" as required by RDFa Core.
- Assert that we have two as-yet-undefined interfaces, "RDF Triple Store" and "RDF Graph Store".
- Clear the issues and get the next editor's draft of the RDFa API done.
- If we have time, define one or both of the Triple Store and Graph Store interfaces in a note.

However, we really do all need to agree and move forward on this matter.

Best,

Nathan
Received on Thursday, 4 November 2010 16:52:06 UTC