- From: Sandro Hawke <sandro@w3.org>
- Date: Fri, 27 Apr 2012 13:40:20 -0400
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: public-rdf-wg@w3.org
On Fri, 2012-04-27 at 14:15 +0100, Andy Seaborne wrote:
...
> >> This is a strong argument for a two strand approach:
...
> > Agreed, with the caveat that "minimal" may (and probably does) include
> > going a bit beyond what everyone considers "safe" and "tested" as of
> > today.
>
> Could you expand on that...? ... You touched on this a few times
> in different places but I'd find it useful to have a consolidated view
> from you.

Yes. Here's the complete design, below. I'll call it "6.3".

I think partial-graph semantics, which the group seems to prefer, are much more like quads, so I formulated it in those terms. I think it came out pretty nicely.

Rather than argue now why each of these elements is necessary, I'll wait and see if there are any bits you think we should put off to Part 2.

   -- Sandro

========

1. An RDF Dataset is a set of Dataset Entries, where each Dataset Entry is either an RDF Triple or an RDF "Quad". An RDF Quad is formed by pairing an RDF Triple with another RDF Term, called the Quad's "graph label" (or just "label"). The label is an RDF blank node or an RDF IRI-labeled node.

The set of Triples (not in Quads) in the dataset is called the dataset's "default graph". The set of Triples used in quads with a particular label in a dataset is called the "named graph" associated with that label. The set of triples which are in the default graph or in any named graph is called the "union graph".

Comments: I believe this definition is formally equivalent to the SPARQL definitions and the one in our draft, except (1) some minor terminology, (2) allowing blank nodes as graph labels, and (3) allowing blank nodes to be shared between the graphs.

I'm not attached to this formulation; I just needed some way to convey how blank nodes can be shared, and after experimenting a bit, quads seemed like the best way to think about it.
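As a rough illustration of the definitions in item 1, here is a minimal Python sketch. It is not part of the proposal: the names `Triple`, `Quad`, `default_graph`, `named_graph`, and `union_graph` are hypothetical, and RDF terms are simplified to plain strings.

```python
from typing import NamedTuple, Set, Union


class Triple(NamedTuple):
    """An RDF Triple; terms are plain strings in this sketch (an assumption)."""
    s: str
    p: str
    o: str


class Quad(NamedTuple):
    """A Triple paired with a graph label (an IRI or a blank node)."""
    triple: Triple
    label: str


# A Dataset Entry is either a Triple or a Quad; a dataset is a set of entries.
Entry = Union[Triple, Quad]


def default_graph(dataset: Set[Entry]) -> Set[Triple]:
    """Triples that appear in the dataset directly, not inside a Quad."""
    return {e for e in dataset if isinstance(e, Triple)}


def named_graph(dataset: Set[Entry], label: str) -> Set[Triple]:
    """Triples used in Quads that carry the given label."""
    return {e.triple for e in dataset
            if isinstance(e, Quad) and e.label == label}


def union_graph(dataset: Set[Entry]) -> Set[Triple]:
    """Triples in the default graph or in any named graph."""
    return {e if isinstance(e, Triple) else e.triple for e in dataset}


# Illustrative data only: one default-graph triple, two quads labeled <g1>,
# and one quad whose label is the blank node _:x, which also appears
# inside its own triple.
abc = Triple("<a>", "<b>", "<c>")
abd = Triple("<a>", "<b>", "<d>")
x1 = Triple("_:x", "<b>", '"1"')
dataset = {abc, Quad(abc, "<g1>"), Quad(abd, "<g1>"), Quad(x1, "_:x")}
```

Because every entry lives in one set, the blank node `_:x` is the same term wherever it occurs, which is how this formulation lets blank nodes be shared between graphs and used as labels.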
I expect the idea of allowing blank nodes to be used as graph labels to be controversial, but I think it's important for convenience and to clarify the semantics in the face of possible dereference operations. I understand it presents some issues, including SPARQL compatibility. I propose we consider this AT RISK through CR and see how those issues pan out.

2. Any dataset can be serialized in TriG, N-Quads, or potentially other languages. For example, the TriG Document:

   { <a> <b> <c> }
   <g1> { <a> <b> <c>, <d> }
   _:x { _:x <b> 1 }

is a serialization of the same dataset as the N-Quads document:

   <a> <b> <c>.
   <a> <b> <c> <g1>.
   <a> <b> <d> <g1>.
   _:x <b> "1"^^<http://www.w3.org/2001/XMLSchema#integer> _:x .

I propose we issue specs for both TriG and N-Quads to help clarify what is syntax and what is semantics, and because people seem to like both formats.

3. Datasets have truth values, like RDF Graphs. A dataset may be said to "hold" or to be "true". Within a system (or potentially on the open Web) a dataset may be "asserted", and there may be logical consequences from this. Datasets may entail each other, much as RDF Graphs may entail each other, and may be logically consistent or inconsistent, much as RDF Graphs may be logically consistent or inconsistent.

4. A Dataset is true if and only if (1) its default graph is true (according to the normal RDF semantics) and (2) all of its quads are true. A quad is true if and only if (1) its label denotes something (a "graph resource") which can conceptually "contain" RDF Triples, and (2) that graph resource conceptually "contains" at least the quad's Triple.

5. We do not define standard types of graph resources, leaving this open for research and future standards work. These types can be defined so they constrain what it means for a triple to be "contained" by a graph resource of this type. For example, one could define these classes:

   eg:Graph - the class of RDF Graphs.
   For the dataset semantics, a Triple is "contained" in exactly those cases where it is in the graph, as per RDF Semantics. From this definition, it follows that being "contained" cannot change over time, and two graph resources which are of type eg:Graph and are known to contain exactly the same triples are in fact the same graph resource.

   eg:Feed - the class of Web pages which serve only RDF and are updated to reflect changing circumstances. For the dataset semantics, a Triple is "contained" if, in some time window, all successful dereferences of the Feed's URL produce a serialization of an RDF Graph which contains the triple. A dataset using graph resources which are instances of eg:Feed would be time dependent, much like a FOAF file which uses the foaf:age predicate is time dependent.

======

That's it. (Unless I've forgotten something....)
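The truth condition in item 4 can be sketched in Python as follows. This is only an illustration under strong assumptions: the `resources` mapping stands in for an interpretation in which each label denotes an eg:Graph-style graph resource with a fixed set of contained triples, blank-node labels are treated as ordinary keys rather than existentially quantified, and `graph_is_true` is a hypothetical caller-supplied stand-in for the normal RDF semantics. None of these names come from the proposal.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, str]  # subject, predicate, object, graph label

# Hypothetical interpretation: label -> triples the graph resource "contains".
Resources = Dict[str, FrozenSet[Triple]]


def quad_is_true(quad: Quad, resources: Resources) -> bool:
    """True iff the label denotes a graph resource that contains at least
    the quad's triple (the resource may contain other triples too)."""
    s, p, o, label = quad
    contained = resources.get(label)
    return contained is not None and (s, p, o) in contained


def dataset_is_true(default_graph: Set[Triple],
                    quads: Iterable[Quad],
                    resources: Resources,
                    graph_is_true: Callable[[Set[Triple]], bool]) -> bool:
    """True iff the default graph is true and every quad is true."""
    return graph_is_true(default_graph) and all(
        quad_is_true(q, resources) for q in quads)


# Illustrative data: <g1> contains more than the dataset says about it,
# which is still fine under partial-graph ("at least") semantics.
resources: Resources = {
    "<g1>": frozenset({("<a>", "<b>", "<c>"),
                       ("<a>", "<b>", "<d>"),
                       ("<a>", "<b>", "<e>")}),
}
quads = [("<a>", "<b>", "<c>", "<g1>"), ("<a>", "<b>", "<d>", "<g1>")]
default = {("<a>", "<b>", "<c>")}


def accept_all(graph: Set[Triple]) -> bool:
    """Placeholder for RDF semantics: treats every default graph as true."""
    return True
```

With this data, `dataset_is_true(default, quads, resources, accept_all)` holds, while adding a quad whose label has no entry in `resources` makes the dataset false, since that label does not denote anything that can contain triples.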
Received on Friday, 27 April 2012 17:40:39 UTC