- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Thu, 7 Apr 2011 10:28:18 -0400
- To: Alex Hall <alexhall@revelytix.com>
- Cc: RDF WG <public-rdf-wg@w3.org>
* Alex Hall <alexhall@revelytix.com> [2011-04-07 04:47-0400] > Here is a proposal of a semantics for named graphs in RDF. My goals here > are to: > - Extend http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal to > go beyond (IRI, graph) tuples. > - Give something that is formally defined enough to serve as a starting > point for discussion. > - Specify common semantics for multi-graph serialization formats, or at > least a starting point. > - Specify something that is flexible enough to satisfy applications that > want to treat named graphs as either g-snaps or g-boxes. > > Regarding g-boxes, I specifically want to avoid incorporating anything that > suggests time variance into the semantics, because specifying semantics for > temporal changes is explicitly out of scope for the existing RDF Semantics > document. > > Here goes... > > 1. Graph Identification > Let I be an IRI. Define Graph(I) as a unary predicate such that Graph(I) > implies that the resource identified by I is an RDF graph. If desired, this > can be described easily enough in RDF by defining a new class rdfs:Graph and > mapping Graph(I) to the triple I rdf:type rdfs:Graph. Perhaps you didn't intend to take a position on this, but does "resource identified by I" imply http dererencing? I can think of three interpretations of Graph: 0a. Graph(I:IRI):Boolean = ∃ G:RDFGraph ∣ RDFParse(HTTPGet(I)) == G 0b. Graph(I:IRI):Boolean = ∃ G:RDFGraph ∣ m(I) == G, m a local map 0c. Graph(I:IRI):Boolean = ∃ G:RDFGraph ∣ M(I) == G, M an RDF-wide map Of course these definitions presume a position on whether two graphs with the same triples == each other. Perhaps "I is an RDF graph" says to use "is" and not "==". > Define G(I) as a function that returns the RDF graph identified by I. In > our parlance, G(I) is a g-snap, invariant over time. Due to the nature of > RDF, it is difficult to express the relationship between I and G(I) natively > in RDF. Graph literals, which I understand to be the encoding of some set > of triples as a single node in a graph, are one possible approach but this > proposal does not attempt to define graph literals. Furthermore, in the > open world it's not possible to have complete knowledge of all the triples > in G(I) for any given I. In order to make explicit the difference between the mapping and the graph name convention below, I'm using "GraphAt" for "G": 1a. GraphAt(I:IRI):RDFGraph = G ∣ RDFParse(HTTPGet(I)) == G 1b. GraphAt(I:IRI):Boolean = G ∣ m(I) == G, m a local map 1c. GraphAt(I:IRI):Boolean = G ∣ M(I) == G, M an RDF-wide map and define 0 in terms of 1: 0. Graph(I:IRI):Boolean = ∃ G:RDFGraph ∣ GraphAt(I) == G > 2. Graph Assertion > Let I be an IRI and G be an RDF graph. Define GA(I, G) as a binary > predicate such that GA(I, G) implies (a) Graph(I) and (b) G(I) entails G. 3. GA(I:IRI, G:RDFGraph):Boolean = ∃ G:RDFGraph ∣ Graph(I) && GraphAt(I) ⊢ G or 3. GA(I:IRI, G:RDFGraph):Boolean = ∃ G:RDFGraph ∣ GraphAt(I) ⊢ G > The notion of graph assertion attempts to capture the semantics of what > happens when some set of triples is associated with a graph IRI in a > multi-graph serialization such as TriG. So the TriG fragment: > > :G1 { :a :b :c } . > > would be understood to construct a graph G with a single triple :a :b :c and > then make the assertion GA(:G1 G). > > The use of "entails" as opposed to "equals" here is what gives us our > flexibility. Applications that want to treat named graphs as g-snaps, > completely described by the triples associated with the graph IRI, can do so > by extending (b) to say G(I) equals G instead of entails. Because every > graph entails itself, this extension is supported by these semantics, but > this would not be required behavior. Indeed, this could lead to trouble in > the open world where you can have GA(I, G1) and GA(I, G2) with G1 != G2. This might be awkward for tests cases 'cause <X> log:implies { <pi> <numericValue> 3.14, 3.0 . } . # the Bolslough simplification is a valid parse of { <pi> <numericValue> 3.14 . } I wonder if there's some way to move this beyond parsing (no suggestions yet). > Applications that want to treat named graphs as g-boxes would to so by > essentially maintaining a (time-sensitive) mapping of IRI I to graph G. > This aligns pretty closely with my understanding of the notion of graph > store from SPARQL 1.1 Update. Poking the g-box to obtain content (either a > g-text serialization or query results) amounts to asserting GA(I, G) for the > current value of G at some point in time. Given a new graph assertion for > an IRI that is already mapped in the store, an implementation could replace > the currently mapped graph with the new one (effectively discarding all > prior graph assertions) or merge them at its discretion; either approach > would be supported by these semantics. I guess this argues for 1b. GraphAt(I:IRI):Boolean = G ∣ m(I) == G, m a local map where we don't try to tell one person's GA to match another's. > Any vocabulary for specifying graph literals and attaching them to a graph > IRI in RDF would be defined as making a graph assertion, not setting the > value of the identified graph. > > 3. RDF Datasets > I haven't thought this part through entirely, but I think these semantics > could be aligned with the existing notion of RDF datasets from SPARQL (and > as proposed on the wiki) by simply mapping the (IRI, graph) tuples in the > dataset to the appropriate graph assertions. > > 4. Graph Equality > Because it is not the case that (G1 entails G and G2 entails G) implies G1 = > G2, it is also not the case that (GA(I1, G) and GA(I2, G)) implies I1 and I2 > are the same graph. Such a conclusion could be reached if you extend the > definition of GA to mean equals instead of entails as discussed before, but > again that is an extension and not part of the proposed semantics. > > 5. Empty Graphs > Because every graph trivially entails the empty graph E, the assertion GA(I, > E) is trivially true for every graph IRI I. Making that assertion doesn't > do anything beyond identify the resource denoted by I as a graph. > > 6. Graph Merges > It follows from the definition of GA (and the definition of entails) that > (GA(I, G1) and GA(I, G2)) implies GA(I, Merge(G1, G2)). I think this gives > us a pretty straightforward approach to merging of RDF datasets if this is > required of the spec. > > Hope you find this useful... or at least that this stirs up some > interesting debate. thanks for moving this forward. > Regards, > Alex -- -ericP
Received on Thursday, 7 April 2011 14:28:50 UTC