Re: [Graphs] Proposal for Named Graph Semantics

* Alex Hall <alexhall@revelytix.com> [2011-04-07 13:05-0400]
> On Thu, Apr 7, 2011 at 10:28 AM, Eric Prud'hommeaux <eric@w3.org> wrote:
> 
> > * Alex Hall <alexhall@revelytix.com> [2011-04-07 04:47-0400]
> > > Here is a proposal of a semantics for named graphs in RDF.  My goals here
> > > are to:
> > > - Extend
> > http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal to
> > > go beyond (IRI, graph) tuples.
> > > - Give something that is formally defined enough to serve as a starting
> > > point for discussion.
> > > - Specify common semantics for multi-graph serialization formats, or at
> > > least a starting point.
> > > - Specify something that is flexible enough to satisfy applications that
> > > want to treat named graphs as either g-snaps or g-boxes.
> > >
> > > Regarding g-boxes, I specifically want to avoid incorporating anything
> > that
> > > suggests time variance into the semantics, because specifying semantics
> > for
> > > temporal changes is explicitly out of scope for the existing RDF
> > Semantics
> > > document.
> > >
> > > Here goes...
> > >
> > > 1. Graph Identification
> > > Let I be an IRI.  Define Graph(I) as a unary predicate such that Graph(I)
> > > implies that the resource identified by I is an RDF graph.  If desired,
> > this
> > > can be described easily enough in RDF by defining a new class rdfs:Graph
> > and
> > > mapping Graph(I) to the triple I rdf:type rdfs:Graph.
> >
> > Perhaps you didn't intend to take a position on this, but does
> > "resource identified by I" imply http dererencing? I can think of
> > three interpretations of Graph:
> >
> >  0a. Graph(I:IRI):Boolean = ∃ G:RDFGraph ∣ RDFParse(HTTPGet(I)) == G
> >  0b. Graph(I:IRI):Boolean = ∃ G:RDFGraph ∣ m(I) == G, m a local map
> >  0c. Graph(I:IRI):Boolean = ∃ G:RDFGraph ∣ M(I) == G, M an RDF-wide map
> >
> 
> You're correct that I did not intend to take a position on HTTP
> dereferencing.  My personal preference is against including that as part of
> the spec.  It would be easy enough to specify HTTP dereferencing in 0a. as
> an extension of the general mapping in 0b. by simply defining m(I) =
> RDFParse(HTTPGet(I)).
> 
> I don't have a strong preference between 0b. and 0c. at the moment, so I'll
> defer taking a position on this.

My guess is that between b and c, b is the only tenable position. c
would imply some centralized resolution, which may as well lean on the
protocol part of the URL.


> > Of course these definitions presume a position on whether two graphs
> > with the same triples == each other. Perhaps "I is an RDF graph" says
> > to use "is" and not "==".
> >
> 
> Specifying graphs as g-snaps seems to imply that two graphs with the same
> triples == each other -- as expressed in [
> http://www.w3.org/2011/rdf-wg/wiki/GraphConceptTerminology]: "If two g-snaps
> have the same nodes/arcs, they are really the same g-snap."  I'll admit that
> I haven't fully thought through what the implications of this would be.
>  Given a notion of RDF node equality and a global mapping M as in 0c. I
> suppose you could infer I1 = I2 based on M(I1) == M(I2).

That looks like the reverse function to me, saying that
  <http://foo.example/> { <a> <b> <c> }
  <http://bar.example/> { <a> <b> <c> }
tells me that
  <http://foo.example/> = <http://bar.example/>

In the SPARQL WG, folks said they didn't want to *have to* dereference
HTTP URIs to serve SPARQL queries. I guess the local mapping will be
entirely up to the tool, but that for GET'able protocols, http:, ftp:,
file:, the user's expectation will be some graph corresponding to the
GET of that resource.


> > > Define G(I) as a function that returns the RDF graph identified by I.  In
> > > our parlance, G(I) is a g-snap, invariant over time.  Due to the nature
> > of
> > > RDF, it is difficult to express the relationship between I and G(I)
> > natively
> > > in RDF.  Graph literals, which I understand to be the encoding of some
> > set
> > > of triples as a single node in a graph, are one possible approach but
> > this
> > > proposal does not attempt to define graph literals.  Furthermore, in the
> > > open world it's not possible to have complete knowledge of all the
> > triples
> > > in G(I) for any given I.
> >
> > In order to make explicit the difference between the mapping and the
> > graph name convention below, I'm using "GraphAt" for "G":
> >
> >  1a. GraphAt(I:IRI):RDFGraph = G ∣ RDFParse(HTTPGet(I)) == G
> >  1b. GraphAt(I:IRI):Boolean = G ∣ m(I) == G, m a local map
> >  1c. GraphAt(I:IRI):Boolean = G ∣ M(I) == G, M an RDF-wide map
> >
> 
> Should 1b and 1c read "GraphAt(I:IRI):RDFGraph ..." (rather than Boolean)?

yup


> > and define 0 in terms of 1:
> >
> >  0.  Graph(I:IRI):Boolean = ∃ G:RDFGraph ∣ GraphAt(I) == G
> >
> >
> > > 2. Graph Assertion
> > > Let I be an IRI and G be an RDF graph.  Define GA(I, G) as a binary
> > > predicate such that GA(I, G) implies (a) Graph(I) and (b) G(I) entails G.
> >
> >   3. GA(I:IRI, G:RDFGraph):Boolean = ∃ G:RDFGraph ∣ Graph(I) && GraphAt(I)
> > ⊢ G
> >
> > or
> >
> >  3. GA(I:IRI, G:RDFGraph):Boolean = ∃ G:RDFGraph ∣ GraphAt(I) ⊢ G
> >
> 
> Is the existential quantifier necessary here?  Isn't G bound by GA(I:IRI,
> G:RDFGraph)?

quite right. if we can use ⊢ as a boolean predicate,

  3. GA(I:IRI, G:RDFGraph):Boolean = GraphAt(I) ⊢ G


> > > The notion of graph assertion attempts to capture the semantics of what
> > > happens when some set of triples is associated with a graph IRI in a
> > > multi-graph serialization such as TriG.  So the TriG fragment:
> > >
> > > :G1 { :a :b :c } .
> > >
> > > would be understood to construct a graph G with a single triple :a :b :c
> > and
> > > then make the assertion GA(:G1 G).
> > >
> > > The use of "entails" as opposed to "equals" here is what gives us our
> > > flexibility.  Applications that want to treat named graphs as g-snaps,
> > > completely described by the triples associated with the graph IRI, can do
> > so
> > > by extending (b) to say G(I) equals G instead of entails.  Because every
> > > graph entails itself, this extension is supported by these semantics, but
> > > this would not be required behavior.  Indeed, this could lead to trouble
> > in
> > > the open world where you can have GA(I, G1) and GA(I, G2) with G1 != G2.
> >
> > This might be awkward for tests cases 'cause
> >  <X> log:implies { <pi> <numericValue> 3.14, 3.0 . } . # the Bolslough
> > simplification
> > is a valid parse of
> >  { <pi> <numericValue> 3.14 . }
> > I wonder if there's some way to move this beyond parsing (no suggestions
> > yet).
> >
> >
> > > Applications that want to treat named graphs as g-boxes would to so by
> > > essentially maintaining a (time-sensitive) mapping of IRI I to graph G.
> > >  This aligns pretty closely with my understanding of the notion of graph
> > > store from SPARQL 1.1 Update.  Poking the g-box to obtain content (either
> > a
> > > g-text serialization or query results) amounts to asserting GA(I, G) for
> > the
> > > current value of G at some point in time.  Given a new graph assertion
> > for
> > > an IRI that is already mapped in the store, an implementation could
> > replace
> > > the currently mapped graph with the new one (effectively discarding all
> > > prior graph assertions) or merge them at its discretion; either approach
> > > would be supported by these semantics.
> >
> > I guess this argues for
> >  1b. GraphAt(I:IRI):Boolean = G ∣ m(I) == G, m a local map
> > where we don't try to tell one person's GA to match another's.
> >
> 
> I see your point here, although again I'll defer taking a position on the
> matter.  Could we write that:
> 
> 1b. GraphAt(I:IRI):Boolean = G ∣ ∃ m:GraphMap | m(I) == G
> 
> Is this correct?  Does this just confuse the matter?

hmm, it does seem to be a necessary parameter, like

  GraphAt(I:IRI, m:GraphMap):Boolean = G | m(I) == G

(or maybe
   m.GraphAt(I:IRI):Boolean = G | m(I) == G
 but that seems like unwarranted notation)


> BTW, thanks for introducing the logical notation.  This is my first stab at
> writing such a document, I haven't yet located those keys on my keyboard :-)

Maybe they're way out there beyond the '~' somewhere, but I just copy
them from
  http://www.fileformat.info/info/unicode/block/mathematical_operators/utf8test.htm

-- 
-ericP

Received on Thursday, 7 April 2011 18:20:56 UTC