Re: [Graphs] Proposal for Named Graph Semantics from Alex Hall on 2011-04-07 (public-rdf-wg@w3.org from April 2011)

From: Alex Hall <alexhall@revelytix.com>
Date: Thu, 7 Apr 2011 16:56:37 -0400
To: Richard Cyganiak <richard@cyganiak.de>
Cc: RDF WG <public-rdf-wg@w3.org>
Message-ID: <BANLkTi=ZaR6eEC+adz0YBStY5JCr6bBtsQ@mail.gmail.com>
On Thu, Apr 7, 2011 at 4:22 PM, Richard Cyganiak <richard@cyganiak.de> wrote:

> Thanks Alex.
>
> Trying to paraphrase: The RDF-Datasets-Proposal is just an abstract syntax
> and leaves open the question what the relationship between IRI and graph in
> the (IRI, graph) pair is. You are unsatisfied with that and would like to
> see a formal semantics specified for that relationship.
>

I would reword that last sentence as: I am uncomfortable with that and have
been trying to formulate a formal semantics that expresses my understanding
of that relationship.  I am not entirely convinced myself that this needs to
be part of the official specification, as clearly you are not either.  But
there's always been a certain amount of hand-waving around the area of named
graphs that engenders skepticism in some people.


>
> My next questions:
>
> Why do you want a formal semantics?
>

In short, to help me sleep better at night.  And if that's all anybody else
finds this useful for, then by all means leave it out of the spec.


>
> What is the benefit of having that?
>
> How does it help addressing the multigraph use cases [1]?
>

Taking a quick look through that doc, I'll admit that for most of the use
cases it doesn't address any issues that aren't addressed with the use of
the abstract syntax.  There is, however, a section titled "Providing a
standard foundation for the W3C specs" [2] and a sub-section on the
alignment of Linked Data principles with RDF and AWWW which specifically
states:

'A formal definition of a concept such as "RDF Dataset", "Set of Named
Graphs", and the g-box/g-snap/g-text distinction in a core RDF spec would
make it easier to formally define such a model, tying together the Linked
Data principles and practices, Architecture of the World Wide Web, and the
REST model of information resources and representations, in a formal way.'

I make no claims that this proposal advances that goal in any particular
way, I just highlight that section as an illustration that at least some
people seem to want a more formal semantics.

>
>
> Do you feel that delivering a solution that “merely” addresses the use
> cases would be insufficient?
>

Not at all -- I think an ideal solution addresses exactly those use cases
and nothing more.


>
> If that's the case, then can you articulate these extra requirements?
>
> (Please don't take this wrong, I'm trying to get unspoken assumptions out
> into the open.)
>

No worries.  This is a conversation worth having.


>
> Thanks,
> Richard
>
> [1] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs-UC
>
>
-Alex

[2]
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs-UC#Providing_a_standard_foundation_for_W3C_specs


>
>
> On 7 Apr 2011, at 21:05, Alex Hall wrote:
>
> > On Thu, Apr 7, 2011 at 3:33 PM, Richard Cyganiak <richard@cyganiak.de>
> wrote:
> > Hi Alex,
> >
> > Thanks, that's quite useful. Before commenting on it, I'd like to ask
> though what the motivation is.
> >
> > The primary motivation was to attempt to give a coherent answer to the
> questions, "what exactly *is* the resource identified by a graph IRI?" and
> "what is the relationship between that resource and the <IRI, Graph> tuples
> expressed in an RDF dataset/TriG file?"  I get the sense that I'm not the
> only one who's struggled with these questions.  Maybe they're questions that
> the WG will not attempt to answer, but if we do then hopefully the WG will
> find this proposal useful.  At the very least, I think this proposal does a
> good job of expressing the world as I understand it and committing it to
> document form has helped me clarify my thoughts on the matter.
> >
> >
> > On 7 Apr 2011, at 09:47, Alex Hall wrote:
> > > Here is a proposal of a semantics for named graphs in RDF.  My goals
> here are to:
> > > - Extend
> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal to go
> beyond (IRI, graph) tuples.
> >
> > Why do you want to go beyond (IRI, graph) tuples? What's the advantage?
> >
> > The RDF Datasets proposal explicitly avoids making claims about the
> relationship between the IRI and the graph.
> >
> >
> > > - Give something that is formally defined enough to serve as a starting
> point for discussion.
> >
> > Do you find that the RDF-Datasets-Proposal isn't sufficiently formal to
> serve as a starting point?
> >
> > Formal enough as an abstract syntax for capturing a multi-graph snapshot
> (a multi-g-snap, if you will).  But I don't see it going very far beyond
> that.
> >
> >
> > > - Specify common semantics for multi-graph serialization formats, or at
> least a starting point.
> >
> > Do you find that the RDF-Datasets-Proposal isn't sufficient for that?
> >
> > > - Specify something that is flexible enough to satisfy applications
> that want to treat named graphs as either g-snaps or g-boxes.
> >
> > In the simplest form of the RDF-Datasets-Proposal, one could either leave
> that open, or see (IRI, g-snap) pairs as snapshots of an assumed underlying
> named g-box that isn't explicitly modeled. So I'd say that the
> RDF-Datasets-Proposal is flexible enough to satisfy applications that want
> to treat named graphs as either g-snaps or g-boxes.
> >
> > So again, I don't understand the advantage that this proposal offers over
> the simpler (IRI, graph) pair proposal.
> >
> > Again, I see the RDF Datasets proposal as essentially an abstract syntax
> for expressing multi-graph snapshots, with little to say about the
> underlying semantics.  There is something to be said for not over-specifying
> the semantics, but at the time the RDF Datasets proposal was put forth I got
> the impression that people in the WG thought the semantics were
> under-specified.  This proposal is an attempt to capture the semantics of
> what is named in a named graph, if indeed that is an appropriate question to
> answer.
> >
> > -Alex
> >
> >
> > Can you speak to that?
> >
> > Best,
> > Richard
> >
> >
> >
> > >
> > > Regarding g-boxes, I specifically want to avoid incorporating anything
> that suggests time variance into the semantics, because specifying semantics
> for temporal changes is explicitly out of scope for the existing RDF
> Semantics document.
> > >
> > > Here goes...
> > >
> > > 1. Graph Identification
> > > Let I be an IRI.  Define Graph(I) as a unary predicate such that
> Graph(I) implies that the resource identified by I is an RDF graph.  If
> desired, this can be described easily enough in RDF by defining a new class
> rdfs:Graph and mapping Graph(I) to the triple I rdf:type rdfs:Graph.
> > >
> > > Define G(I) as a function that returns the RDF graph identified by I.
>  In our parlance, G(I) is a g-snap, invariant over time.  Due to the nature
> of RDF, it is difficult to express the relationship between I and G(I)
> natively in RDF.  Graph literals, which I understand to be the encoding of
> some set of triples as a single node in a graph, are one possible approach
> but this proposal does not attempt to define graph literals.  Furthermore,
> in the open world it's not possible to have complete knowledge of all the
> triples in G(I) for any given I.
> > >
> > > 2. Graph Assertion
> > > Let I be an IRI and G be an RDF graph.  Define GA(I, G) as a binary
> predicate such that GA(I, G) implies (a) Graph(I) and (b) G(I) entails G.
> > >
> > > The notion of graph assertion attempts to capture the semantics of what
> happens when some set of triples is associated with a graph IRI in a
> multi-graph serialization such as TriG.  So the TriG fragment:
> > >
> > > :G1 { :a :b :c } .
> > >
> > > would be understood to construct a graph G with a single triple :a :b
> :c and then make the assertion GA(:G1, G).
> > >
> > > The use of "entails" as opposed to "equals" here is what gives us our
> flexibility.  Applications that want to treat named graphs as g-snaps,
> completely described by the triples associated with the graph IRI, can do so
> by extending (b) to say G(I) equals G instead of entails.  Because every
> graph entails itself, this extension is supported by these semantics, but
> this would not be required behavior.  Indeed, this could lead to trouble in
> the open world where you can have GA(I, G1) and GA(I, G2) with G1 != G2.
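
A minimal sketch of the entails-versus-equals distinction described above, not part of the original proposal. It assumes ground graphs only (no blank nodes), in which case simple entailment reduces to a subset test. The names Triple, Graph, entails, satisfies_entails and satisfies_equals are hypothetical and chosen for illustration.

```python
from typing import FrozenSet, Iterable, Tuple

# Assumption: triples are plain (subject, predicate, object) string tuples
# and graphs are ground (no blank nodes).
Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def entails(g1: Graph, g2: Graph) -> bool:
    """Simple entailment for ground graphs: g1 entails g2 iff g2 is a subset of g1."""
    return g2 <= g1


def satisfies_entails(candidate: Graph, asserted: Iterable[Graph]) -> bool:
    """Could `candidate` be G(I), given assertions GA(I, G) read as 'G(I) entails G'?"""
    return all(entails(candidate, g) for g in asserted)


def satisfies_equals(candidate: Graph, asserted: Iterable[Graph]) -> bool:
    """Same check under the stricter extension 'G(I) equals G'."""
    return all(candidate == g for g in asserted)


# Two different graphs asserted for the same graph IRI.
g1: Graph = frozenset({(":a", ":b", ":c")})
g2: Graph = frozenset({(":a", ":b", ":d")})
candidate: Graph = g1 | g2

entails_ok = satisfies_entails(candidate, [g1, g2])
equals_ok = any(satisfies_equals(c, [g1, g2]) for c in (g1, g2, candidate))
```

Under these assumptions the union of the two asserted graphs satisfies both assertions when GA is read as "entails", while none of the three graphs tried satisfies both when GA is read as "equals", which is the open-world trouble mentioned above.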
> > >
> > > Applications that want to treat named graphs as g-boxes would do so by
> essentially maintaining a (time-sensitive) mapping of IRI I to graph G.
>  This aligns pretty closely with my understanding of the notion of graph
> store from SPARQL 1.1 Update.  Poking the g-box to obtain content (either a
> g-text serialization or query results) amounts to asserting GA(I, G) for the
> current value of G at some point in time.  Given a new graph assertion for
> an IRI that is already mapped in the store, an implementation could replace
> the currently mapped graph with the new one (effectively discarding all
> prior graph assertions) or merge them at its discretion; either approach
> would be supported by these semantics.
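
The g-box behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SPARQL 1.1 Update graph store API: the class GraphStore, its methods assert_graph and poke, and the "replace"/"merge" policy names are all hypothetical, and graphs are again assumed to be ground so that a merge is a plain set union.

```python
from typing import Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


class GraphStore:
    """Hypothetical g-box store: a mutable mapping from graph IRI to its current graph."""

    def __init__(self, policy: str = "replace") -> None:
        if policy not in ("replace", "merge"):
            raise ValueError("policy must be 'replace' or 'merge'")
        self.policy = policy
        self._graphs: Dict[str, Graph] = {}

    def assert_graph(self, iri: str, graph: Graph) -> None:
        """Handle a new graph assertion GA(iri, graph) according to the store policy."""
        current = self._graphs.get(iri, frozenset())
        if self.policy == "merge":
            # Keep prior assertions; valid as a merge only for ground graphs.
            self._graphs[iri] = current | graph
        else:
            # Discard all prior graph assertions for this IRI.
            self._graphs[iri] = graph

    def poke(self, iri: str) -> Graph:
        """Return the current g-snap for iri (the empty graph if nothing is mapped)."""
        return self._graphs.get(iri, frozenset())


g1: Graph = frozenset({(":a", ":b", ":c")})
g2: Graph = frozenset({(":a", ":b", ":d")})

replacing = GraphStore("replace")
merging = GraphStore("merge")
for store in (replacing, merging):
    store.assert_graph(":G1", g1)
    store.assert_graph(":G1", g2)
```

After both assertions, the replacing store holds only g2 for :G1 and the merging store holds the union of g1 and g2; either result is a graph for which the most recent assertion GA(:G1, g2) holds under the "entails" reading.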
> > >
> > > Any vocabulary for specifying graph literals and attaching them to a
> graph IRI in RDF would be defined as making a graph assertion, not setting
> the value of the identified graph.
> > >
> > > 3. RDF Datasets
> > > I haven't thought this part through entirely, but I think these
> semantics could be aligned with the existing notion of RDF datasets from
> SPARQL (and as proposed on the wiki) by simply mapping the (IRI, graph)
> tuples in the dataset to the appropriate graph assertions.
> > >
> > > 4. Graph Equality
> > > Because it is not the case that (G1 entails G and G2 entails G) implies
> G1 = G2, it is also not the case that (GA(I1, G) and GA(I2, G)) implies I1
> and I2 identify the same graph.  Such a conclusion could be reached if you extend
> the definition of GA to mean equals instead of entails as discussed before,
> but again that is an extension and not part of the proposed semantics.
> > >
> > > 5. Empty Graphs
> > > Because every graph trivially entails the empty graph E, the assertion
> GA(I, E) is trivially true for every graph IRI I.  Making that assertion
> doesn't do anything beyond identifying the resource denoted by I as a graph.
> > >
> > > 6. Graph Merges
> > > It follows from the definition of GA (and the definition of entails)
> that (GA(I, G1) and GA(I, G2)) implies GA(I, Merge(G1, G2)).  I think this
> gives us a pretty straightforward approach to merging of RDF datasets if
> this is required of the spec.
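
The merge property can be checked exhaustively on a small example. The sketch below is only a sanity check under the same simplifying assumption as before (ground graphs, so entailment is a subset test and Merge is set union); it is not a proof for graphs with blank nodes. The names powerset, merge and merge_property_holds are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def entails(g1: Graph, g2: Graph) -> bool:
    """Simple entailment for ground graphs: subset test."""
    return g2 <= g1


def merge(g1: Graph, g2: Graph) -> Graph:
    """Merge of ground graphs: plain union (no blank nodes to rename apart)."""
    return g1 | g2


def powerset(triples: Iterable[Triple]) -> List[Graph]:
    """All graphs that can be built from the given triples."""
    items = list(triples)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


def merge_property_holds(universe: Iterable[Triple]) -> bool:
    """Check: (GA(I, G1) and GA(I, G2)) implies GA(I, Merge(G1, G2)),
    for every candidate G(I), G1 and G2 drawn from the universe."""
    graphs = powerset(universe)
    return all(
        entails(candidate, merge(g1, g2))
        for candidate in graphs
        for g1 in graphs
        for g2 in graphs
        if entails(candidate, g1) and entails(candidate, g2)
    )


universe = [(":a", ":b", ":c"), (":a", ":b", ":d"), (":x", ":y", ":z")]
holds = merge_property_holds(universe)
```

The check enumerates all 8 graphs over three triples and tests every combination of candidate, G1 and G2; for ground graphs it succeeds because a set that contains two subsets also contains their union.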
> > >
> > > Hope you find this useful...  or at least that this stirs up some
> interesting debate.
> > >
> > > Regards,
> > > Alex
> > >
> >
> >
>
>
Received on Thursday, 7 April 2011 20:57:06 UTC