Re: [Graphs] Proposal for Named Graph Semantics from Alex Hall on 2011-04-08 (public-rdf-wg@w3.org from April 2011)

From: Alex Hall <alexhall@revelytix.com>
Date: Fri, 8 Apr 2011 10:48:53 -0400
To: antoine.zimmermann@insa-lyon.fr
Cc: public-rdf-wg@w3.org
Message-ID: <BANLkTimTXoqAdo+A=NGWrLWFy9dJQAjmfQ@mail.gmail.com>
On Fri, Apr 8, 2011 at 10:22 AM, Antoine Zimmermann <
antoine.zimmermann@insa-lyon.fr> wrote:

> I propose something that goes very much in the same direction as Alex's
> proposal:
>
> Definition(Graph map) A graph map GM is a partial function from the set of
> IRIs to the set of g-snaps.
>
> Definition(Temporal graph map) A temporal graph map TGM is a partial
> function from [-inf,+inf] which maps a time point to a graph map.
>
> There exists a special temporal graph map, called the HTTPmap, such that at
> any given point in time t, HTTPmap(t) maps a URI to the parsed RDF graph of
> the document retrieved via an HTTP GET of the URI at the time t. (if HTTP
> GET does not provide an RDF serialisation, then the mapping is not defined
> on that URI).
>
> A graph map is application-specific and can be static or temporal. If an
> application implements a versioning system, it is possible to use a temporal
> graph map with a time parameter in the past. However, the default behaviour
> should be to use a temporal graph map with the current time.
>
> Now, to implement a graph map different from the HTTPmap, one way is to use
> a format like TriG or Quads where the mapping is syntactically made explicit
> in a file:
>
> :G1 {:x :y :z}
> :G2 {:a :b :c}
>
> may be used to say that GM(:G1) = {:x :y :z} and GM(:G2) = {:a :b :c}.
>
> Now, the connection with Dataset and its semantics is as follows: a set of
> URIs (u1, ..., un) and an optional default g-snap G induce a dataset (G,
> <u1,GM(u1)>, ... , <un,GM(un)>).
>
> The rest of the semantics is as in the current Wikipage.
>
> Basically, this says that a dataset is a snapshot of the data you can get
> by looking up certain URIs ("looking up" not necessarily meaning "using
> HTTP"). The semantics in the wikipage just says that each graph in the
> dataset is interpreted independently (this can be further constrained by
> semantic extensions, just like RDFS further constrains the RDF
> interpretations).
>
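
To make the quoted definitions concrete, here is a minimal Python sketch; it
is not part of either proposal. It assumes a g-snap can be modelled as a
frozenset of (subject, predicate, object) string triples, a graph map as a
dict, and a temporal graph map as a function from a numeric time point to a
graph map. The names induce_dataset and versioned_map, the cut-over time of
10, and the choice to skip IRIs on which the map is undefined are all
illustrative assumptions.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]                       # a g-snap: immutable set of triples
GraphMap = Dict[str, Graph]                     # partial function IRI -> g-snap
TemporalGraphMap = Callable[[float], GraphMap]  # time point -> graph map
Dataset = Tuple[Graph, Tuple[Tuple[str, Graph], ...]]


def induce_dataset(gm: GraphMap, iris: Iterable[str],
                   default: Graph = frozenset()) -> Dataset:
    """Build (G, <u1,GM(u1)>, ..., <un,GM(un)>).

    Assumption: IRIs on which the partial function gm is undefined are skipped.
    """
    named = tuple((u, gm[u]) for u in iris if u in gm)
    return (default, named)


def versioned_map(t: float) -> GraphMap:
    """Hypothetical temporal graph map: :G1 gains a triple at t = 10."""
    g1 = {(":x", ":y", ":z")}
    if t >= 10:
        g1.add((":x", ":y", ":w"))
    return {":G1": frozenset(g1), ":G2": frozenset({(":a", ":b", ":c")})}


# Two snapshots of the same IRIs, taken at different time points.
past = induce_dataset(versioned_map(5), [":G1", ":G2"])
now = induce_dataset(versioned_map(20), [":G1", ":G2"])
```

The two datasets name the same IRIs but pair :G1 with different g-snaps,
which is the time dependence the reply below contrasts with a time-invariant
reading.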

I'd say this is a different direction from my proposal.  The fundamental
difference is that my graphs and graph map are invariant over time, and
that the presence of an <IRI,G> pair in a dataset makes an assertion as
to the (partial) content of the graph mapped by that IRI.  The reason I say
partial is that in the open world, we can never assume to have a full
description of any resource, and I extend that to include graphs named with
IRIs.
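
As a rough illustration of the assertion reading, the Python sketch below
assumes ground graphs (no blank nodes), so that simple entailment reduces to
set inclusion and a merge is a plain union. The function names entails,
consistent_with, and merge are hypothetical and are not taken from any RDF
library.

```python
from typing import FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def entails(g: Graph, h: Graph) -> bool:
    """Simple entailment for ground graphs: g entails h iff h is a subset of g."""
    return h <= g


def consistent_with(candidate: Graph, assertions: Iterable[Graph]) -> bool:
    """True if candidate could be G(I), given each asserted G in GA(I, G)."""
    return all(entails(candidate, g) for g in assertions)


def merge(g1: Graph, g2: Graph) -> Graph:
    """Merge of ground graphs is plain union (no blank nodes to rename)."""
    return g1 | g2


g1 = frozenset({(":a", ":b", ":c")})
g2 = frozenset({(":a", ":b", ":d")})
# A possible full content of the named graph, never completely known to us.
full = frozenset({(":a", ":b", ":c"), (":a", ":b", ":d"), (":e", ":f", ":g")})

# A graph satisfying GA(I, g1) and GA(I, g2) also satisfies GA(I, merge(g1, g2)).
assert consistent_with(full, [g1, g2])
assert entails(full, merge(g1, g2))
# Every graph entails the empty graph, so GA(I, E) adds no content.
assert entails(g1, frozenset())
# Under an "equals" reading, no single graph could satisfy both assertions.
assert g1 != g2
```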

You seem to be treating your map as a mutable data structure in some
programming language, from which we can retrieve graphs.  I'm not saying
that's wrong, and from a practical standpoint, e.g. for answering queries,
there's probably no difference.
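
The mutable-map reading can be sketched in a few lines of Python. The
GraphStore class below is hypothetical: a toy store, assumed for
illustration, that either merges a newly asserted graph into the current
content or replaces it, and whose poke method returns an immutable snapshot
of the current content. Returning the empty graph for an unmapped IRI is
also an assumption.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


class GraphStore:
    """Hypothetical mutable IRI -> graph mapping (one g-box per IRI)."""

    def __init__(self, merge_on_assert: bool) -> None:
        self.merge_on_assert = merge_on_assert
        self._graphs: Dict[str, Set[Triple]] = {}

    def assert_graph(self, iri: str, triples: Iterable[Triple]) -> None:
        """Record a graph assertion, merging or replacing per store policy."""
        if self.merge_on_assert and iri in self._graphs:
            self._graphs[iri] |= set(triples)
        else:
            self._graphs[iri] = set(triples)

    def poke(self, iri: str) -> FrozenSet[Triple]:
        """Return a snapshot of the current content (empty if unmapped)."""
        return frozenset(self._graphs.get(iri, set()))


merging = GraphStore(merge_on_assert=True)
replacing = GraphStore(merge_on_assert=False)
for store in (merging, replacing):
    store.assert_graph(":G1", {(":a", ":b", ":c")})
    store.assert_graph(":G1", {(":a", ":b", ":d")})

# The merging store keeps both triples; the replacing store keeps only the last.
assert len(merging.poke(":G1")) == 2
assert replacing.poke(":G1") == frozenset({(":a", ":b", ":d")})
```

A query evaluated against either snapshot sees an ordinary set of triples,
which is why the two readings are hard to tell apart in practice.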

Not trying to make a judgment here, just pointing out how they are different
approaches.

-Alex



>
>
> AZ.
>
> On 07/04/2011 10:47, Alex Hall wrote:
>
>> Here is a proposal of a semantics for named graphs in RDF.  My goals here
>> are to:
>> - Extend
>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal to
>> go beyond (IRI, graph) tuples.
>> - Give something that is formally defined enough to serve as a starting
>> point for discussion.
>> - Specify common semantics for multi-graph serialization formats, or at
>> least a starting point.
>> - Specify something that is flexible enough to satisfy applications that
>> want to treat named graphs as either g-snaps or g-boxes.
>>
>> Regarding g-boxes, I specifically want to avoid incorporating anything that
>> suggests time variance into the semantics, because specifying semantics for
>> temporal changes is explicitly out of scope for the existing RDF Semantics
>> document.
>>
>> Here goes...
>>
>> 1. Graph Identification
>> Let I be an IRI.  Define Graph(I) as a unary predicate such that Graph(I)
>> implies that the resource identified by I is an RDF graph.  If desired, this
>> can be described easily enough in RDF by defining a new class rdfs:Graph and
>> mapping Graph(I) to the triple I rdf:type rdfs:Graph.
>>
>> Define G(I) as a function that returns the RDF graph identified by I.  In
>> our parlance, G(I) is a g-snap, invariant over time.  Due to the nature of
>> RDF, it is difficult to express the relationship between I and G(I) natively
>> in RDF.  Graph literals, which I understand to be the encoding of some set
>> of triples as a single node in a graph, are one possible approach but this
>> proposal does not attempt to define graph literals.  Furthermore, in the
>> open world it's not possible to have complete knowledge of all the triples
>> in G(I) for any given I.
>>
>> 2. Graph Assertion
>> Let I be an IRI and G be an RDF graph.  Define GA(I, G) as a binary
>> predicate such that GA(I, G) implies (a) Graph(I) and (b) G(I) entails G.
>>
>> The notion of graph assertion attempts to capture the semantics of what
>> happens when some set of triples is associated with a graph IRI in a
>> multi-graph serialization such as TriG.  So the TriG fragment:
>>
>> :G1 { :a :b :c } .
>>
>> would be understood to construct a graph G with a single triple :a :b :c and
>> then make the assertion GA(:G1, G).
>>
>> The use of "entails" as opposed to "equals" here is what gives us our
>> flexibility.  Applications that want to treat named graphs as g-snaps,
>> completely described by the triples associated with the graph IRI, can do so
>> by extending (b) to say G(I) equals G instead of entails.  Because every
>> graph entails itself, this extension is supported by these semantics, but
>> this would not be required behavior.  Indeed, this could lead to trouble in
>> the open world where you can have GA(I, G1) and GA(I, G2) with G1 != G2.
>>
>> Applications that want to treat named graphs as g-boxes would do so by
>> essentially maintaining a (time-sensitive) mapping of IRI I to graph G.
>> This aligns pretty closely with my understanding of the notion of graph
>> store from SPARQL 1.1 Update.  Poking the g-box to obtain content (either a
>> g-text serialization or query results) amounts to asserting GA(I, G) for the
>> current value of G at some point in time.  Given a new graph assertion for
>> an IRI that is already mapped in the store, an implementation could replace
>> the currently mapped graph with the new one (effectively discarding all
>> prior graph assertions) or merge them at its discretion; either approach
>> would be supported by these semantics.
>>
>> Any vocabulary for specifying graph literals and attaching them to a graph
>> IRI in RDF would be defined as making a graph assertion, not setting the
>> value of the identified graph.
>>
>> 3. RDF Datasets
>> I haven't thought this part through entirely, but I think these semantics
>> could be aligned with the existing notion of RDF datasets from SPARQL (and
>> as proposed on the wiki) by simply mapping the (IRI, graph) tuples in the
>> dataset to the appropriate graph assertions.
>>
>> 4. Graph Equality
>> Because it is not the case that (G1 entails G and G2 entails G) implies
>> G1 = G2, it is also not the case that (GA(I1, G) and GA(I2, G)) implies
>> I1 and I2 are the same graph.  Such a conclusion could be reached if you
>> extend the definition of GA to mean equals instead of entails as discussed
>> before, but again that is an extension and not part of the proposed
>> semantics.
>>
>> 5. Empty Graphs
>> Because every graph trivially entails the empty graph E, the assertion
>> GA(I, E) is trivially true for every graph IRI I.  Making that assertion
>> doesn't do anything beyond identify the resource denoted by I as a graph.
>>
>> 6. Graph Merges
>> It follows from the definition of GA (and the definition of entails) that
>> (GA(I, G1) and GA(I, G2)) implies GA(I, Merge(G1, G2)).  I think this gives
>> us a pretty straightforward approach to merging of RDF datasets if this is
>> required of the spec.
>>
>> Hope you find this useful...  or at least that this stirs up some
>> interesting debate.
>>
>> Regards,
>> Alex
>>
>>
>
> --
> Antoine Zimmermann
> Researcher at:
> Laboratoire d'InfoRmatique en Image et Systèmes d'information
> Database Group
> 7 Avenue Jean Capelle
> 69621 Villeurbanne Cedex
> France
> Tel: +33(0)4 72 43 61 74 - Fax: +33(0)4 72 43 87 13
> Lecturer at:
> Institut National des Sciences Appliquées de Lyon
> 20 Avenue Albert Einstein
> 69621 Villeurbanne Cedex
> France
> antoine.zimmermann@insa-lyon.fr
> http://zimmer.aprilfoolsreview.com/
>
>
Received on Friday, 8 April 2011 14:49:22 UTC