Re: [Graphs] Proposal for Named Graph Semantics from Alex Hall on 2011-04-11 (public-rdf-wg@w3.org from April 2011)

From: Alex Hall <alexhall@revelytix.com>
Date: Mon, 11 Apr 2011 09:46:19 -0400
To: nathan@webr3.org
Cc: "Eric Prud'hommeaux" <eric@w3.org>, RDF WG <public-rdf-wg@w3.org>, Richard Cyganiak <richard@cyganiak.de>
Message-ID: <BANLkTi=4WCRiZ-zY29WEk7Cg8iy1Y+svPA@mail.gmail.com>
On Fri, Apr 8, 2011 at 6:07 PM, Nathan <nathan@webr3.org> wrote:

>
>> <snip/>
>> I suppose this is particularly relevant in the Linked Data community,
>> where
>> I might dereference the IRI  <I> = <http://example.com/people#Alice> and
>> find a graph G = { <http://example.com/people#Alice> foaf:givenName
>> "Alice";
>> foaf:knows :Bob; ... }  If we want to turn around and associate <I> with G
>> in a quad store or RDF dataset, we should be able to just do that without
>> implying that the resource identified by <I> is that graph, or having
>> to
>> allocate some new IRI to represent the notion of "the graph that describes
>> Alice".
>>
>
> Note, you can't dereference an IRI which contains a fragment, only
> absolute-IRIs are dereferenceable. (As in, you'd have to chop off the fragment
> above and dereference <http://example.com/people>).
>
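A rough sketch of the point above, assuming only the Python standard library: the fragment is stripped on the client side, so the IRI that actually gets requested is the one without "#Alice". The helper name document_iri is made up for illustration.

```python
from urllib.parse import urldefrag


def document_iri(iri: str) -> str:
    """Return the IRI a client would actually request: the input minus any fragment.

    document_iri is a hypothetical helper name, not part of any RDF library.
    """
    base, _fragment = urldefrag(iri)
    return base


alice = "http://example.com/people#Alice"
print(document_iri(alice))  # http://example.com/people
```

An IRI with no fragment comes back unchanged, so the helper can be applied to any IRI before it is dereferenced.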
>
>> That does raise a bunch of questions for me surrounding provenance (maybe
>> they've already been answered, I'm not familiar with research in that
>> area),
>> like how to differentiate the description of <I> as a web document (was
>> retrieved on this date, etc.) from the description of <I> as an abstract
>> resource (the person Alice, in this case)?  Is it worthwhile inventing a
>> vocabulary to define the notion of "the graph that describes Alice"?
>>
>
> RDF pretty much already has that, see
> http://www.w3.org/TR/rdf-concepts/#section-fragID
>
> [[
> we assume that the URI part (i.e. excluding fragment identifier) identifies
> a resource, which is presumed to have an RDF representation. So when
> eg:someurl#frag is used in an RDF document, eg:someurl is taken to designate
> some RDF document (even when no such document can be retrieved).
> ]]
>
> document, meaning "Information Resource" as opposed to a chunk of RDF/XML
> which you GET at a particular time.
>

Thanks for pointing this out.  This is my inexperience wrt Linked Data
showing through.  Should have caught that one...

-Alex



>
> Hence why we have the 303 status code guidance and httpRange-14 resolution
> to provide some way of allowing non-fragment URIs to be dereferenceable.
>
> That's a big issue though, which I'm pretty sure not many will appreciate
> me mentioning around here :p
>
> Back to the terminology though, this is where the g-snap, g-text, g-box
> terminology comes in handy, rather than saying "graph". Terms like Graph and
> Document have taken on multiple meanings for different people over the
> years, and are very much conflated / tend to lead people to infer the wrong
> things when used innocently in conversation.
>
> Best,
>
> Nathan
>
>
>> These
>> tend to lead down a path that I would describe as graph reification, and
>> not
>> having done the appropriate research there I think I'll refrain from
>> further
>> comment...
>>
>> -Alex
>>
>>
>>
>>> Best,
>>>
>>> Nathan
>>>
>>>>> Define G(I) as a function that returns the RDF graph identified by I.  In
>>>>> our parlance, G(I) is a g-snap, invariant over time.  Due to the nature
>>>>> of
>>>>> RDF, it is difficult to express the relationship between I and G(I)
>>>>> natively
>>>>> in RDF.  Graph literals, which I understand to be the encoding of some
>>>>> set
>>>>> of triples as a single node in a graph, are one possible approach but
>>>>> this
>>>>> proposal does not attempt to define graph literals.  Furthermore, in
>>>>> the
>>>>> open world it's not possible to have complete knowledge of all the
>>>>> triples
>>>>> in G(I) for any given I.
>>>>>
>>>> In order to make explicit the difference between the mapping and the
>>>> graph name convention below, I'm using "GraphAt" for "G":
>>>>
>>>>  1a. GraphAt(I:IRI):RDFGraph = G ∣ RDFParse(HTTPGet(I)) == G
>>>>  1b. GraphAt(I:IRI):Boolean = G ∣ m(I) == G, m a local map
>>>>  1c. GraphAt(I:IRI):Boolean = G ∣ M(I) == G, M an RDF-wide map
>>>>
>>>> and define 0 in terms of 1:
>>>>
>>>>  0.  Graph(I:IRI):Boolean = ∃ G:RDFGraph ∣ GraphAt(I) == G
>>>>
>>>>
>>>>> 2. Graph Assertion
>>>>> Let I be an IRI and G be an RDF graph.  Define GA(I, G) as a binary
>>>>> predicate such that GA(I, G) implies (a) Graph(I) and (b) G(I) entails
>>>>> G.
>>>>>
>>>>  3. GA(I:IRI, G:RDFGraph):Boolean = ∃ G:RDFGraph ∣ Graph(I) && GraphAt(I) ⊢ G
>>>>
>>>> or
>>>>
>>>>  3. GA(I:IRI, G:RDFGraph):Boolean = ∃ G:RDFGraph ∣ GraphAt(I) ⊢ G
>>>>
>>>>
>>>>> The notion of graph assertion attempts to capture the semantics of what
>>>>> happens when some set of triples is associated with a graph IRI in a
>>>>> multi-graph serialization such as TriG.  So the TriG fragment:
>>>>>
>>>>> :G1 { :a :b :c } .
>>>>>
>>>>> would be understood to construct a graph G with a single triple
>>>>> :a :b :c and then make the assertion GA(:G1, G).
>>>>>
>>>>> The use of "entails" as opposed to "equals" here is what gives us our
>>>>> flexibility.  Applications that want to treat named graphs as g-snaps,
>>>>> completely described by the triples associated with the graph IRI, can
>>>>> do
>>>>> so
>>>>> by extending (b) to say G(I) equals G instead of entails.  Because
>>>>> every
>>>>> graph entails itself, this extension is supported by these semantics,
>>>>> but
>>>>> this would not be required behavior.  Indeed, this could lead to
>>>>> trouble
>>>>> in
>>>>> the open world where you can have GA(I, G1) and GA(I, G2) with G1 !=
>>>>> G2.
>>>>>
>>>> This might be awkward for test cases, because
>>>>  <X> log:implies { <pi> <numericValue> 3.14, 3.0 . } . # the Bolslough simplification
>>>> is a valid parse of
>>>>  { <pi> <numericValue> 3.14 . }
>>>> I wonder if there's some way to move this beyond parsing (no suggestions
>>>> yet).
>>>>
>>>>
>>>>> Applications that want to treat named graphs as g-boxes would do so by
>>>>> essentially maintaining a (time-sensitive) mapping of IRI I to graph G.
>>>>>  This aligns pretty closely with my understanding of the notion of
>>>>> graph
>>>>> store from SPARQL 1.1 Update.  Poking the g-box to obtain content
>>>>> (either
>>>>> a
>>>>> g-text serialization or query results) amounts to asserting GA(I, G)
>>>>> for
>>>>> the
>>>>> current value of G at some point in time.  Given a new graph assertion
>>>>> for
>>>>> an IRI that is already mapped in the store, an implementation could
>>>>> replace
>>>>> the currently mapped graph with the new one (effectively discarding all
>>>>> prior graph assertions) or merge them at its discretion; either
>>>>> approach
>>>>> would be supported by these semantics.
>>>>>
>>>> I guess this argues for
>>>>  1b. GraphAt(I:IRI):Boolean = G ∣ m(I) == G, m a local map
>>>> where we don't try to tell one person's GA to match another's.
>>>>
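To make the g-box reading concrete, here is a minimal sketch of a local map m in the sense of 1b, with both of the behaviours mentioned above (replace or merge on a new graph assertion). Everything in it is an assumption for illustration: the class and method names are invented, triples are plain string tuples, and the graphs are assumed to be ground (no blank nodes), so a merge is just set union.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


class LocalGraphStore:
    """Hypothetical g-box store: a mutable, local mapping m from IRI to graph."""

    def __init__(self) -> None:
        self._m: Dict[str, Graph] = {}

    def assert_graph(self, iri: str, triples: Iterable[Triple], merge: bool = False) -> None:
        """Record a graph assertion for iri.

        merge=False discards whatever was mapped before (replace);
        merge=True keeps the old triples and adds the new ones.
        """
        new_graph = frozenset(triples)
        if merge and iri in self._m:
            new_graph = self._m[iri] | new_graph
        self._m[iri] = new_graph

    def graph_at(self, iri: str) -> Optional[Graph]:
        """Return the graph currently mapped to iri, or None if iri is unmapped."""
        return self._m.get(iri)


store = LocalGraphStore()
store.assert_graph(":G1", [(":a", ":b", ":c")])
store.assert_graph(":G1", [(":a", ":b", ":d")], merge=True)
print(sorted(store.graph_at(":G1")))
```

Each store holds its own map, so nothing here forces one party's mapping for an IRI to agree with another's.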
>>>>
>>>>> Any vocabulary for specifying graph literals and attaching them to a
>>>>> graph
>>>>> IRI in RDF would be defined as making a graph assertion, not setting
>>>>> the
>>>>> value of the identified graph.
>>>>>
>>>>> 3. RDF Datasets
>>>>> I haven't thought this part through entirely, but I think these
>>>>> semantics
>>>>> could be aligned with the existing notion of RDF datasets from SPARQL
>>>>> (and
>>>>> as proposed on the wiki) by simply mapping the (IRI, graph) tuples in
>>>>> the
>>>>> dataset to the appropriate graph assertions.
>>>>>
>>>>> 4. Graph Equality
>>>>> Because it is not the case that (G1 entails G and G2 entails G) implies
>>>>> G1 =
>>>>> G2, it is also not the case that (GA(I1, G) and GA(I2, G)) implies I1
>>>>> and
>>>>> I2
>>>>> are the same graph.  Such a conclusion could be reached if you extend
>>>>> the
>>>>> definition of GA to mean equals instead of entails as discussed before,
>>>>> but
>>>>> again that is an extension and not part of the proposed semantics.
>>>>>
>>>>> 5. Empty Graphs
>>>>> Because every graph trivially entails the empty graph E, the assertion
>>>>> GA(I,
>>>>> E) is trivially true for every graph IRI I.  Making that assertion
>>>>> doesn't
>>>>> do anything beyond identifying the resource denoted by I as a graph.
>>>>>
>>>>> 6. Graph Merges
>>>>> It follows from the definition of GA (and the definition of entails)
>>>>> that
>>>>> (GA(I, G1) and GA(I, G2)) implies GA(I, Merge(G1, G2)).  I think this
>>>>> gives
>>>>> us a pretty straightforward approach to merging of RDF datasets if this
>>>>> is
>>>>> required of the spec.
>>>>>
>>>>> Hope you find this useful...  or at least that this stirs up some
>>>>> interesting debate.
>>>>>
>>>> thanks for moving this forward.
>>>>
>>>>
>>>>> Regards,
>>>>> Alex
>>>>>
>>>>>
>>>>
>>
>
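For anyone who wants to poke at the quoted proposal, the claims in sections 2, 4, 5 and 6 can be checked with a small sketch. It rests on a simplifying assumption: all graphs are ground (no blank nodes), in which case simple entailment reduces to the subset relation and Merge reduces to set union. The function names entails, ga and merge are invented here, and the first argument of ga stands in for G(I).

```python
from typing import FrozenSet, Optional, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def entails(g1: Graph, g2: Graph) -> bool:
    """Simple entailment between ground graphs: g1 entails g2 iff g2 is a subset of g1."""
    return g2 <= g1


def ga(graph_of_i: Optional[Graph], g: Graph) -> bool:
    """GA(I, G): I identifies a graph (graph_of_i is not None) and that graph entails G."""
    return graph_of_i is not None and entails(graph_of_i, g)


def merge(g1: Graph, g2: Graph) -> Graph:
    """Merge of ground graphs is plain union (blank nodes would need renaming apart)."""
    return g1 | g2


g_i: Graph = frozenset({(":a", ":b", ":c"), (":a", ":b", ":d")})  # stands in for G(I)
g1: Graph = frozenset({(":a", ":b", ":c")})
g2: Graph = frozenset({(":a", ":b", ":d")})
empty: Graph = frozenset()

print(ga(g_i, g1), ga(g_i, g2))  # both assertions hold although g1 != g2
print(ga(g_i, merge(g1, g2)))    # section 6: the merge is then asserted as well
print(ga(g_i, empty))            # section 5: the empty graph is always asserted
print(ga(g1, g1) and ga(g_i, g1) and g1 != g_i)  # section 4: same assertion, different graphs
```

With blank nodes in play the subset test is no longer enough, since entailment then has to consider instances of the entailed graph.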
Received on Monday, 11 April 2011 13:46:47 UTC