
Re: RDF* vs RDF vs named graphs

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Thu, 3 Dec 2020 10:50:14 +0100
To: Olaf Hartig <olaf.hartig@liu.se>, Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
Cc: public-rdf-star@w3.org, thomas lörtsch <tl@rat.io>
Message-ID: <87d7fa82-6a2e-89cf-5422-90a80ff1d35e@emse.fr>

Olaf, Pierre-Antoine,

I must say I very much understand Thomas's confusion, as I am myself 
now more confused than ever about RDF*.

Sometimes, it is said that RDF*'s primary goal is to connect the 
Property Graph world with the RDF world; sometimes, it is said that the 
goal of RDF* is to do "reification made right"; now RDF* is said to 
allow people to talk about triples themselves. While some of these goals 
may overlap, they are not the same goal, and they may clash in many cases.

In particular, if the goal is to allow people to talk about triples, 
rather than some other resources assumed to stand for the triple in some 
sense, then the interpretation of << :s :p :o >> should be the triple 
(:s, :p, :o). Currently, the formal semantics just says that << :s :p 
:o >> is a kind of identifier for a resource of any kind. Two triples:

<< :s1 :p1 :o1 >> and << :s2 :p2 :o2 >>

can identify the same resource, and can identify people, flowers, 
buildings, classes, etc.; and embedded triples with blank nodes don't 
denote anything; they are just some kind of template for ground 
triples, which in turn identify resources.

If, however, embedded triples are just ways to have the actual triple in 
the syntax (as opposed to a structure made of multiple triples that 
somehow is assumed to represent it), then ok, but then it does not mean 
that people are talking about triples when they use RDF* embedded 
triples. They may be using them for that purpose, but nothing prevents 
someone else from interpreting them differently.

The property graph use case seems to be pointing to an interpretation of 
embedded triples that is scoped to the graph in which they appear. E.g., 
one may use a property graph arc:

<s> <p>[creator: me] <o> .

in one graph database, and someone else:

<s> <p>[creator: you] <o> .

in some other database. In any case, the triple <s> <p> <o> is always 
the triple <s> <p> <o> (so you can use the same triple in the abstract 
syntax), but the annotation in the first case is specific to the 
occurrence of <s> <p> <o> in the first database, while the annotation of 
the second is specific to the occurrence in the second database. This is 
pretty much in line with Thomas's comment and also in line with many 
examples given in documents that present RDF* (as shown by Peter in his 
latest email).
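
To make the distinction concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not part of any RDF* specification or implementation: the names Triple, by_triple, by_occurrence, "db1" and "db2" are all hypothetical. It models a triple as a plain mathematical tuple, so two triples with the same subject, predicate and object are one and the same, and then contrasts annotations attached to the triple itself with annotations attached to its occurrence in a given database.

```python
from typing import NamedTuple, Union


class Triple(NamedTuple):
    """A triple in the mathematical sense: identity is its three values."""
    s: Union[str, "Triple"]
    p: str
    o: Union[str, "Triple"]


# The same triple written down twice, once per (hypothetical) database.
t1 = Triple("<s>", "<p>", "<o>")
t2 = Triple("<s>", "<p>", "<o>")
same_triple = (t1 == t2)  # structural equality: one and the same triple

# Option A: annotations attached to the triple itself.
# After merging both databases, the two creators can no longer be
# traced back to the database in which each annotation was made.
by_triple = {}
by_triple.setdefault(t1, {}).setdefault("creator", set()).add("me")
by_triple.setdefault(t2, {}).setdefault("creator", set()).add("you")

# Option B: annotations attached to an occurrence, here modelled as a
# (database, triple) pair. Each annotation stays scoped to its database.
by_occurrence = {
    ("db1", t1): {"creator": "me"},
    ("db2", t2): {"creator": "you"},
}

# An embedded triple can itself be the subject of another triple.
nested = Triple(t1, "creator", "me")

if __name__ == "__main__":
    print(same_triple)
    print(by_triple[t1]["creator"])
    print(by_occurrence[("db1", t1)], by_occurrence[("db2", t2)])
```

Option A corresponds to reading the embedded triple as the triple itself; Option B corresponds to the property graph reading, where the annotation belongs to one occurrence of the arc.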


On 03/12/2020 at 00:52, Olaf Hartig wrote:
> Hi Thomas, Pierre-Antoine,
> I thought I had mentioned that before, but maybe not, or maybe not clearly enough. So, here is my notion of what a triple is.
> My understanding of a "triple" really is a triple, in the mathematical sense. That is, what distinguishes a triple from other triples is the unique combination of its three values (which, when talking about RDF triples, we may call the triple's subject, predicate, and object). In other words, if you write down a triple on a sheet of paper and I do so too, and we both wrote the same subject as well as the same predicate, and also the same object, then we have written down one and the same triple.
> So, Thomas, I believe that this is what you call triple type, but I may have mixed up your terminology here (following these long threads is quite a challenge for me these days, sorry!)
> Based on this notion of the word triple, the core idea of the RDF* approach has always been (a) to enable users to talk about such a triple and (b) to use the triple itself when doing so (instead of using any other thing that would be meant to denote the triple).
> Best,
> Olaf
> -----Original Message-----
> From: "thomas lörtsch" <tl@rat.io>
> To: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
> Cc: public-rdf-star@w3.org
> Sent: Wed, 02 Dec 2020 23:07
> Subject: Re: RDF* vs RDF vs named graphs
>> On 2. Dec 2020, at 22:02, Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu> wrote:
>> On 02/12/2020 01:33, thomas lörtsch wrote:
>>>> Features that have been stable from the very beginning (e.g. "abstract" triples rather than triple occurrences)
>>> How do you support this claim?
>> from Olaf Hartig and Bryan Thompson: Foundations of an Alternative Approach to Reification in RDF. In CoRR abs/1406.3399, Jun. 2014 (https://arxiv.org/pdf/1406.3399.pdf)
>> Definition 1:
>>    Assume pairwise disjoint sets I (all IRIs), B (blank nodes), and L (literals).
>>    Let T* be an (infinite) set of tuples that is defined recursively as follows:
>>      1.T* includes all RDF triples, i.e.,T* ⊇ (I ∪ B)× I ×(I ∪ B ∪ L); and
>>      2. if t ∈ T* and t' ∈ T*, then (t, p, o) ∈ T*, (s, p, t) ∈ T* and (t, p, t') ∈ T*
>>          for all s ∈ (I ∪ B), p ∈ I, and o ∈(I ∪ B ∪ L).
>>    Any tuple (s, p, o)∈ T* is an RDF* triple. A set of RDF* triples is called an RDF* graph.
>> According to this definition, there is no way to distinguish two triples with the same subject, predicate and object... They are one and the same. So they can not represent triple occurrences, they can only represent "abstract" triples.
> Maybe there are two definitions of "abstract" triples going around: up to now I understood the term as referring to a triple on the syntactic level, not interpreted, not part of the world the interpretation of the syntax describes. I think you call this the intension of a triple in your proposed semantics. The definition above as you explain it (and honestly, I’m just taking your word for it) describes the extension of a triple type: all occurrences of a triple. If that is your "abstract" triple, then what do you call what I called "abstract"? Ah, right, the "triple (type)". Well, I’m not convinced that your use of the term "abstract" reflects common usage, but let’s for the moment just assume it does.
>> That being said, I grant you that the example in that same paper, is poorly chosen:
>>      :bob foaf:name "Bob" .
>>      <<:bob foaf:age 23>>
>>          dct:creator <http://example.com/crawlers#c1>;
>>          dct:source <http://example.net/listing.html>.
>> This modelling is brittle, because it breaks as soon as you have two occurrences of the same triple. That's unfortunate.
> It reflects common expectations and usage of reification. Therefore it is not unfortunate, but maybe it is misleading. Or the definition doesn’t get it right. Or the definition doesn’t want to go the extra mile to properly describe an occurrence (the graph issue that Olaf doesn’t want to tackle to keep things simple) but the example wants to suggest that it did all the same. Maybe this is just sloppy, maybe it is false advertising, in any case: it is wrong.
>> From this, you may conclude that the example rules, and that the definition is broken. Or, that the definition rules, and that the example is broken (or at least, brittle). Traditionally (in scientific literature or standards), definitions win over examples...
> On the web, practice wins, and if your standard claims to cover practice but doesn’t, then the standard is broken and loses.
> But you, as the designer of a formal semantics, have a responsibility to capture what people mean, not the legalese of a definition in a 6-year-old paper. And if the definition doesn’t match the example, you have to raise an issue and not just go with the fine print in the definition.
>> Furthermore, it is not too hard to fix the example to match the definition (add an intermediate node to represent the occurrence).
> Which is exactly what nobody wants to do. You can model everything with extra triples. Adding extra triples to get it right is not the value proposition of RDF*. Fail.
>> The opposite would be trickier -- and would break all existing implementations.
> Instead you seem to think it’s easier to break all existing usage.
> And besides: the claim w.r.t. all existing implementations is dubious, but I’ll let the practitioners in this round speak to that.
>> With that, I rest my case ;)
> There’s a German saying that on open sea and before a court we are all in god's hands. But apart from that I’d say: you most certainly lose. Because in the interpretation you give of the definition in Olaf’s paper (and again I assume you are right) the tail wags the dog. And the semantics you design may be a nice and finessed car, but it goes in the wrong direction.
> As I said before:
>>> RDF* is about annotating statements that actually have been stated. Its poster child use case, capturing provenance, only makes sense for occurrences. It positions itself as an alternative to RDF reification, singleton properties and singleton graphs - all approaches that target occurrences.
>>> Everything here screams 'occurrence'.
> There are cases where one wants to speak about all occurrences of a triple type, like
> << :world :is :flat >> :is :nonsense
> but that’s hardly a common case.
> And the special case of unasserted assertions:
>>> There is one exception: the case when in SA mode an embedded statement doesn’t reflect an actual occurrence. But this is a corner case, a niche usecase - although definitely an important one IMO.
> This referred to what I understand as 'abstract' but what you probably call "triple (type)".
> This makes 3 different cases; your semantics covers 2, but the 1 that everyone expects is not covered. This is broken by design and I don’t care if Olaf’s paper or your semantics is to blame. This just sucks and all the apologetic references to the many implementations suck as well because I bet that implementations implement what they think is expected by their users and not what the definition says, which probably not only I find hard to comprehend. Or they don’t care so much as long as it works on the surface and that’s just how it is, and it is pretty excusable if the example encourages them.
> Thomas

Antoine Zimmermann
Institut Henri Fayol
École des Mines de Saint-Étienne
158 cours Fauriel
CS 62362
42023 Saint-Étienne Cedex 2
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
Member of team Connected Intelligence, Laboratoire Hubert Curien
Received on Thursday, 3 December 2020 09:50:31 UTC
