Re: RDF* vs RDF vs named graphs from Pierre-Antoine Champin on 2020-12-17 (public-rdf-star@w3.org from December 2020)

From: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
Date: Thu, 17 Dec 2020 14:54:41 +0100
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, public-rdf-star@w3.org
Message-ID: <e44207b4-1a94-b75a-0126-2300d000450e@ercim.eu>
Peter,

in issue #64 (https://github.com/w3c/rdf-star/issues/64) you wrote:

 > central examples have fatal flaws if embedded triples are unique

As you previously made a list of such flawed examples (thanks for that), 
I'll try to explain why I think that these examples are, though 
imperfect, not /fatally/ flawed.

On 03/12/2020 00:47, Peter F. Patel-Schneider wrote:
> I certainly agree with Thomas that examples used throughout the RDF* documents
> and discussions are ill-supported by the various formal definitions underlying
> RDF*.
>
> We see
>
> :bob foaf:name "Bob" .
> <<:bob foaf:age 23>>
>    dct:creator <http://example.com/crawlers#c1> ;
>    dct:source <http://example.net/listing.html> .
>
> in http://ceur-ws.org/Vol-1912/paper12.pdf

Assuming that the <<...>> notation represents unique triples, this 
example conveys the following information: 1) bob is 23, 2) the fact 
that bob is 23 was asserted by #c1, and 3) the fact that bob is 23 was 
found in listing.html . It is tempting to infer that it was #c1 who 
found this information in that document, but that's not what the example 
is saying. This can be regretted, but that does not make the example 
useless or wrong...

It is not fatally flawed, because IF someone wanted to convey richer 
information such that "#c1 found this triple in that document", this 
would be possible by introducing an additional node, representing the 
occurrence of the triple in the document.
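
A minimal sketch of what this could look like (the predicates 
:occurrence, :foundIn and :foundBy are made up for the sake of 
illustration; they are not taken from the paper):

   :bob foaf:name "Bob" .
   <<:bob foaf:age 23>> :occurrence [
       :foundIn <http://example.net/listing.html> ;
       :foundBy <http://example.com/crawlers#c1>
   ] .

Here the blank node stands for one particular occurrence of the triple, 
so the document and the crawler are tied together, while 
<<:bob foaf:age 23>> itself remains the unique (abstract) triple.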

That being said, I agree that the example is imperfect because:

* it can easily lead to the over-interpretation I mentioned above, and

* the choice of the dct:creator predicate is arguable (nobody "creates" 
a triple, it is an abstract mathematical construct that "exists", 
regardless of who asserts it or not).


> <<:painting :height 32.1>>
>    :unit :cm;
>    :measurementTechnique :laserScanning;
>    :measuredOn "2020-02-11"^^xsd:date.

Granted, this example is very misleading (or misled). 
":measurementTechnique" can hardly be argued to be a property of the 
triple (it is rather a property of the measurement that led to asserting 
this triple). This should have looked more like:

   <<:painting :height 32.1>>
     :unit :cm;
     :measurement [
         :technique :laserScanning;
         :when "2020-02-11"^^xsd:date
     ].

This revised example shows, I believe, that <<...>> denoting unique 
triples is not an obstacle to solving this use case.

> <<:man :hasSpouse :woman>>
>    :source :TheNationalEnquirer;
>    :webpage <http://nationalenquirer.com/news/2020-02-12>;
>    :retrieved "2020-02-13"^^xsd:dateTime.
>
> in https://graphdb.ontotext.com/documentation/9.2/free/devhub/rdf-sparql-star.html

Again, this example is very misleading, because clearly the intention is 
to convey the information that "this triple was retrieved from the given 
page on a given date" (and not "... from the given page, but also on a 
given date"). If we were to represent two distinct retrievals, we would 
lose the link between source and date.

However, I still believe it is possible to convey this information using 
*unique triples*, either with an intermediate node representing the 
retrieval:

   <<:man :hasSpouse :woman>> :occurrence [
     :source :TheNationalEnquirer;
     :webpage <http://nationalenquirer.com/news/2020-02-12>;
     :retrieved "2020-02-13"^^xsd:dateTime
   ].

or possibly with deeply nested triples

   <<:man :hasSpouse :woman>>
     :retrievedFrom
     <http://nationalenquirer.com/news/2020-02-12>
     {| :on "2020-02-13"^^xsd:dateTime |}.
   <http://nationalenquirer.com/news/2020-02-12> dct:creator :TheNationalEnquirer .


> <<:Bess_Schrader :employedBy :Enterprise_Knowledge . >> :dateAdded "2020-05-22" .
> <<:Bess_Schrader :employedBy :Enterprise_Knowledge . >> :addedBy :user_bscrader .
>
> in https://enterprise-knowledge.com/rdf-what-is-it-and-why-do-i-need-it/

This example is interesting because in the code, the author used two 
different occurrences of the <<...>> notation, but in the accompanying 
figure, a single :employedBy arc is annotated by the two properties 
:dateAdded and :addedBy.

I think it demonstrates that the author (and, I would venture to 
extrapolate, many people starting with RDF*) did not really think about 
the type/token distinction, or the subtle problems that may arise when 
they are mixed up. I think the problem is more that one, rather than 
"everyone assumes that embedded triples represent occurrences".
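
To make this concrete: if <<...>> denotes the unique triple (the type), 
the two statements above say exactly the same thing as the single 
statement below, which matches the figure. This is only a sketch, 
reusing the names of the original example:

   <<:Bess_Schrader :employedBy :Enterprise_Knowledge>>
       :dateAdded "2020-05-22" ;
       :addedBy :user_bscrader .

If, on the other hand, each <<...>> denoted a distinct occurrence (a 
token), the two original statements would annotate two different things, 
and nothing would link the date to the user.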

> <<?c a rdfs:Class>> dct:source ?src ;
>      prov:wasDerivedFrom <<?c a owl:Class>> .

For me, this example is really similar to the first one: it states "T1 
appears in src, and T1 can be derived from T2". Both assertions are true 
independently of each other, and can be considered true of the triples 
themselves, rather than occurrences thereof.

As in the very first example, the chosen predicate (here, 
prov:wasDerivedFrom) is not the best choice (PROV is about artifacts, 
not abstract things like triples), and I propose to change it for a more 
neutral predicate (:canBeDerivedFrom).
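
With that change, the pattern would read something like this 
(:canBeDerivedFrom being a made-up predicate, not part of any existing 
vocabulary that I know of):

   <<?c a rdfs:Class>> dct:source ?src ;
        :canBeDerivedFrom <<?c a owl:Class>> .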

> :loisLane :believes << :superman :can :fly >>.
I don't see the problem here. Of course, one could argue that [my belief 
that superman can fly] and [lois' belief that superman can fly] are 
different things, but I could just as well argue that Lois and I believe 
*the same thing*. Maybe if the predicate was named :believesThat, it 
would be clearer that this example uses the second option?
>
> in https://w3c.github.io/rdf-star/rdf-star-cg-spec.html
>
>
>
> What should be concluded from this?  Just about the most charitable conclusion
> is that RDF* is unsuitable for its claimed use.

I don't think that is very charitable ;-), nor really fair. At least 
that's what I tried to show above.

What I conclude, though, is that RDF* is easily misused, and that the CG 
report should include material to help people avoid these pitfalls. I'll 
make a PR to that effect.

   best

>
> So what is RDF* good for?  I am concerned about this.
>
>
> peter
>
>
>
>
Received on Thursday, 17 December 2020 13:54:47 UTC