Re: RDF* vs RDF vs named graphs from Pierre-Antoine Champin on 2020-12-18 (public-rdf-star@w3.org from December 2020)

From: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
Date: Fri, 18 Dec 2020 15:45:07 +0100
To: thomas lörtsch <tl@rat.io>
Cc: public-rdf-star@w3.org
Message-ID: <2fc0ff2c-f370-40f7-175f-b6aa75819f72@ercim.eu>
Thomas,

I apologize if I sounded disrespectful and patronizing. That was not at 
all my intention.

As for the position I was trying to defend, I don't consider it as "my 
semantics". I sincerely believe that this position is shared by a number 
of people on the list -- as I am sure you do about the position you are 
defending.

   best

On 18/12/2020 12:16, thomas lörtsch wrote:
> Pierre-Antoine,
>
>
> you’re completely missing the point.
>
>> On 17. Dec 2020, at 14:54, Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu> wrote:
>>
>> Peter,
>>
>> in issue #64 (https://github.com/w3c/rdf-star/issues/64) you wrote:
>  From this sentence
>
>>> central examples have fatal flaws if embedded triples are unique
> you only take the first half and then go on to show how the "regrettable", "unfortunate" examples can be saved to fit your semantics. You introduce new blank nodes and indirections as if the authors didn’t know what they are doing and had to be taught basic RDF modelling skills. You even replace well-established properties by something you invented ad-hoc with a considerably different meaning. All the examples rather obviously understand embedded triples as occurrences but you consistently treat them as wrongly modelling unique triples.
>
> Notwithstanding the lack of respect and the patronizing attitude, what you actually show is that the semantics you propose don’t cover those usecases. Which is the point that has repeatedly been made. Calling that a possible misuse of RDF* is, well, an interesting perspective. My fear is that the world will not bend to your semantics and that the ensuing muddle will not profit anybody.
>
> This is not to say that there is no case for unique triples: there is and not long ago I was overly focused on the annotation usecase that understands them as occurrences. But both usecases are valid, widely used and advertised as being solved by RDF*. Property graphs can do both without saying so as they have no semantics, but in RDF they have to be catered for.
>
> And one last thing: if you insist on only covering the unique triples reading then you should drop everything reification related. Drop SA mode and drop the new node type because what you are really doing is defining syntactic sugar for n-ary relations, and those are covered by rdf:value.
>
> :a :b :c {| :d :e |}
>
> is then syntactic sugar for
>
> :a :b [
>    rdf:value :c ;
>    :d :e
> ]
>
> That has a clear unique triple semantics, true to the flat world ideal of RDF. It would spare us a whole lot of trouble and avoid any confusion. It would not cover the annotation usecase and anything that requires a bit more complexity than the simplistic base of RDF but since you’re not planning to actually support that anyways, why not at least be honest about it!
>
> Of course it would be a wasted opportunity but since you and Olaf seem to be so heavily inclined…
>
>
> Thomas
>
>
>> As you previously made a list of such flawed examples (thanks for that), I'll try to explain why I think that these examples are, though imperfect, not fatally flawed.
>>
>> On 03/12/2020 00:47, Peter F. Patel-Schneider wrote:
>>> I certainly agree with Thomas that examples used throughout the RDF* documents
>>> and discussions are ill-supported by the various formal definitions underlying
>>> RDF*.
>>>
>>> We see
>>>
>>> :bob foaf:name "Bob" .
>>> <<:bob foaf:age 23>>
>>>    dct:creator <http://example.com/crawlers#c1> ;
>>>    dct:source <http://example.net/listing.html> .
>>>
>>> in
>>> http://ceur-ws.org/Vol-1912/paper12.pdf
>> Assuming that the <<...>> notation represents unique triples, this example conveys the following information: 1) bob is 23, 2) the fact that bob is 23 was asserted by #c1, and 3) the fact that bob is 23 was found in listing.html. It is tempting to infer that it was #c1 who found this information in that document, but that's not what the example is saying. This can be regretted, but that does not make the example useless or wrong...
>>
>> It is not fatally flawed, because IF someone wanted to convey richer information such that "#c1 found this triple in that document", this would be possible by introducing an additional node, representing the occurrence of the triple in the document.
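>>
>> A minimal sketch of that richer modelling, in the same Turtle* notation as the example above and with prefix declarations omitted as there. The :occurrence predicate and the blank node are purely illustrative assumptions, not terms from any published vocabulary; only dct:creator and dct:source come from the original example:
>>
>> ```turtle
>> # Hypothetical: :occurrence links the (unique) triple to one occurrence of it.
>> <<:bob foaf:age 23>> :occurrence [
>>     dct:creator <http://example.com/crawlers#c1> ;
>>     dct:source  <http://example.net/listing.html>
>> ] .
>> ```
>>
>> Here the crawler and the document are both attached to the same occurrence node, so the link between them is stated rather than left to inference.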
>>
>> That being said, I agree that the example is imperfect because:
>>
>> * it can easily lead to the over-interpretation I mentioned above, and
>>
>> * the choice of the dct:creator predicate is arguable (nobody "creates" a triple, it is an abstract mathematical construct that "exists", regardless of who asserts it or not).
>>
>>
>>
>>> <<:painting :height 32.1>>
>>>    :unit :cm;
>>>    :measurementTechnique :laserScanning;
>>>    :measuredOn "2020-02-11"^^xsd:date.
>>>
>> Granted, this example is very misleading (or misled). ":measurementTechnique" can hardly be argued to be a property of the triple (it is rather a property of the measurement that led to asserting this triple). This should have looked more like:
>>
>>    <<:painting :height 32.1>>
>>      :unit :cm;
>>      :measurement [
>>          :technique :laserScanning;
>>          :when "2020-02-11"^^xsd:date
>>      ].
>>
>> This revised example shows, I believe, that <<...>> denoting unique triples is not an obstacle to solving this use case.
>>
>>> <<:man :hasSpouse :woman>>
>>>    :source :TheNationalEnquirer;
>>>    :webpage <http://nationalenquirer.com/news/2020-02-12> ;
>>>    :retrieved "2020-02-13"^^xsd:dateTime.
>>>
>>> in
>>> https://graphdb.ontotext.com/documentation/9.2/free/devhub/rdf-sparql-star.html
>> Again, this example is very misleading, because clearly the intention is to convey the information that "this triple was retrieved from the given page on a given date" (and not "... from the given page, but also on a given date"). If we were to represent two distinct retrievals, we would lose the link between source and date.
>>
>> However, I still believe it is possible to convey this information using *unique triples*, either with an intermediate node representing the retrieval:
>>
>>    <<:man :hasSpouse :woman>> :occurence [
>>      :source :TheNationalEnquirer;
>>      :webpage <http://nationalenquirer.com/news/2020-02-12> ;
>>      :retrieved "2020-02-13"^^xsd:dateTime
>>    ].
>>
>> or possibly with deeply nested triples
>>
>>    <<:man :hasSpouse :woman>>
>>      :retrievedFrom <http://nationalenquirer.com/news/2020-02-12>
>>      {| :on "2020-02-13"^^xsd:dateTime |}.
>>
>>    <http://nationalenquirer.com/news/2020-02-12>
>>      dct:creator :TheNationalEnquirer.
>>
>>
>>
>>> <<:Bess_Schrader :employedBy :Enterprise_Knowledge . >> :dateAdded "2020-05-22" .
>>> <<:Bess_Schrader :employedBy :Enterprise_Knowledge . >> :addedBy :user_bscrader .
>>>
>>> in
>>> https://enterprise-knowledge.com/rdf-what-is-it-and-why-do-i-need-it/
>> This example is interesting because in the code, the author used two different occurrences of the <<...>> notation, but in the accompanying figure, a single :employedBy arc is annotated by the two properties :dateAdded and :addedBy.
>>
>> I think it demonstrates that the author (and, I would venture to extrapolate, many people starting with RDF*) did not really think about the type/token distinction, or the subtle problems that may arise when they are mixed up. I think that is more the problem, rather than "everyone assumes that embedded triples represent occurrences".
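>>
>> To make this concrete, here is a sketch of how the two statements quoted above read if <<...>> denotes a unique triple: both have the same subject, so they are equivalent to a single annotated triple, which is what the figure shows. The stray " . " inside <<...>> is dropped on the assumption that it is a typo in the original:
>>
>> ```turtle
>> # Under the unique-triple reading, both annotations attach to one and the same triple.
>> <<:Bess_Schrader :employedBy :Enterprise_Knowledge>>
>>     :dateAdded "2020-05-22" ;
>>     :addedBy   :user_bscrader .
>> ```
>>
>> Under an occurrence reading, by contrast, the two <<...>> expressions could denote two distinct occurrences, and nothing would tie the date to the user.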
>>
>>> <<?c a rdfs:Class>> dct:source ?src ;
>>>      prov:wasDerivedFrom <<?c a owl:Class>> .
>>>
>> For me, this example is really similar to the first one: it states "T1 appears in src, and T1 can be derived from T2". Both assertions are true independently of each other, and can be considered true of the triples themselves, rather than occurrences thereof.
>>
>> As in the very first example, the chosen predicate (here, prov:wasDerivedFrom) is not the best choice (PROV is about artifacts, not abstract things like triples), and I propose to replace it with a more neutral predicate (:canBeDerivedFrom).
>>
>>> :loisLane :believes << :superman :can :fly >>.
>> I don't see the problem here. Of course, one could argue that [my belief that superman can fly] and [lois' belief that superman can fly] are different things, but I could just as well argue that Lois and I believe *the same thing*. Maybe if the predicate were named :believesThat, it would be clearer that this example uses the second option?
>>> in
>>> https://w3c.github.io/rdf-star/rdf-star-cg-spec.html
>>>
>>>
>>>
>>>
>>> What should be concluded from this?  Just about the most charitable conclusion
>>> is that RDF* is unsuitable for its claimed use.
>>>
>> I don't think that is very charitable ;-), nor really fair. At least that's what I tried to show above.
>>
>> What I conclude, though, is that RDF* is easily misused, and that the CG report should include material to help people avoid these pitfalls. I'll make a PR to that effect.
>>
>>    best
>>
>>> So what is RDF* good for?  I am concerned about this.
>>>
>>>
>>> peter
>>>
>>>
>>>
>>>
>>>
Received on Friday, 18 December 2020 14:45:12 UTC