Some notes on the RDF‐star examples of profiles from Thomas Lörtsch on 2024-06-05 (public-rdf-star-wg@w3.org from June 2024)

From: Thomas Lörtsch <tl@rat.io>
Date: Wed, 5 Jun 2024 12:47:44 +0200
To: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-Id: <241B115A-C700-42EC-8B20-FAC3BC72D9B3@rat.io>
Dear all,


I checked (what I think is the most recent version of) the examples in
https://github.com/w3c/rdf-star-wg/wiki/RDF%E2%80%90star-examples-of-profiles and while I do remember that we all were pretty glad to have resolved the conflict around many-to-many reifications, the examples IMO show that the resolution doesn’t carry much weight, but instead is rather harmful.
I also describe some other issues.

But first, a general question: is it considered possible to have 
    << :e1 | ' :s :p :o ' >>
refer to a referentially opaque triple and 
    << :e1 | :s :p :o >>
(using the same reifier) to refer to a referentially transparent triple or would they need to have different reifiers?


And, another first, I also note that all examples speak about unasserted assertions only, i.e. none of them bothers to add the triple terms as actually asserted triples. The need to do this as an extra step is not only a nuisance but also easy to forget or overlook. That is something that increasingly bothers me. Unasserted assertions are a niche use case, but the way RDF-star has been designed by the CG and now the WG puts them front and center and makes standard/asserted triple terms much more tedious to use in practice. I was reminded of that when Bryan recently argued about the triple count of singleton properties. The triple count of what we currently have is roughly doubled by how support for unasserted assertions is designed - not only in storage but also in authoring (and querying IIUC). This is a problem.
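
To make the triple-count point concrete, here is a minimal Python sketch of a toy graph model (plain sets of tuples, not an RDF library). It assumes that a reification is stored as one triple linking the reifier to the triple term, via a property called rdf:reifies here, and that asserting the triple is a separate step. The function names and the :source annotation are hypothetical and only serve the illustration.

```python
# Toy model: a graph is a set of (s, p, o) tuples; a triple term is a nested tuple.
REIFIES = "rdf:reifies"  # assumed name of the reification property


def annotate(graph, reifier, triple, annotations, assert_too=False):
    """Add a reification of `triple` plus its annotations to `graph`.

    The triple itself is only added when `assert_too` is True.
    """
    graph.add((reifier, REIFIES, triple))
    for predicate, obj in annotations:
        graph.add((reifier, predicate, obj))
    if assert_too:
        graph.add(triple)
    return graph


def is_asserted(graph, triple):
    """A triple is asserted only if it is itself a member of the graph."""
    return triple in graph


statement = (":s", ":p", ":o")
unasserted = annotate(set(), ":e1", statement, [(":source", ":x")])
asserted = annotate(set(), ":e1", statement, [(":source", ":x")], assert_too=True)

print(is_asserted(unasserted, statement), len(unasserted))
print(is_asserted(asserted, statement), len(asserted))
```

In this model the reification alone never makes the statement true in the graph; the extra asserted triple has to be written (and remembered) separately.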



Example 1
=========
I see no indication why the triple terms in this example have to be opaque - at least no other than the opposition to many-to-many reifications. To the contrary: to discuss a fraud it is not necessarily essential, and probably often not helpful or even harmful, to suppress co-denotation. For example it might be important to realize that :account-123 and some :account-xyz refer to the same entity. So the attempt to prevent many-to-many reification may actually limit the usefulness of the data. 
I’m really bothered by this. It was hard to understand why Amazon was so opposed to many-to-many reifications and it is hard to understand how this solution would solve their problem. IMO it creates a much bigger one, of course from a different perspective. If referentially opaque triple terms would become the standard way to represent LPG data in RDF (or if the recommendation would even suggest such an approach) we would probably shoot ourselves in the foot big time. A conversion of LPG to RDF that loses some or even much of the integration capabilities of RDF is not a compelling value proposition.
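
The co-denotation point can be sketched in a few lines of Python. This is a toy model, not the semantics under discussion in the WG: it assumes a hypothetical same_as table that maps an identifier to a chosen representative (and that this table contains no cycles), and a hypothetical predicate :flagged-as. Only :account-123 and :account-xyz are taken from the text above.

```python
def canonical(term, same_as):
    """Follow the same_as table until a representative is reached.

    Assumes the table is acyclic.
    """
    while term in same_as:
        term = same_as[term]
    return term


def same_triple(t1, t2, same_as, opaque):
    """Decide whether two triple terms refer to the same triple.

    Opaque: compare the identifiers exactly as written.
    Transparent: compare what the identifiers denote.
    """
    if opaque:
        return t1 == t2
    resolved1 = tuple(canonical(x, same_as) for x in t1)
    resolved2 = tuple(canonical(x, same_as) for x in t2)
    return resolved1 == resolved2


# Hypothetical knowledge that both identifiers denote the same account.
same_as = {":account-xyz": ":account-123"}

t1 = (":account-123", ":flagged-as", ":fraud")
t2 = (":account-xyz", ":flagged-as", ":fraud")

print(same_triple(t1, t2, same_as, opaque=False))
print(same_triple(t1, t2, same_as, opaque=True))
```

Under the transparent reading the two reports are recognised as being about the same account; under the opaque reading they stay unrelated, even though the identity of the two accounts is known.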

Example 4
=========
I don’t see how the entailment could be problematic. Indeed both :mary and :paul stored a statement to the effect that (i.e. meaning that) :liz :married :richard. What’s not to like?
And contrary to the assumption conveyed by the example’s title I don’t think that example 3 needs, or even profits from, opacity. It rather suffers from it because in example 3 we can’t entail that :mary and :paul spoke about the same fact. How is that a feature except in cases where one bothers about the syntactic representation of referers?
We have a large collection of use cases, scrupulously examined precisely to find out if they profit from or even need referential opacity, and the result was: in the overwhelming majority they don’t need or profit from referential opacity, they’d rather suffer from it. We should stick to that result. We should accept that pacifying the many-to-many debate via referentially opaque triple terms won’t do us any good in the long term (in fact, we shouldn’t go that route at all). Referential opacity is useful if one is actually interested in the syntactic representation of referers - and there sure are some use cases for that, e.g. given in example 7. However, in every other case they are not only a nuisance, but outright harmful because they restrict the meaning of what is said in completely counter-intuitive ways. They work diametrically against the core purpose of RDF: concentrating on what is meant, not on how it is expressed. Which is the very basis of decentralized data integration.

Example 6
=========
How would one argue that adding
<< _:w3 | :BCHRmarriage :in :Fayetteville >> :starts 1975 .
is not legal? According to example 5 one can’t, and I agree with that. In that sense example 6 is slightly misleading / may lead to wrong assumptions.

Example 7
=========
This is in fact a compelling example of how referential opacity can be useful. However, as argued before elsewhere, why not avoid all the baggage and problems with referentially opaque reifications and go with RDF literals? E.g.:

<< lvn:t1 | :liz :married :richard >> a :TripleToken .
lvn:t1 :added-in lvn:rdf-registry ; :stored-by :john .
lvn:t1 :on-date "2006-03-12Z"^^xsd:date .
lvn:t1 :provenance lvn:database-1975-123973q2 .
lvn:t1 :originalSyntacticRepresentation ":liz :married :richard"^^rdf:ttl .

I noted that some guys from Amazon (Olaf, Gregory, Ora, Bryan, and others) just published a paper at ESWC where they propose the use of datatyped literals to encode - and query, and update! - lists and maps in RDF [0]. I’m not yet sure if those literals are meant to be interpreted as referentially opaque but I imagine that would be easy to add/clarify.

[0] https://2024.eswc-conferences.org/wp-content/uploads/2024/05/77770226.pdf

Example 8
=========
There’s some confusing co-denotation (and/or lack thereof) wrt token identifiers which I guess needs a second look, but I didn’t want to mess with the page (i.e. am afraid of interacting with Git).

Example 9
=========
A) I don’t get how the title of the example corresponds to its content. Doesn’t it rather want to illustrate the opposite?
B) Charlie’s inference in the last paragraph is unfounded because the relation stated in c. is incorrect. <alice.ttl> speaks about two annotations, <bob.ttl> about only one. So they hardly can be considered sameAs. Any further entailments based on such a sameAs statement are likely to produce erroneous results, in this case mixing attributes from two different annotations.



Best,
Thomas
Received on Wednesday, 5 June 2024 10:47:54 UTC