Re: Some notes on the RDF‐star examples of profiles from Thomas Lörtsch on 2024-06-05 (public-rdf-star-wg@w3.org from June 2024)

From: Thomas Lörtsch <tl@rat.io>
Date: Wed, 5 Jun 2024 19:31:25 +0200
To: Franconi Enrico <franconi@inf.unibz.it>
Cc: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-Id: <C7BBA656-D69F-4523-A4A1-5D137D1E3ABE@rat.io>
> On 5. Jun 2024, at 16:42, Franconi Enrico <franconi@inf.unibz.it> wrote:
> 
> Thanks Thomas for the useful comments.
> 
>> But first, a general question: is it considered possible to have
>> << :e1 | ' :s :p :o ' >>
>> refer to a referentially opaque triple and
>> << :e1 | :s :p :o >>
>> (using the same reifier) to refer to a referentially transparent triple or would they need to have different reifiers?
> 
> This is a very important question that requires a well thought answer.
> I personally would say yes. But I need more time to write down a proper justification for it. Maybe others may have opinions on this.
> Note that the current definitions allow for this.

Since I only asked the question I’ll give my opinion now ;-) I think it’s dangerous to allow both terms to have the same identifier. They have very different consequences when interpreted, so they shouldn’t have the same identifier. Otherwise I couldn’t be sure what a reifier refers to when re-using it in annotations, e.g.:
    << :e1 | ' :s :p :o ' >> :a :b .
    << :e1 | :s :p :o >> :c :d .
    :x :y :e1 .
Of course, I see that this might make the definitions even more complicated than they already are.
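To make the worry concrete, here is a minimal Python sketch of what a consumer would face when one reifier is declared for both kinds of triple term. The registry and the helper names (`declare`, `modes_of`) are purely hypothetical and not part of any spec or draft; the only point is that looking up `:e1` no longer yields a single answer.

```python
from collections import defaultdict

# Hypothetical in-memory registry: reifier -> set of (mode, triple) pairs.
reifications = defaultdict(set)

def declare(reifier, triple, opaque):
    """Record that `reifier` reifies `triple`, opaquely or transparently."""
    mode = "opaque" if opaque else "transparent"
    reifications[reifier].add((mode, triple))

def modes_of(reifier):
    """Return the sorted list of modes under which `reifier` was declared."""
    return sorted({mode for mode, _ in reifications[reifier]})

# << :e1 | ' :s :p :o ' >> :a :b .
declare(":e1", (":s", ":p", ":o"), opaque=True)
# << :e1 | :s :p :o >> :c :d .
declare(":e1", (":s", ":p", ":o"), opaque=False)

# :x :y :e1 .  Which reading of :e1 is meant here?
print(modes_of(":e1"))  # ['opaque', 'transparent']
```

With distinct reifiers for the two terms, `modes_of` would return a single mode per reifier and the third statement would be unambiguous.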

>> Example 1
>> =========
>> I see no indication why the triple terms in this example have to be opaque - at least no other than the opposition to many-to-many reifications.
> 
> The motivation comes from LPG encoding, not from the fraud example.
> In LPG an edge identifier is associated exactly and uniquely to a subject,object pair of node identifiers. These identifiers are "static" and uniquely created at assertion time, and they do not denote anything, not frauds, not transactions, not accounts. Just strings, no semantics. They don't change over time.
> In order to capture this, we need functionality and opacity.

Okay, well put, and understood.

> But please, challenge us on this: we need to sustain any potential future critique!

I think the required clarity can be achieved by further refining the examples - see below. 
I also fear a slew of opaque data, because it’s just too easy to skip the extra step of converting LPG IDs to sound RDF identifiers, and a mix of transparent and opaque data where it is left to users to decide whether the opaque data should be treated as transparent (because one thinks one knows what the original author meant) or not. In other words: c.h.a.o.s :)

>> If referentially opaque triple terms would become the standard way to represent LPG data in RDF (or if the recommendation would even suggest such an approach) we would probably shoot ourselves in the foot big time.
> 
> I agree: I would oppose to have rdf-star with only opaque triple terms.
> 
>> A conversion of LPG to RDF that loses some or even much of the integration capabilities of RDF is not a compelling value proposition.
> 
> Remember that if rdf-star will have also transparent triple terms, modellers may decide, while converting their lpg into rdf, to semantically enrich their original lpg model, by adding semantics and denotation in the conversion process, and therefore using transparent triple terms. In this way, the rdf graph will be a starting point for a much richer domain representation wrt the original lpg. This is also a big selling point for rdf-star!

Okay, that does indeed seem like a useful perspective. So accompanying explanations should clarify that a direct conversion can only be referentially opaque, since LPGs have no semantics, but that a conversion that wants to tap into the special powers of RDF wrt integration and reasoning (and that can be successfully used together with proper RDF data) will have to go the extra mile and add proper referentially transparent identifiers, i.e. identifiers with denotation and semantics.


>> Example 4
>> =========
>> I don’t see how the entailment could be problematic. Indeed both :mary and :paul stored a statement to the effect that (i.e. meaning that) :liz :married :richard. What’s not to like?
> 
> They store syntactic objects (namely specific triples), not their meaning.
> 
>> How is that a feature except in cases where one bothers about the syntactic representation of referers?
> 
> But that is exactly the point.
> 
>> We have a large collection of use cases, scrupulously examined precisely to find out if they profit from or even need referential opacity, and the result was: in the overwhelming majority they don’t need or profit from referential opacity, they’d rather suffer from it. We should stick to that result.
> 
> Correct. But we have identified at least two important use cases calling for opacity: lpg and annotation of syntactic objects.

I get it now: the examples aren’t finished, and they use cases that are not necessarily useful in themselves to demonstrate technical aspects.

I think the final examples in the spec should then take care to show that
- in general a referentially opaque conversion loses useful expressivity, but
- in special cases that require syntactic fidelity it is indeed the right choice, also from the perspective of RDF

That should resolve a lot of my criticisms.

>> Example 6
>> =========
>> How would one argue that adding
>> << _:w3 | :BCHRmarriage :in :Fayetteville >> :starts 1975 .
>> is not legal? According to example 5 one can’t, and I agree with that. In that sense example 6 is slightly misleading / may lead to wrong assumptions.
> 
> I don't get it.
> << _:w3 | :BCHRmarriage :in :Fayetteville >> :starts 1975 .
> would be consistent (and legal) with examples 5 and 6.
> What is your point here?
> Maybe we should better explain and/or work out the examples, so that we all agree on the meaning they should convey.

Yep, that’s what I tried to illustrate.

>> Example 7
>> =========
>> This is in fact a compelling example of how referential opacity can be useful. However, as argued before elsewhere, why not avoid all the baggage and problems with referentially opaque reifications and go with RDF literals. E.g.:
>> 
>> << lvn:t1 | :liz :married :richard >> a :TripleToken .
>> lvn:t1 :added-in lvn:rdf-registry ; :stored-by :john .
>> lvn:t1 :on-date "2006-03-12Z"^^xsd:date .
>> lvn:t1 :provenance lvn:database-1975-123973q2 .
>> lvn:t1 :originalSyntacticRepresentation ":liz :married :richard"^^rdf:ttl
> 
> I believe that this statement is not capturing what you mean:
> << lvn:t1 | :liz :married :richard >> a :TripleToken .
> Being transparent the triple term is subject to interpretation and is suffering the issues of example 3.

I believe that my example is good enough to satisfy practical needs and avoids the hassle and hazard of referentially opaque triple terms. It certainly fulfills the need to capture syntactic detail in original representations. It doesn’t suppress entailments - which in general should be considered a feature. If entailments are to be suppressed just omit the first line of my example.

W.r.t. conversions of LPG data to RDF we might have to take a closer look. I reckon that converting all LPG IDs to plain strings is not a viable option and therefore not worth discussing: for starters it would introduce literals in subject position. In general we can expect LPG IDs to be converted to RDF IRIs or blank nodes, right? In that case we can also assume that the RDF identifiers created in such conversions do indeed carry some meaning (why shouldn’t they?). So, contrary to what you outlined above and what I agreed to, in practice LPG-to-RDF data will probably be much more RDF-y than we assumed above (which would be a good thing). If that’s indeed the case then it would be a waste not to take advantage of the (maybe weak, but still) semantics of the converted identifiers. Ergo referentially transparent triple terms would be a reasonable approach, and documenting their original syntactic state in an additional RDF literal would again be good enough, making the whole approach pretty practical.
We should look for more experience, e.g. from the Neptune team. I’m obviously just guesstimating.
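
As a rough illustration of the approach argued for above (IRIs for LPG IDs, a transparent triple term, plus a literal documenting the original syntactic form), here is a minimal Python sketch. The function name `lpg_edge_to_rdf`, the `http://example.org/lpg/` base IRI, the IRI layout, and the Cypher-like rendering of the original edge are all assumptions made for illustration. The `<< reifier | s p o >>` notation is the draft syntax used in this thread, and the output is plain strings, not validated Turtle.

```python
def lpg_edge_to_rdf(edge_id, src, label, dst, base="http://example.org/lpg/"):
    """Return Turtle-like lines for one LPG edge (illustrative sketch only).

    Assumes the IDs and the label contain no characters that would need
    escaping inside an IRI or a string literal.
    """
    # Mint IRIs instead of string literals, so nothing ends up as a
    # literal in subject position.
    reifier = f"<{base}edge/{edge_id}>"
    s = f"<{base}node/{src}>"
    p = f"<{base}rel/{label}>"
    o = f"<{base}node/{dst}>"

    # Keep the original syntactic form as a plain literal on the reifier.
    original = f"({src})-[{edge_id}:{label}]->({dst})"

    return [
        f"<< {reifier} | {s} {p} {o} >> a :TripleToken .",
        f'{reifier} :originalSyntacticRepresentation "{original}" .',
    ]


for line in lpg_edge_to_rdf("e7", "n1", "KNOWS", "n2"):
    print(line)
```

Dropping the first returned line would keep the syntactic record while suppressing entailments, as suggested for the Example 7 variant above.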

>> I noted that some guys from Amazon (Olaf, Gregory, Ora, Bryan, and others) just published a paper at ESWC where they propose the use of datatyped literals to encode - and query, and update! - lists and maps in RDF [0]. I’m not yet sure if those literals are meant to be interpreted as referentially opaque but I imagine that would be easy to add/clarify.
> 
> Yes, they have to introduce this proposal to us.
> 
>> Example 8
>> =========
>> There’s some confusing co-denotation (and/or lack thereof) wrt token identifiers which I guess needs a second look, but I didn’t want to mess with the page (i.e. am afraid of interacting with Git).
> 
> Please do.
> 
>> Example 9
>> =========
>> A) I don’t get how the title of the example corresponds to its content. Doesn’t it rather want to illustrate the opposite?
> 
> Nope.
> 
>> B) Charlie’s inference in the last paragraph is unfounded because the relation stated in c. is incorrect. <alice.ttl> speaks about two annotations, <bob.ttl> about only one.

Oops, I missed the fragment identifiers in the example, '<alice.ttl#w1> owl:sameAs <bob.ttl#w64>.', so my point is moot.

>> So they hardly can be considered sameAs. Any further entailments based on such a sameAs statement is likely to produce erroneous results, in this case mixing attributes from two different annotations.
> 
> I guess you didn't get the gist of the example.
> It says that using functional opaque annotations to describe reifications leads to horrible consequences, and I am sure you will agree!

In principle of course ;-) but you’re right: I still don’t fully get the example. Where does it convey opacity? Is the subject of 'rdf:annotationOf' always referentially opaque?

Best,
Thomas

 
> Cheers
> --e.
Received on Wednesday, 5 June 2024 17:31:35 UTC