Re: Semantic Predication: 1 - basic distinctions from Thomas Lörtsch on 2023-02-22 (public-rdf-star-wg@w3.org from February 2023)

From: Thomas Lörtsch <tl@rat.io>
Date: Wed, 22 Feb 2023 15:09:18 +0100
To: Franconi Enrico <franconi@inf.unibz.it>
Cc: RDF-star WG <public-rdf-star-wg@w3.org>
Message-Id: <4DA17618-1621-4CF1-8968-E5436675F996@rat.io>
Hi Enrico,


please excuse the late response!

> On 16. Feb 2023, at 16:01, Franconi Enrico <franconi@inf.unibz.it> wrote:
> 
> On 15 Feb 2023, at 17:13, Thomas Lörtsch <tl@rat.io> wrote:
> 
>> Hi Franco,
> 
> Hi Löri (just kidding, I’m Enrico and you are Thomas!),

Arrrgh! So sorry! Please feel free to call me anything you want ;-)

>> I believe that your distinction between "semantic" and "modal/epistemic" predications (*) is too brittle to be workable in practice. 
>> For example you consider ":accordingTo" an example of modal/epistemic predication. However, :accordingTo to me sounds very much like :source, :src, :said ... - all those properties that we make up regularly to illustrate provenance use cases. And provenance is so often used in examples because in our intuition it is far from modifying the meaning of a statement, let alone making it subject to a modality or belief system. Granted, that intuition is not well-founded, as the source of a statement can make all the difference e.g. in court - but that just confirms my point: the use case determines whether a source is just administrative detail or a central piece of information. In any case, in itself it just conveys a description of the source. Descriptions are all there is in RDF. You can define certain properties like ":believes" or ":accordingTo" to have special meanings and employ a semantic extension to interpret them accordingly and drive entailments outside of RDF/S, OWL etc. Only then are you entering the realm of modalities. 
>> In general I think such attempts at classification most of the time are intuitive only to their creators ;-)
> 
> Not really.
> There is a long tradition in semantics (from philosophy, to natural language, lexical semantics, and logics) where this distinction is well characterised.
> Consider the examples:
> 
> Semantic predication:
> <<< :john :teaches :cs101 >>> rdf:type :teaching ;
>                              dct:Location dbr:Stanford_University ;
>                              dct:PeriodOfTime :1st-term-2022 .
> 
> Modal/epistemic predication:
> <<<< :john :teaches :cs101 >>>> :believed_by :employee22 .
> 
> (note that for clarity I changed :accordingTo with :believed_by)

See below.

> In both cases, the embedded triple denotes an instance of :teaching. 
> In the former case, this instance of :teaching exists in the current world (i.e., the default graph) where it also has additional properties, and moreover the embedded triple itself is asserted (i.e., it is true) in the current world (i.e., the default graph).
> In the latter case, this instance of :teaching exists only in the world within the mind of :employee22, and the embedded triple itself is not asserted (i.e., it is not true) in the current world (i.e., the default graph). The embedded triple denotes a statement, which is meant to be true in the mind of :employee22; that’s why we say that in the current world (i.e., the default graph) the denotation of the embedded triple is not an instance of :teaching, but an instance of unstar:statement.

Imagine an approach that treats :believed_by as a subproperty of :source that additionally implies a reliability value of 60%, turning 

<<<< :john :teaches :cs101 >>>> :believed_by :employee22 .

into

<< :john :teaches :cs101 >> :source :employee22 ;
                            :reliability 60% .

or

<< :john :teaches :cs101 >> :source :employee22 .
:employee22 :reliability 60% .

How you want to handle the statement about John depends very much on your application. Clearly one can develop machinery to deal with grades of reliability/credibility/probability/certainty, but we all agree that this is outside the scope of RDF. One might choose to store everything that is not 100% certain in a separate (named) graph. But just as clearly one can record such grades as just another part of the description and be done with it. Nothing breaks on the RDF side. If your application breaks because data with only 60% reliability can wreak havoc in your business, then you’ve got an application problem and may feel compelled to invest in a modal RDF semantic extension to solve it. Still, the description is accurate.
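
To illustrate that division of labour, here is a minimal sketch in plain Python (deliberately not using any RDF library). The tuple representation of a quoted triple, the function names and the thresholds are my assumptions for illustration only; the point is merely that the data stays a plain description while the accept/reject policy lives in the application.

```python
# Hypothetical in-memory model: a quoted triple is a plain tuple, and
# annotations about it are ordinary (subject, predicate, object) triples.
QUOTED = (":john", ":teaches", ":cs101")

annotations = [
    (QUOTED, ":source", ":employee22"),
    (QUOTED, ":reliability", 0.6),  # the "60%" from the example above
]


def reliability_of(quoted, triples):
    """Return the recorded :reliability of a quoted triple, or None if absent."""
    for s, p, o in triples:
        if s == quoted and p == ":reliability":
            return o
    return None


def accept(quoted, triples, threshold):
    """Application-level policy (an assumption, not RDF semantics):
    unannotated statements pass, annotated ones must reach the threshold."""
    r = reliability_of(quoted, triples)
    return r is None or r >= threshold


# Two applications, same data, different decisions.
lenient = accept(QUOTED, annotations, threshold=0.5)
strict = accept(QUOTED, annotations, threshold=0.9)
```

The description is identical in both cases; only the application-specific threshold differs.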

I think the problem originates from an underlying assumption that RDF statements can, should or even have to be assumed to be unconditionally true. That assumption is unfounded. Some even argue that facts like the marriages between Burton and Taylor or Obama’s presidency should only be recorded as unasserted statements because they are not true _now_. I rather advocate the opposite approach: always be prepared for more information (like "true but not now", "only maybe true", "true for believers", "true until revoked" etc) that may change the meaning of what you already have. That sounds a lot like the Open World Assumption. And the "meaning" is in a sense what you can make of it: if you want all presidents of the USA you’ll be okay with Obama. If you want the president _now_, you’ll have to dig deeper and search for temporal qualification. My assumption is that users know that, and therefore the OWA-based interpretation works just fine and modalities can (and should) safely be delegated to semantic extensions. Any statement that has _some_ truth to it is a good RDF statement. Statements that are sure to be true always and forever are a) hard to come by and b) often not very interesting anyway. 
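
The two kinds of question can be sketched as follows, again in plain Python. The dictionary layout, the property name :presidentOf and the function names are hypothetical; the only real-world data used are the start and end dates of Obama’s presidency.

```python
from datetime import date

# Hypothetical store: each asserted statement may carry optional
# temporal qualification as additional description.
statements = [
    {
        "triple": (":Obama", ":presidentOf", ":USA"),
        "start": date(2009, 1, 20),
        "end": date(2017, 1, 20),
    },
]


def all_holders(stmts):
    """'All presidents': the bare statement is enough, qualification is ignored."""
    return [s["triple"][0] for s in stmts]


def holders_on(stmts, day):
    """'President on a given day': dig deeper and check the temporal qualification.
    Missing bounds are treated as open-ended (an assumption of this sketch)."""
    result = []
    for s in stmts:
        start, end = s.get("start"), s.get("end")
        started = start is None or start <= day
        not_ended = end is None or day < end
        if started and not_ended:
            result.append(s["triple"][0])
    return result
```

With this data, all_holders(statements) includes :Obama, whereas holders_on(statements, date(2023, 2, 22)) does not. The statement is asserted in both cases; the query decides how much of the description it needs.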


> Different is the case of provenance:
> 
> << :john :teaches :cs101 >> :recorded "2021-07-07"^^xsd:date .
> 
> when the embedded triple is just a syntactic object, i.e., an instance of unstar:triple and not of :teaching; indeed its meaning is irrelevant in stating the triple’s provenance, creation date, recorded date, source, creator.
> 
> There may be examples which may be perceived as ambiguous, and maybe :accordingTo could be one. I mistakenly thought this could convey clearly a modal/epistemic example, but since apparently this is not the case, I changed it with :believed_by.

But in what category would you now put :accordingTo? Doesn’t this mistake vividly illustrate my point that the precision and agreement required to make this work are beyond what can be expected on the open semantic web? I’m not saying that it can’t be done - it most certainly has been done many times already - but I can’t see how it can be incorporated into the core of RDF. In fact, you say so yourself. But you seem to think that it can still be formalized, e.g. via the syntactic features that you propose. But what then? The final step - to decide what to do with unbelievable statements - is probably the easiest. What's hard is to standardize the vocabulary, the values of certainty, and the thresholds at which something switches from "true" to "false". Would the distinction be between "true" and "not true enough"? Or between "true-ish" and "(at least almost) certainly not true"? That’s all well outside the realm of standard RDF. Ergo you advocate a syntactic feature to support the final decision of an undefined procedure. In a prior mail I proposed to use graph literals to document statements without asserting them (they should be queryable though) [0]. That I could agree to, also because it has a lot more potential uses. Would you consider that a solution?

>> The other distinction you make, between "syntactic" and "semantic" predications, is IMO justified. I would name it differently though - I prefer FACT and ARTEFACT - and I don’t agree that syntactic/artefact predications should refer to the referentially opaque version. An example: Carol said that Alice likes Bob and Dan added that fact to the triple store. So:
>> 
>>   << << :Alice :likes :Bob >> :src :Carol >> :src :Dan .
>> 
>> I see no justification for referential opacity. It can be employed, but there is no need, and since the repercussions are subtle but wide-ranging IMO it shouldn’t be - not in this general and passing way.
> 
> Interesting argument. I don’t want to have a strong opinion about the behaviour of syntactic predications. 
> However, I see your example above as purely non-transparent, since both predications (:src and :src) are predications about triples, not about their meaning.

How do you know that? I have the exact opposite intuition! IMO Carol is not concerned about how she refers to Alice and Bob (by email address, Twitter handle, Social Security Number), nor does she differentiate abc:likes from xyz:LIKES, and neither does Dan. All Carol and Dan care about is getting across the meaning of the statement. And why wouldn't they?! Normally on the semantic web you don’t have to worry about such syntactic detail, and that’s a feature, not a bug. It fosters interoperability. Stating triples in the referentially transparent realm of RDF, but annotating only some very specific representation of them, introduces a glass wall that arguably works diametrically against the intuition, practice and requirements of the Semantic Web at large.
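
A small sketch of that glass wall, in plain Python. The co-reference table, the identifiers and both lookup functions are made up for illustration and do not reflect any proposed RDF-star semantics; the sketch only contrasts matching on the exact written form with matching on what the terms refer to.

```python
# Hypothetical co-reference table: several identifiers for the same things.
SAME_AS = {
    "mailto:alice@example.org": ":Alice",
    "abc:likes": ":likes",
    "xyz:LIKES": ":likes",
}


def canonical(term):
    """Map an identifier to its canonical form (itself if no alias is known)."""
    return SAME_AS.get(term, term)


def canonical_triple(triple):
    return tuple(canonical(term) for term in triple)


# Carol's annotation, attached to one specific spelling of the triple.
annotations = {(":Alice", "abc:likes", ":Bob"): {":src": ":Carol"}}


def lookup_opaque(triple):
    """Opaque reading: only the exact lexical form of the triple matches."""
    return annotations.get(triple)


def lookup_transparent(triple):
    """Transparent reading: any co-referring spelling of the triple matches."""
    target = canonical_triple(triple)
    for key, value in annotations.items():
        if canonical_triple(key) == target:
            return value
    return None


# The same fact, written with different identifiers.
query = ("mailto:alice@example.org", "xyz:LIKES", ":Bob")
```

Under the opaque reading the query finds nothing, although it asks about the very fact Carol annotated; under the transparent reading her annotation is found.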


> Going back to your question about why I use the word “predication”, from my comment above you can see how the actual meaning of the predicate having an embedded triple as subject or object defines how the embedded triple has to be interpreted: semantically, syntactically, or modally.
> I repeat what I said - I hope that this is clearer now:
> Final comment:
> So, the above examples show that a way to understand which class an occurrence of an embedded triple belongs to, is to ask yourself: does the occurrence of the embedded triple denote an instance of some event/state meaningful in your domain, or it denotes just the occurrence of the triple itself, or it denotes a statement which is meant to be true in the context of the predication?
> 
>> In the last example on semantic predications in eMail nr. 2 you use properties ":spouse-1" and ":spouse-2", defined as subproperties of ":spouse". Note that here you are employing the Singleton Property approach and wouldn't need quoted triples at all. But, because quoted triples reference the type, practically all your examples could face the same need to account for a multiplicity of annotations. Ergo Singleton Properties might be the better approach after all.
> 
> I don’t know where to read in order to understand what the Singleton Property is (my fault, sorry…). I suppose this has to do with the problem of multi-edge, which probably is outside the scope of the WG anyway. But it would be cool if I could understand what are you talking about.

Multi-edge IMO is such a common problem that the WG would be well advised to take it very seriously.

>> Also, I don’t agree with your take on reification: reification introduces a meta-level, the reification is not the same as what it reifies. Check yourself if when you think you’re reifying a statement, you’re rather creating an instance or subclass of said statement. That is not reification. I’m specifically referring to eMail 4, Example 4.
> 
> I don’t understand; I suspect that this is about lack of terminology alignment among us, so I suggest to not get confused at this stage, and let’s delay a clarifying discussion about this point.

In hindsight I’m not totally sure myself whether what I said makes sense. It is important to distinguish a reification (e.g. an RDF reification quadlet _:x rdf:type rdf:Statement; rdf:subject …; rdf:predicate … etc) from a statement using a subproperty, because the former just _refers_ to a statement whereas the latter _is_ an actual statement. But the latter can also be referred to by a reification in order to annotate it, so from that end the distinction is moot.
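
A minimal sketch of that difference in plain Python, with triples as tuples. The helper name reify and the example statement (using a hypothetical :spouse-1 subproperty) are assumptions for illustration; the four rdf: terms are the standard RDF reification vocabulary.

```python
def reify(triple, node):
    """Build the four-triple RDF reification 'quadlet' that refers to `triple`.
    Note that the referred-to triple itself is NOT part of the result."""
    s, p, o = triple
    return [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
    ]


# An actual statement that happens to use a subproperty (hypothetical data).
statement = (":Alice", ":spouse-1", ":Bob")

# A graph that only refers to the statement and annotates the reference.
graph = reify(statement, "_:x")
graph.append(("_:x", ":source", ":Carol"))

# The statement is described but not asserted in `graph`.
asserted = statement in graph
```

Adding the statement tuple itself to the graph would assert it as well, and the same reification node could still be used to annotate it.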

>> I’ve got a different proposal for the identification problem:
>> 
>>    :Alice 1@:likes :Bob .
>>    :1#predicate :source :Carol .
>>    :1#triple :source :Dan .
>> 
>> Annotations on the predicate annotate the fact in the realm of interpretation. Annotations on the triple annotate the artefact, the syntactic representation. This use of fragment identifiers IMHO is in line with web architecture.
>> 
>> I’ve got two more:
>> 
>>    :Alice 1@:likes :Bob .
>>    :1#subject :age 17 .
>>    :1#predicate :source :Carol .
>>    :1#object :coolnessFactor :Rocksinger .
>>    :1#triple :source :Dan .
>> 
>> Note the nicely memorable set of s/p/o/t fragment identifiers and the very interesting increase in expressivity. Carol says that Alice at age 17 liked :Bob because he was the singer of a rock band, and Dan added the fact to the triple store. You’d need two more blank nodes - for Alice and Bob - to model this in regular RDF, and you’d need to know about those blank nodes when querying.
> 
> A very interesting approach, which could be seriously discussed. I like it.

Thanks! I’m working on a more comprehensive account.
 
> Still, at this stage I prefer we first reach a mutual understanding on distinguishing classes of use cases.

It would sure be good to not mix issues up. OTOH every approach is guided by and optimized for some usage scenario, and "syntax determines consciousness" ;-) 

As I said, I see the distinction between the triple itself as an artefact, or speech act, and the fact that it refers to in the realm of interpretation. This is an issue that is well known as the identity crisis of the semantic web - see the Cool URIs paper [1] - and I’m not sure we should try to tackle it for embedded triples alone. You make a second distinction within the realm of interpretation between "true" and "true under some condition" - and that distinction I think is too blurry and too varied to be captured on the level of RDF, and should be delegated to semantic extensions.


Best,
Thomas


[0] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Feb/0002.html
[1] https://www.w3.org/TR/cooluris/

> cheers
> —e.
> 
> 
>>> On 14. Feb 2023, at 17:24, Franconi Enrico <franconi@inf.unibz.it> wrote:
>>> 
>>> As I said, I’m interested to make sure that in RDF-star there will be a sound characterisation of the class of use cases which I call “semantic predication”. It is clear that they behave homogeneously but differently from the other two classes of use cases which I call “syntactic predication” and “modal/epistemic predication”. For this reason and for simplicity, in order to avoid confusion, at this stage I will use three different notations to identify embedded triples in the three different predication classes; I will argue elsewhere why I believe that this is better than adopting other ways to distinguish the three classes.
>>> 
>>> Let me, again, summarise the difference between these three classes, via three different examples:
>>> 
>>> Semantic predication example:
>>> 
>>> <<< :john :teaches :cs101 >>> rdf:type :teaching ;
>>>                              dct:Location dbr:Stanford_University ;
>>>                              dct:PeriodOfTime :1st-term-2022 .
>>> 
>>> A semantic embedded triple denotes a resource that is meaningful in the domain of interest. In the above example, <<< :john :teaches :cs101 >>> denotes an instance of :teaching. 
>>> 
>>> Syntactic predication example:
>>> 
>>> << :john :teaches :cs101 >> :recorded "2021-07-07"^^xsd:date .
>>> 
>>> A syntactic embedded triple denotes a resource representing the triple itself as a syntactic object. In the above example, << :john :teaches :cs101 >> should denote an instance of something like unstar:triple and not an instance of :teaching. 
>>> 
>>> Modal/epistemic predication example:
>>> 
>>> <<<< :john :teaches :cs101 >>>> :accordingTo :employee22 .
>>> 
>>> A modal/epistemic embedded triple does not denote any meaningful resource in the domain of interest, but it represents a statement which should be true in the context of the predication. For this reason, it would be wrong to let a modal/epistemic embedded triple denote a resource. As a matter of fact, a modal/epistemic embedded triple should denote a set of RDF interpretations, namely all the RDF interpretations in which the modal/epistemic embedded triple is true. This leads to a semantics for RDF-star modal/epistemic embedded triples in the style of modal logics, which clearly can not be adjusted easily as an extension of the RDF 1.1 semantics. However, in the spirit of RDF as a language capable of meta modelling, the modal/epistemic embedded triple  <<<< :john :teaches :cs101 >>>> could still denote a resource, and it would be an instance of something like unstar:statement.
>>> 
>>> Final comment:
>>> So, the above examples show that a way to understand which class an occurrence of an embedded triple belongs to, is to ask yourself: does the occurrence of the embedded triple denote an instance of some event/state meaningful in your domain, or it denotes just the occurrence of the triple itself, or it denotes a statement which is meant to be true in the context of the predication?
Received on Wednesday, 22 February 2023 14:09:39 UTC