Re: multisets everywhere from Fabio Vitali on 2021-12-26 (public-rdf-star@w3.org from December 2021)

From: Fabio Vitali <fabio.vitali@unibo.it>
Date: Sun, 26 Dec 2021 17:23:24 +0000
To: "Patrick J. Hayes" <phayes@ihmc.org>
CC: Ted Thibodeau Jr <tthibodeau@openlinksw.com>, "public-rdf-star@w3.org" <public-rdf-star@w3.org>
Message-ID: <D8184BB8-9389-4DA9-BB3B-992622AD4228@unibo.it>
Hello. 

>> My understanding has been that the original conception of RDF 
>> was that it would only be used to record universal and eternal 
>> facts; in other words, everything encoded in RDF was universal
>> and eternal truth.
> 
> I think I know what you mean, but I (and other logicians and philosophers of language) would prefer to say "simply true", or "true simpliciter" if you want to sound fancy. Which means, just true (not, say, necessarily true or mathematically true or scientifically true, etc..) but true without some qualification or modification (possibly true, true now but maybe not tomorrow, somewhat true, conditionally true, etc..) The kind of 'true' that you swear to tell when you take an oath in a court of law. 

Unlike logicians, I tend to believe that simply true facts are infrequent outside of axiomatic domains (and I can just hear Gödel muttering that even in axiomatic universes the issue is far from solved...) 

Basically ANY statement you can express is temporally and geographically bound, up to and including "The Sun rises in the East and sets in the West", which only applies to Earth and to the last several billion years. Simply true facts are the exceptions rather than the norm, and any representation that excludes temporal and geographical constraints from statements is justified by simplification, laziness or irrelevance, not by logic. 

>> , but it was hard enough for many to grasp 
>> the simplicity of describing everything with SPO triples that 
>> it took years for many to realize that few descriptions were 
>> eternally accurate.)
> 
> In what sense? Yes, if you mean that we can discover that we were wrong and that the historical record may need to be corrected, updated (though this is a pretty rare occurrence for most data). No, if you mean that all assertions are somehow time-dependent in the way that tensed language is. 

I disagree. In fact, you are assuming two things in these sentences, and I disagree with both. 

2) "No, if you mean that all assertions are somehow time-dependent"

This whole discussion is about accepting that some, or many (or, I believe, most) assertions "are somehow time-dependent". Our usual :a :marriedTo :b is not an absolute fact, and can only be used wrongly if used in a time-independent fashion: for the majority of the past history of the universe, i.e., before their wedding, :a was NOT married to :b, and for the majority of the remaining history of the universe, i.e., from their divorce onwards, they will also NOT be married. Ignoring it generates problems (how do you differentiate bigamy from remarriage [*]?). This applies to basically everything. 
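As a minimal sketch of the bigamy/remarriage point, in Python: the `Marriage` record, the half-open year intervals and the names `:a`, `:b`, `:c` are all assumptions made for illustration, not part of any RDF vocabulary. Without the time bounds the two cases below are indistinguishable; with them, a simple interval-overlap test separates them.

```python
from typing import NamedTuple, Optional


class Marriage(NamedTuple):
    """Hypothetical record for one time-bounded ':x :marriedTo :y' statement."""
    person: str
    spouse: str
    start: int           # first year in which the statement holds
    end: Optional[int]   # year in which it stops holding; None = still holds


def overlaps(m1: Marriage, m2: Marriage) -> bool:
    """True if the half-open intervals [start, end) share at least one year."""
    end1 = m1.end if m1.end is not None else float("inf")
    end2 = m2.end if m2.end is not None else float("inf")
    return m1.start < end2 and m2.start < end1


def is_bigamy(m1: Marriage, m2: Marriage) -> bool:
    """Same person, different spouses, and both statements true at once."""
    return m1.person == m2.person and m1.spouse != m2.spouse and overlaps(m1, m2)


first = Marriage(":a", ":b", 1990, 2000)
remarriage = Marriage(":a", ":c", 2005, None)    # starts after the divorce
second_wife = Marriage(":a", ":c", 1995, None)   # starts before the divorce

print(is_bigamy(first, remarriage))   # False
print(is_bigamy(first, second_wife))  # True
```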

1) "Yes, if you mean that we can discover that we were wrong and that the historical record may need to be corrected"

You seem to imply that all statements are either right or wrong, and that scholars express everything as true facts until new evidence generates a new truth and the old one is proven wrong. I am afraid that this is not how real scholars work: in most of the fields I deal with, certainties are rare, and it is customary for scholars to express at the same time multiple competing statements as possible, and sometimes advance a personal preference for one of them. There is a clear and evident need to be able to express not just the facts that are (momentarily) considered true, but also the competing ones that we still recognise (momentarily) as false, yet possible and/or reported in literature. Preventing the representation of (momentarily) rejected statements distorts and limits the correct expression of what we know about any field of human knowledge. 

[*] ok, ok. Hardcore Catholics don't actually differentiate them...

> So, the second idea is, you introduce a single 'thing' (variously called an event, a situation, a circumstance, a happening, a history, a process, a proposition, a fact, depending on who you think invented it) that all these 'arguments' are related to by binary relations (sometimes called facets or aspects or cases, if you come to this from linguistics). For example, this idea was developed for use by military intelligence applications, where the core 'aspects' are the five W's: Who, What, When, Where and Why. This approach gets you some nice side benefits, apart from its flexibility, because these 'things' can have other properties. In the military intelligence case, for example, they can be classified into various categories of interest or relevance to some strategic goal. If we are interested in legal issues, they can be related to whatever regulations they violate, and so on. 

An n-ary relationship is just ONE way to represent more complex and time- and location-dependent facts. I think that n-ary relationships have their own share of issues and limits: 

1) You have to invent a pseudo-entity which becomes the hub of many binary relationships, proliferating the number of entities and classes we create exactly because the model is too simple. Just for persons, you must invent birthEvents, deathEvents, weddingEvents, divorceEvents, jobState, schoolingState, etc. 

2) You lose the direct connection between the original subject and the original object. You switch from 

:RichardBurton :marriedTo :LizTaylor 

to 

:m1 a :Marriage ;
    :groom :RichardBurton ;
    :bride :LizTaylor ;
    :start "1963"^^xsd:gYear ;
    :end "1974"^^xsd:gYear .

and suddenly there is no longer a direct connection between :RichardBurton and :LizTaylor: 

SELECT ?p WHERE {
 :RichardBurton ?p :LizTaylor . 
}

returns empty. Not nice. 
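The effect can be reproduced with a toy triple store. This is only a sketch in Python: the `match` helper and the tuple encoding are assumptions for illustration, not a real RDF library. The direct pattern finds nothing, and recovering the link requires a join through the hub node :m1.

```python
def match(graph, s=None, p=None, o=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [t for t in graph
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# The n-ary encoding of the marriage, as (subject, predicate, object) tuples.
nary_graph = {
    (":m1", "rdf:type", ":Marriage"),
    (":m1", ":groom", ":RichardBurton"),
    (":m1", ":bride", ":LizTaylor"),
    (":m1", ":start", "1963"),
    (":m1", ":end", "1974"),
}

# Equivalent of: SELECT ?p WHERE { :RichardBurton ?p :LizTaylor . }
direct = match(nary_graph, s=":RichardBurton", o=":LizTaylor")
print(direct)  # []

# The connection is only reachable by joining on the invented hub node.
hubs = ({t[0] for t in match(nary_graph, o=":RichardBurton")}
        & {t[0] for t in match(nary_graph, o=":LizTaylor")})
print(hubs)  # {':m1'}
```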

3) You have to decide whether to represent the temporally bound states (a marriageState, an employmentState, a political term, a life) or the boundary events around them (weddingEvent, divorceEvent, hiringEvent, firingEvent, birthEvent, deathEvent, etc.). There are no guidelines, and in practice we often use both with no homogeneity or justification: why do we use terms or reigns, which are States, for politicians and kings, but births and deaths, which are Events, for human lives? And should we use weddings/divorces or marriages?

4) Sometimes we create classes, sometimes we create n-ary relationships, and it is not clear why. For instance, Wikidata uses classes for geographical constraints and n-ary relationships for temporal constraints: in order to say that Gustav III was the king of Sweden between 1771 and 1792, https://www.wikidata.org/wiki/Q52930 defines a class "Monarch of Sweden", which is a subclass of "Monarch" limited to the country "Sweden", and creates a Statement (an n-ary relationship) whereby the position held by Gustav III as Monarch of Sweden is limited between the dates 1771 and 1792. Why invent the subclass "Monarch of Sweden" when we already had the temporally limited Statement?

> But there is a third idea, which is to treat some smallish subset of the possible arguments as actual arguments, forming a kind of core fact, and the others as meta-assertions ABOUT this core fact. The simplest possible version of this is the core being a single RDF triple, and anything else is about this triple. There are many issues and problems with this approach, though.


> First, it is basically wrong: these extra arguments or modifications are not 'about' the triple (unlike, say, provenance information).(They might be 'about' the fact asserted by the triple, but that 'fact' is not the triple itself, which is a syntactic entity. In fact, the thing it is about is probably one of those event-things.)

You are assuming that the n-ary relationship exists in some abstract sense, while the triple does not: that the temporal boundaries can only be a property of the Marriage, and never a constraint on the truth of the statement. I think that the opposite view holds just as well: the triple exists absolutely in an indeterminate state, neither true nor false, and becomes true within a certain temporal or geographical constraint. It has nothing to do with the entity Marriage (which does not exist outside of our minds), but with the conditions for the truth of the statement (which are just as arbitrary and abstract as the concept of Marriage). That does not shock me. 

Remarkably, you do accept that provenance should be about the truth of the triple and not about the marriage. So you are fine with three different modes to express various types of annotations about simple binary relationships: converting it to an n-ary relationship for some annotations, inventing a hierarchy of variously constrained classes for other ones, quoting it for yet other ones. That confuses me. 

On the contrary, truth conditions about the triple provide a uniform model for temporal, geographical, provenance and confidence constraints, which all use a similar pattern to provide truth conditions for a very minimal binary relationship: 

<<  :RichardBurton :marriedTo :LizTaylor  >> 
    :startDate "1963"^^xsd:gYear ;
    :endDate "1974"^^xsd:gYear ;
    :accordingTo :wikipedia ;
    :confidence 1.0 .

<<  :GustavIII :positionHeld :monarch  >> 
    :startDate "1771"^^xsd:gYear ;
    :endDate "1792"^^xsd:gYear ;
    :for :Sweden ;
    :accordingTo :wikipedia ;
    :confidence 1.0 .

No invented pseudo-entities, no need to choose between states (Marriage) and events (Weddings), no difference of approach between different types of annotations, a direct connection still exists between your original subject and your original object and can be queried in a reasonably simple manner: 

SELECT ?p WHERE {
 {
  :RichardBurton ?p :LizTaylor . 
 } 
 UNION
 {
  << :RichardBurton ?p :LizTaylor>> ?constraintType ?constraintSource  . 
 }
}

which correctly binds ?p to :marriedTo, as wished. 
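The two UNION branches can be mimicked in a few lines of Python. This is a sketch under stated assumptions: a quoted triple is modelled as a nested tuple in subject position, and `predicates_between` is a hypothetical helper, not an RDF-star API. The quoted triple is not asserted on its own here, which is why the second branch is the one that matches.

```python
def predicates_between(graph, s, o):
    """Collect every ?p linking s to o, either asserted or merely quoted."""
    found = set()
    for subj, pred, obj in graph:
        # Branch 1: a plain asserted triple  s ?p o
        if subj == s and obj == o:
            found.add(pred)
        # Branch 2: an annotation whose subject is the quoted triple << s ?p o >>
        if isinstance(subj, tuple) and subj[0] == s and subj[2] == o:
            found.add(subj[1])
    return found


quoted = (":RichardBurton", ":marriedTo", ":LizTaylor")
star_graph = {
    (quoted, ":startDate", "1963"),
    (quoted, ":endDate", "1974"),
    (quoted, ":accordingTo", ":wikipedia"),
    (quoted, ":confidence", 1.0),
}

print(predicates_between(star_graph, ":RichardBurton", ":LizTaylor"))
# {':marriedTo'}
```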

As a final note, in many cases we have incomplete or partial temporal records. For instance, we do not know exactly when Leonardo actually painted the Mona Lisa: we only know that it was already painted when he moved to Amboise, in 1516. If we used events, then the creationEvent for Mona Lisa is incomplete and lacking an actual date, which somehow misses the actual point for Events. If we use temporal constraints for a simple triple, this can be made correct and true even in the presence of incomplete information: 

<< :monaLisa dc:creator :leonardo >>
 :startDate "1516"^^xsd:gYear .

The temporal constraint is about the truth of the triple, and not about a creationEvent which I did not use, and is therefore true.  
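One possible reading of such partial constraints, sketched in Python: `asserted_in` is a hypothetical helper, and treating a missing bound as open is an assumption of this sketch rather than anything defined by RDF-star. A False result means only that the annotation makes no claim for that year, not that the triple is known to be false.

```python
from typing import Optional


def asserted_in(year: int,
                start: Optional[int] = None,
                end: Optional[int] = None) -> bool:
    """True if `year` is covered by the stated bounds; a missing bound is open."""
    if start is not None and year < start:
        return False
    if end is not None and year > end:
        return False
    return True


# << :monaLisa dc:creator :leonardo >> :startDate 1516, with no :endDate
print(asserted_in(1600, start=1516))  # True
print(asserted_in(1500, start=1516))  # False: no claim before 1516

# A fully bounded statement, as in the marriage example
print(asserted_in(1970, start=1963, end=1974))  # True
print(asserted_in(1980, start=1963, end=1974))  # False
```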

> But ignoring metaphysics, it is awkward because the 'meta' information changes or modifies the truthvalue of the basic assertion (the plain meaning of the triple), which fucks with the logic.

But why should it fuck with the logic? These triples are NOT TRUE outside their temporal and geographical boundaries. Pretending they are true fucks with the logic. 

> I will let y'all draw your own conclusions about using RDF-star in this way. But please, please, do not think that RDF 1.1 has any kind of temporal indexicality built into its semantics. It doesn't. 

I agree with your comments on indexicality, of course, and will not object to that. 

Ciao

Fabio


--

Fabio Vitali                            Tiger got to hunt, bird got to fly,
Dept. of Computer Science        Man got to sit and wonder "Why, why, why?'
Univ. of Bologna  ITALY               Tiger got to sleep, bird got to land,
phone:  +39 051 2094872              Man got to tell himself he understand.
e-mail: fabio@cs.unibo.it         Kurt Vonnegut (1922-2007), "Cat's cradle"
http://vitali.web.cs.unibo.it/
Received on Sunday, 26 December 2021 17:23:45 UTC