Re: multisets everywhere



On Jan 1, 2022, at 4:56 PM, Fabio Vitali <fabio.vitali@unibo.it> wrote:

Dear Pat,

let me count the things we agree on:

? perhaps we agree. That was exactly my point: that assertions which are time- or place-dependent should include time or place information into their statement, so that they are then no longer so dependent (and are simply true). I think we agree on this, but disagree only on how best to do it.


I agree. I believe, if you read my previous message carefully, that I listed three ways and that was one of them. The above paragraph describes the second way.

I agree that this (indeed any) more complicated way of expressing truth needs to have a certain accepted discipline about it, in order to be usefully deployed in a Web (indeed, any large) setting.

I will entirely agree that Wikidata is a bit of a mess about this issue, and indeed more generally.

* we agree that many (I would say 'most', but let's skip that) assertions are time and space dependent (TSD)
* we agree that binary relations alone cannot correctly represent TSD assertions

Um. Not alone, but binary relations plus some existentials (names of things, if you like) can. And IMO do it better than using n-ary relations. And this is backed up by a lot of work by others more competent than me, with arguments from linguistics (case roles), philosophy and ontological pragmatics.

* we agree that TSD assertions can be represented by n-ary relations either as events or as states

I'm not sure I follow this. N-ary relations are one thing, events and states are something altogether different.

* we agree that both events and states have advantages and limits, which are somehow complementary, and that there are no precise guidelines on where to use one or the other.

At risk of opening yet another can of worms (which would not belong on this email list), I would be inclined to say that there is no fundamental difference, myself. At least not in any way that would influence how to best represent them in assertional syntax.

* we agree that using classes, events, states, or generic hubs (e.g., wikidata Statements) does not change fundamentally the nature of the n-ary relation, nor its advantages and limits.

Again, I am not sure I follow what you are saying here. What do you mean by the /nature/ of a relation?

* we agree that n-ary relations are more complicated to handle than binary relations, fundamentally affect the underlying ontology and create reasonably complicated queries and datasets.

I don't think n-ary relations alter ontologies in any deep way. But encoding n-ary (n>2) relations by introducing new entities and relating the arguments to them can change the ontology, yes. (Not all changes are bad, of course :-)


I think this is a healthy bunch of agreements. There are disagreements, of course, but the fundamentals are there.

I also think that the gist of our disagreements is in expecting that all our triples represent "facts" about "reality" (independently of who asserts them) rather than, say, representing "statements" (asserted by someone) that can be expressed either "absolutely" or "subject to boundary conditions".

No, wait. We need to agree on terminology.

Suppose I say to someone, "Felix is hungry."  Then I have /uttered/ a sentence with three words in it, a physical token of a syntactic entity. By making that utterance, I have /asserted/ a proposition which we could represent as 'The cat named 'Felix' at ///pure.hung.skips is hungry at time 19:45 PST on 1.1.2022' where the implicit indexicals have all been filled in with objective time/space coordinate references.

Now, the meaning of the utterance was contextual, ie indexical: it could have had a different meaning if uttered by someone else, somewhere else, at a different time. But the (proposition expressed by) the second sentence is not contextual in this way. Its truth does not depend on who asserts it, or where or when it is asserted. It is simply true; and if it had not been true, it would have been simply false. And this is a standard and general-purpose way of getting to sentences which have a straightforward relationship to truth. It is used (perhaps implicitly) by virtually all data formats, and has been used since bookkeeping was invented, probably in the Fertile Crescent. Anyone who stamps a date onto a document before filing it is using this device.

Propositions are either true or false. By asserting a proposition, I am claiming it to be true. Not, note, true in a context or true at a time, because we have included all that indexical variation into the proposition itself; but just plain true. The "boundary conditions", by being mentioned explicitly, have become part of the proposition whose truthvalue is being claimed by the assertion.
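
(A minimal sketch of the step from utterance to proposition, written in Python rather than RDF. The names Context, Proposition, express, is_true and the toy WORLD set are illustrative assumptions, not part of any RDF specification, and the ISO-style timestamp is just one possible rendering of the time mentioned above. For simplicity the toy world treats any proposition it does not list as false.)

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Place and time of an utterance (values below are illustrative)."""
    place: str
    time: str


@dataclass(frozen=True)
class Proposition:
    """A claim whose indexicals have been replaced by explicit coordinates."""
    subject: str
    predicate: str
    place: str
    time: str


def express(subject: str, predicate: str, ctx: Context) -> Proposition:
    """Fill the hidden 'here' and 'now' of an utterance from its context."""
    return Proposition(subject, predicate, ctx.place, ctx.time)


# Toy stand-in for 'the way the world is': the propositions that are the case.
# Assumption for this sketch only: anything not listed here counts as false.
WORLD = {
    Proposition("Felix", "hungry", "///pure.hung.skips", "2022-01-01T19:45-08:00"),
}


def is_true(p: Proposition) -> bool:
    """Truth is checked against the world only; no speaker or context is consulted."""
    return p in WORLD


evening = Context("///pure.hung.skips", "2022-01-01T19:45-08:00")
next_morning = Context("///pure.hung.skips", "2022-01-02T09:00-08:00")

# The same three words, uttered in two contexts, express two different propositions.
p_evening = express("Felix", "hungry", evening)
p_morning = express("Felix", "hungry", next_morning)
```

Note that is_true takes no context argument at all: once the time and place are part of the proposition, there is nothing left for its truth to vary with.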

Now, this all maps into the semantic web world of RDF, OWL, etc. as follows: the RDF graph is a sentence which is REQUIRED (by the normative semantics) to be true or false, ie to have a simple truthvalue. Publishing some RDF is asserting it to be true. And all of that is part of the definition of RDF, so if y'all want to use it in any other way, well, it's a free country, but just be aware you are mis-using it.



I guess I have a fundamentally nominalistic perspective in which the temporal constraints in

<<  :RichardBurton :marriedTo :LizTaylor  >>
 :startDate "1963"^^xsd:Year;
 :endDate "1974"^^xsd:Year;

are NOT about the reality (i.e., the marriage between Rick and Liz), but about the knowledge context that makes the statement actually asserted. They are about what we KNOW (and how we decide to represent it), and not about what IS.

Well, I can argue against the utility and indeed accuracy of such a view, and will do so briefly below, but more to the point is that RDF requires that an asserted triple is asserted without a context. Put another way, the truth of the RDF triple

 :RichardBurton :marriedTo :LizTaylor .

in any RDF interpretation does not, and normatively cannot, depend on any assertions made /about/ that triple. So once asserted, it is asserted without any surrounding context or qualification. Which means that your interpretation is inconsistent with the (sorry to repeat) normative semantics of RDF.

Now, one might simply dismiss the published semantics of RDF as irrelevant, of course: but this path comes with some snags. First, other constructs, such as OWL or SPARQL, have /their/ semantics defined in conformity with RDF, so you are here rejecting the entire semantic underpinnings of the entire web. Second, it means that you can have no confidence in the results of any RDF (or RDFS, OWL etc.) reasoners when used on your deviant RDF. Maybe their RDF-valid inferences will preserve your intended meanings, maybe not. Until you define a semantics for your RDF-manque and relate it to the normative RDF semantics, there is no way to know.

But to return to your 'nominalist' (?) claim. You are saying that (for example)

:startDate "1963"^^xsd:Year;

is not actually a statement about the date of Dick & Liz's marriage, but rather is about what "we know". Really? About what WHO knows? (To speak of knowledge without a knower is nonsensical.) But in any case, surely this is just wrong. It is a piece of information about the date of a marriage, not about knowledge at all. Of course one could say something like that this assertion represents part of what is 'generally known' about the marriage, but still it is the sentence that is generally known, and what the sentence refers to is a marriage date, not a belief about a date.

Niklas Lindström provided a very interesting list of readings that I am about to comment on next, but the most interesting tidbit in his paper [1] is about nominalism and states of affairs, which wikipedia defines as "a way the actual world must be in order to make some given proposition about the actual world true".

Yes. A pithy way of summing up the idea of Tarskian truth conditions.

In other words, the propositions we are considering should not be viewed as absolutely true or false (or, as you say, simply true or false), but are true when a given state of affairs is obtained.

No, that is not what this says. When Lindström says "make some proposition about the actual world true" he is saying the proposition is ABOUT the state of affairs (which is a real aspect of the actual world), and it IS true – simply true – just when the world does indeed contain (or maybe /exemplify/ or some such locution: philosophers vary here) that state of affairs. Saying that a proposition is simply true does not imply that the world it describes must be simple. So for example, the proposition that Liz and Richard were married from 1963 to 1974 is true – simply true – because the state of affairs it describes is actual, in the actual world. (If it were "true of" a state of affairs, ie if it needed something extra to make it have a simple truth value, then it would be a predicate, not a proposition.)

Facts about the reality are either (simply) true or (simply) false. On the contrary, conditions about a quoted triple allow us to either ASSERT or NOT ASSERT the triple itself.

You assert it by publishing it as RDF data. And you 'not assert' it by not publishing it. That is wired into the semantics of RDF.

The quoted triple is neither true nor false, it is provided for contemplation, so to speak, and becomes asserted under some conditions.

No. We can make assertions about this notorious marriage now, in 2022. We do not (cannot) travel back in time to 1963 in order to make assertions about what happened then. So these dates are not the dates of assertions. (Did you read what I wrote earlier about valid time vs. transaction time?)

These are the conditions that obtain the corresponding state of affairs. Under this conceptual frame, the idea of a true triple is foreign and frankly irrelevant: they are not true or false (this pertains to reality, or possibly to knowledge), but asserted or non-asserted (this pertains to knowledge representation).

Well (1) this conceptual frame is incompatible with RDF, but also (2) as stated, it is incoherent, or at best muddled. You should try making it formal, and you will soon discover the problems. You will need some variation on the semantics of context logic, by the way.

Why is this interesting?

1) because they do not require you to first adopt new entities such as states or events. You can keep on using simple triples representing trivial binary relationships.

I do not regard events as PSEUDO entities, myself. In fact, the world is pretty much comprised of events, in a suitably broad sense.

Yes, you are probably right. Still, not everybody is using events, not everybody has used events in existing datasets. Should we replace all simple binary relationships with TSD n-ary relationships based on events? More correct, probably. But...

On the other hand, quoting an existing triple is always possible, regardless of the type of conceptual representation that was adopted.

But changing its truthvalue by adding meta-descriptions is not possible. And claiming that it never had a truthvalue until the metadata was added is even more peculiar. So this is not just RDF-as-usual you are talking about.


2) because even with TSD assertions you still CAN use events and states anyway, especially if you feel bad about the poorness of the simple binary model: you can have a model of reality as simple or as complex as you require, and still be able to consider the corresponding propositions as a state of affairs that can be obtained under some conditions.

I don't follow how this can happen.

(And they use far more expressive notations, along the lines of full first-order logic with attached metadata, so n-ary relationships and bnodes do not give them nightmares.)


Right. Good. There are totally expressive representations of reality. I am fine with that and I trust you that it could be done and it has been done. But around here there are a lot of much simpler knowledge representations that haven't gone that far in the correctness of the underlying ontology. They are not wrong per se, but just a bit too simple for these fine points. Quoting is an approach that allows us to represent TSD representations without rewriting them all.

But it does not, or at least not without a complete overhaul and rewrite of the RDF semantics, which would likely change the intended meaning of all those simple triples. And has not been done, in any case.

3) because conditions represent minimal requirements and not exact boundaries. In fact, the opposite of ASSERTED is not FALSE, but UNASSERTED.

True, but...
This means that in temporal intervals outside the boundaries the sentence is not false, but simply of unknown truth.

No, it does not mean that. Those dates for the notorious marriage are surely understood to be actual dates of a marriage and its divorce, not vague boundaries or limits. Now, one can of course use dates in this other way, but that is by no means the norm. And it should be indicated by using different relations than 'startDate' and 'endDate', to keep the meanings clear. In other words, this is a matter of ontology and knowledge representation, not of logic.


<< :monaLisa dc:creator :leonardo >>
:startDate "1516"^^xsd:Year.

The temporal constraint is about the truth of the triple, and not about a creationEvent which I did not use, and is therefore true.

… is it? What does :startDate mean? Suppose we discover that Leonardo actually painted it in 1513. Surely, in that case, this assertion about start date would be /wrong/. But this discovery should be consistent with the facts we currently have (and which your RDF should express).

The quoting triple does not talk about :monaLisa, and does NOT mean that :monaLisa was created in 1516. It talks about the quoted triple, and states this triple is asserted in all states of affairs after 1516. Before that time, the statement is unasserted, not false: before 1516, we have no information about its assertedness. This is correct.

But that is not how 'startDate' is interpreted in other cases, like the marriage. And I suggest it is a very bad idea to impose this open-ended interpretation as a norm. How are you going to represent that actual date, if this is ever determined?


4) because conditions aggregate and do not generate inconsistencies. They may generate unobtainable states of affairs, but not logical inconsistencies. For instance, these quoted triples are all correct:

<< :monaLisa dc:creator :leonardo >>
:startDate "1516"^^xsd:Year.
<< :monaLisa dc:creator :leonardo >>
:startDate "1715"^^xsd:Year.

<< :monaLisa dc:creator :leonardo >>
:startDate "2021"^^xsd:Year.

These are all legitimate statements, compatible with each other and do not generate any inconsistency.

Not in your odd interpretation, no. But nobody else is using that interpretation. It seems to me that they /should/ be mutually inconsistent. This is a bug, not a feature. I dread to think what kind of nonsensical conclusions could be drawn from that last one, eg that Leonardo was 569 years old in 2021.


5) because you can represent in a much simpler way competing and incompatible statements. For instance, as we discussed, Mona Lisa was certainly painted before 1516, but scholars debate exactly when: the Louvre Museum believes it was painted in 1506, Alessandro Vezzosi after 1513. If we rely on events, this becomes rather complicated:

:c1 a :creationEvent;
   :work :monaLisa;
   :creator :leonardo;
   :date "1506"^^xsd:Year;
   :source :louvre.

:c2 a :creationEvent;
   :work :monaLisa;
   :creator :leonardo;
   :date "1513"^^xsd:Year;
   :source :vezzosi.

Both creationEvents exist and refer to the same entities. Only the date and the source differ. How can we assert a preference? That is not clear to me. How do we state that the painting surely existed in 1516, a notion that is implicit but never specified because there is no creationEvent listed for 1516? That is not clear to me.

I would prefer to say that there is one event, but two opinions as to its starting date. And to say that, we need a (single) name for the event. There is no way to directly express such a difference of opinion in any single RDF graph, because it is a genuine divergence about the facts, and RDF can only describe one world at a time, so to speak. I would use a construct like an RDF dataset to encode such divergences of opinion, with separate graphs for each source.
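
(A rough sketch of that suggestion, modelling a dataset in plain Python as a mapping from graph names to sets of triples. The graph names :louvreGraph and :vezzosiGraph, the helper functions, and the assumption that :date is functional, ie at most one date per event, are all illustrative and not taken from any specification. The two dates are the ones used in the creationEvent example above.)

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

# What every source accepts, using a single name (:c1) for the creation event.
shared: Set[Triple] = {
    (":c1", "rdf:type", ":creationEvent"),
    (":c1", ":work", ":monaLisa"),
    (":c1", ":creator", ":leonardo"),
}

# One graph per source; each graph describes one candidate world.
dataset: Dict[str, Set[Triple]] = {
    ":louvreGraph": shared | {(":c1", ":date", "1506")},
    ":vezzosiGraph": shared | {(":c1", ":date", "1513")},
}


def dates_of(graph: Set[Triple], event: str) -> Set[str]:
    """Collect every :date value a graph gives for the event."""
    return {o for (s, p, o) in graph if s == event and p == ":date"}


def is_consistent(graph: Set[Triple], event: str) -> bool:
    """Illustrative check, assuming :date is functional (one date per event)."""
    return len(dates_of(graph, event)) <= 1


# Merging the graphs amounts to asserting both opinions about a single world.
merged: Set[Triple] = set().union(*dataset.values())
```

Each source's graph passes the check on its own; only the merge fails it, which is the sense in which the disagreement lives between graphs rather than inside any one of them.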


By using RDF* and truth conditions, these assertions become easy and explicit.

<< :monaLisa dc:creator :leonardo >>
:startDate "1516"^^xsd:Year.

<<
<< :monaLisa dc:creator :leonardo >>
:startDate "1506"^^xsd:Year >>
:accordingTo :louvre.


<<
<< :monaLisa dc:creator :leonardo >>
:startDate "1513"^^xsd:Year >>
:accordingTo :vezzosi.


This represents exactly what we are trying to say, including the safe date of 1516 and the fact that the two hypotheses are contrasting and incompatible.

But are they? If the 'inner' triples are not being asserted, then this graph doesn't say anything about the Mona Lisa at all. It does not even mention the Mona Lisa, in fact. But if they are asserted, then yes, we do have a contradiction, regardless of the :accordingTo annotations.

By assuming we trust :louvre, we can extend the assertion back to 1506. By assuming we trust :vezzosi, we can extend the assertion to 1513. If we do not trust anybody, we still have a safe date, 1516, to work with. We could even say something completely out of whack, such as:

<<
<< :monaLisa dc:creator :RoyLichtenstein >>
:startDate "1967"^^xsd:Year >>
:accordingTo :fabioVitali.

which is completely ridiculous, but since it is not stated and simply attributed to someone, it is also totally legitimate and NOT inconsistent with the other statements.

But then it also does not say anything about the Mona Lisa or Roy Lichtenstein. It only refers to some RDF syntax.



6) because we can easily deal with non-events and non-instantaneous events. A non-event is a definite change in state that is not bound with a specific event. For instance, a politician may be a member of parliament from the event of the election he was voted in, until the non-event of failing to obtain a re-election (thank you to my friend Jörge for mentioning this).

But losing an election is just as much an event (well, a 'state of affairs') as winning one. Non-events are events.

Similarly, a marriage (in its practical, not legal sense) may not start with a priest or a mayor and a ceremony and a dinner, but, in these modern times, simply when two people start seeing each other more often than others, then progressively exclude anyone else, then start sleeping at each other's places more and more frequently, until they start finding it stupid to pay two rents for two places when basically they have been living together in one of them for months now, and therefore they let one of the two contracts expire without renewal. What is then the starting event of this relationship? The first sex? The first night sleeping together? The last box moved from the apartment being abandoned? The expiration of the second rental contract? All of them are lowercase correct starting events, none of them is uppercase CORRECT.

So, it's a complicated world with many blurry edges to our categories. True, a fundamental issue for any descriptive framework. One can make similar points about the exact edges of many things. Where exactly does Mount Everest start? How many lakes are there in Finland? If I am in a city and I drive out of it, exactly where do I enter the countryside? When exactly does human life begin? In practice, we impose artificial, legal or administrative boundaries precisely to resolve these kinds of issue cleanly, to avoid endless disputes. But as I say, this is a general issue, not something special to events.

These are non-events that are hard to pin down, yet they are definitely the boundary events for a new state. Similarly there are non-instantaneous events: Mona Lisa was definitely started in 1503, then put away for years, maybe worked on in intermittent sprints, until (either in 1506, or maybe much later, around 1513), Leonardo simply forgot to provide any further work on it. There is no instantaneous creation event, maybe the creation is itself a state (the painting first does not exist, then it is in progress, then it is not modified anymore, and the boundary events of these states are not only hard to pin down to a specific date, but probably not really existing as such).

Indeed. There are intermittent events, just as there are non-path-connected territories.


On the other hand, correctly handling non-events and non-instantaneous events is only important for assertions on exact boundaries, not for any assertions away from these boundaries. In any date after 1516, the quoted triple

<< :monaLisa dc:creator :leonardo >>
:startDate "1516"^^xsd:Year.

can be considered asserted regardless of how long it took for Leonardo to declare the painting completed.

7) because we can use the same method to constrain the assertedness of triples along a number of different dimensions: temporal, geographical, provenance, confidence, etc.

We can also make these into properties of the described event, if indeed they are (like the first two) but not when they aren't (like the last two).

For instance, the quoted triple

<<  :GustavIII :positionHeld :monarch  >>
 :startDate "1771"^^xsd:Year;
 :endDate "1792"^^xsd:Year;
 :for :Sweden;
 :accordingTo :wikipedia;
 :confidence "1.0"^^xsd:decimal.

is NOT, as you mention, a "muddle between data and metadata", simply because they are all metadata, and the only data is

:GustavIII :positionHeld :monarch


What these triples mean is that, simply, in order to assert the quoted triple you have to constrain the temporal interval to 1771-1792

Which temporal interval? The one being described, or the one in which the sentence is asserted? Or do you perhaps mean something like a counterfactual, along the lines of, "If it were a time between 1771 and 1792 now, then it would be true to assert…"?

If you mean that the assertion has to include the date information (in order to be a coherent proposition) then we agree, but that is not compatible with what you are saying elsewhere.

, you have to constrain the geographical context

What is a geographical context? Context of what? Of an utterance, or the truth of a sentence, or the location of something in the world? This is just word-salad until you give it some kind of precise flesh in a semantic theory. RDF does not mention contexts anywhere, though Guha did once propose an RDF extension which does.

to :Sweden, you have to trust the source :wikipedia, and you have to accept statements with confidence greater than or equal to 1.0. They are not statements about Gustav III, but about the triple  :GustavIII :positionHeld :monarch .

So they are all assertions about a syntactic entity? Not even a proposition? Why then does truth even enter into the discussion?

And by the way, in our earlier example, the different sources diverged not on the base triple but on another of the annotations (the date). That is not compatible with what you are claiming here, that all the meta-triples refer to the base triple, ie in that earlier case the triple

:monaLisa dc:creator :leonardo

which I assume nobody disputes.

Pure nominalism.

Nominalism rejects universals and universal properties. That has nothing to do with what we are debating here.


8) How about repeated occurrences?

But in any case, as others have noted, this does not allow for repeated or interrupted states of affairs, such as the several marriages of Dick and Liz, or a judge's status while in recusal.

The reason it does not allow for repeated states of affairs is totally due to a specific syntactical choice of RDF* of identifying a quoted triple rather than an occurrence of the triple. The same problem arises with all approaches that do not allow an explicit identifier to their structures, such as quoted graphs in N3. On the contrary, named graphs are, er... named, and thus their name can be used to differentiate repeated states of affairs. With named graphs, this becomes trivial:

GRAPH :m1 {  :RichardBurton :marriedTo :LizTaylor  }
GRAPH :m2 {  :RichardBurton :marriedTo :LizTaylor  }
:m1 :startDate "1963"^^xsd:Year;
 :endDate "1974"^^xsd:Year;
 :accordingTo :wikipedia;
 :confidence 1.0.

That RDF is completely nonsensical.


:m2 :startDate "1974"^^xsd:Year;
 :endDate "1975"^^xsd:Year;
 :accordingTo :wikipedia;
 :confidence 1.0.

:m1 and :m2 may have the same content, but are different graphs and can be identified separately.

True, but they have the exact same truthvalue in any RDF interpretation, and I would expect that to be the case in any future modification or extension to RDF. (If not, how can any valid RDF conclusions be drawn from any named graph?) So no amount of annotating them is going to affect what they actually assert. They might have different provenance, of course, but they cannot be true under different circumstances or true ABOUT different things, because they are not "about" anything: they are simply sentences which have a truthvalue and are claimed to be true when asserted.
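
(To make the point concrete, here is a small Python sketch in which an interpretation is simplified to the set of triples it satisfies. That simplification, and the names graph_is_true, world_a and world_b, are assumptions made for illustration only; the real RDF model theory is richer. The annotation values are the ones from the :m1 and :m2 example above.)

```python
from typing import Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]

marriage: Graph = frozenset({(":RichardBurton", ":marriedTo", ":LizTaylor")})

# Two names, one and the same set of triples.
named_graphs: Dict[str, Graph] = {":m1": marriage, ":m2": marriage}

# Annotations attach to the graph names, not to the triples inside the graphs.
annotations: Dict[str, Dict[str, str]] = {
    ":m1": {":startDate": "1963", ":endDate": "1974"},
    ":m2": {":startDate": "1974", ":endDate": "1975"},
}


def graph_is_true(graph: Graph, true_triples: FrozenSet[Triple]) -> bool:
    """A graph is true when every triple in it is satisfied.

    The annotations dictionary is never consulted: only the content matters.
    """
    return graph <= true_triples


world_a: FrozenSet[Triple] = marriage      # an interpretation where the triple holds
world_b: FrozenSet[Triple] = frozenset()   # an interpretation where it does not

same_in_a = graph_is_true(named_graphs[":m1"], world_a) == graph_is_true(named_graphs[":m2"], world_a)
same_in_b = graph_is_true(named_graphs[":m1"], world_b) == graph_is_true(named_graphs[":m2"], world_b)
```

Whatever interpretation is chosen, the two graphs come out true together or false together, because the truth test only ever sees their (identical) content.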

Unfortunately, of course, the semantics of named graphs prevent us from considering the content of the graph as unasserted.

Named graphs allow this, even if RDF* doesn't. At least, the notion defined in our old paper (which introduced the terminology) does. https://deliverypdf.ssrn.com/delivery.php?ID=348094069068098092123083121126004024054087061054024018026038100122094066068094114118014088008083029062047001096072094067125023064102015014080098075065064102024126001028031070079119115090&EXT=pdf&INDEX=TRUE.

See section 3.1

————

To sum up. RDF assumes, and requires normatively, a simple, basic, 'vanilla' picture of meanings and truth. RDF graphs, and the triples they contain, are simple sentences asserting that a binary relation holds between two entities, all named by URIs (and bnodes and literals, OK). Publishing a graph is understood to be asserting these sentences, ie to be claiming that they are (simply) true. There is no hidden machinery of contexts or tenses or points of view or any other philosophically sophisticated stuff behind the simplicity of RDF. To say that a triple is true is not to say it is true 'now', or true 'here', or probably true, or claimed to be true by some authority; just that it is, in fact, true. To repeat, all of this is built into the normative semantics of RDF (and, by the way, of RDFS and OWL and N3), and cannot be rejected without failing to have RDF conformity. Of course, y'all are free to mis-use RDF, at the risk of being widely misunderstood, and also free to invent new RDF++; but if you do, please specify the semantics with some precision, to avoid having debates like this one.

Pat


This is a limit and a problem.

In fact, my only request to this group is to provide a reasonable smooth extension of the syntax to allow unasserted named graphs in addition to the unasserted individual triples. Then I would be happy and content.

-------------

There are a lot of other topics for discussion from your message, but for now this is enough, I believe. I will be glad to add more reflections (especially about our different perceptions of the confidence scholars have about basic aspects of their fields), but I do not want to burden you and everyone else too much with my ramblings.

regards

Fabio

[1] https://niklasl.github.io/quid/#



On 29 Dec 2021, at 00:45, Patrick J. Hayes <phayes@ihmc.org> wrote:



On Dec 26, 2021, at 9:23 AM, Fabio Vitali <fabio.vitali@unibo.it> wrote:

Hello.

Hi.


My understanding has been that the original conception of RDF
was that it would only be used to record universal and eternal
facts; in other words, everything encoded in RDF was universal
and eternal truth.

I think I know what you mean, but I (and other logicians and philosophers of language) would prefer to say "simply true", or "true simpliciter" if you want to sound fancy. Which means, just true (not, say, necessarily true or mathematically true or scientifically true, etc.) but true without some qualification or modification (possibly true, true now but maybe not tomorrow, somewhat true, conditionally true, etc.). The kind of 'true' that you swear to tell when you take an oath in a court of law.

Differently from logicians, I tend to believe that the simply true facts are not frequent outside of axiomatic domains (and I can just hear Gödel muttering that even in axiomatic universes the issue is far from solved...)

Gödel used the classical notion of truth, but showed it cannot be fully captured by any consistent formal system for arithmetic. But he meant 'true' in the simply-true sense.

Basically ANY statement you can express is temporally and geographically bound, up to and including "The Sun rises in the East and sets in the West", which only applies to Earth and to the last several billion years. Simply true facts are the exceptions rather than the norm

Wrong. Most data that has been recorded in just about any permanent medium consists of simple facts. Your bank records, for example. Of course the amount in your bank account changes with time, but it is also recorded with times attached, so that the entire fact has three (probably more) components, and THAT fact - that the balance in Fabio Vitali's account (1) at 3:40am GMT on the 25th of November 2020 (2) was such-and-such (3) - is simply true. That was my point: when one adds the required "contextual" information into the fact so as to make it non-indexical, it becomes a simple fact.

, and any representation that excludes temporal and geographical constraints from statements is justified by simplification, laziness or irrelevance, not by logic.

? perhaps we agree. That was exactly my point: that assertions which are time- or place-dependent should include time or place information into their statement, so that they are then no longer so dependent (and are simply true). I think we agree on this, but disagree only on how best to do it.


, but it was hard enough for many to grasp
the simplicity of describing everything with SPO triples that
it took years for many to realize that few descriptions were
eternally accurate.)

In what sense? Yes, if you mean that we can discover that we were wrong and that the historical record may need to be corrected, updated (though this is a pretty rare occurrence for most data). No, if you mean that all assertions are somehow time-dependent in the way that tensed language is.

I disagree. In fact, you are assuming two things in these sentences, and I disagree with both.

2) "No, if you mean that all assertions are somehow time-dependent"

This whole discussion is about accepting that some, or many (or, I believe, most) assertions "are somehow time-dependent". Our usual :a :marriedTo :b is not an absolute fact, and can only be used wrongly if used in a time-independent fashion: for the majority of the past history of the universe, i.e., before their wedding, :a was NOT married to :b, and for the majority of the remaining history of the universe, i.e., from their divorce onwards, they will also NOT be married. Ignoring it generates problems (how do you differentiate bigamy from remarriage [*]?).

Of course, when you are talking about time-dependent relationships or properties, you need to include temporal information, as the proposition is incomplete without it. So 'A is married to B' is not really a proposition: it has (as I tried to explain) a hidden indexical. If you make it into a complete proposition by including the missing time reference, then it becomes something worth trying to assert and record.

BTW, it is a proposition if you understand it to mean "A was married to B at some time" or, indeed, "A is married to B <now>", provided we replace the indexical '<now>' with an actual time-and-date in some recognized coordinate system for referring to times, but neither of these is expressed by a simple triple, by itself.

This applies to basically everything.

Well no, not to EVERYTHING. There are lots of facts (the atomic weight of lead, arithmetic, much historical data, most contents of current databases of population and personal data, engineering data about materials and fabricated parts, astronomical data such as planetary orbits, commonsense facts such as that water is wet, etc.) that do not vary with time in this way. But to many facts about the human everyday world, yes.


1) "Yes, if you mean that we can discover that we were wrong and that the historical record may need to be corrected"

You seem to imply that all statements are either right or wrong, and that scholars express everything as true facts until new evidence generates a new truth and the old one is proven wrong. I am afraid that this is not how real scholars work: in most of the fields I deal with, certainties are rare, and it is customary for scholars to express multiple competing statements as possible at the same time, and sometimes to advance a personal preference for one of them.

This is muddled. Of course there is extended scholarly debate and, at the edges of our knowledge in any field, disagreements about what is true, expressions of doubt or confidence, debates about evidence and so forth. But all this debate is, ultimately, about what is in fact the case: and that means, which propositions are in fact true. The debates are not (with some exceptions, perhaps, in Continental political theory, theology and quantum physics) about the nature of truth itself, but about what we know to be true. Not only do they not undermine the notion of truth, they depend on it and utilize it.

There is a clear and evident need to be able to express not just the facts that are (momentarily) considered true, but also the competing ones that we still recognise (momentarily) as false, yet possible and/or reported in literature. Preventing the representation of (momentarily) rejected statements distorts and limits the correct expression of what we know about any field of human knowledge.

Sure, but we are here talking about simple factual data, the kind of stuff that appears in almanacs. Not scholarly debate, which would hardly fit into RDF expressivity in any case.

Are you proposing that scholars use RDF to express all this? That would, I suggest, be a mistake. RDF was not designed to support such nuanced debates. (I might add, not a single commentator suggested that RDF should be so designed during the total of six years of activity by the RDF WGs, when public comments were invited.)

I have helped develop KR systems for use by the intelligence community, who also need to handle uncertain and possibly conflicting reports, and deal with highly incomplete data, and also with fake data used with hostile intent, to mislead. And (unlike most scholars) they really do use sophisticated data handling software to help them. But they do also have a robust notion of truth, because trying to determine what actually happened is often the entire point of the exercise. (And they use far more expressive notations, along the lines of full first-order logic with attached metadata, so n-ary relationships and bnodes do not give them nightmares.)


[*] ok, ok. Hardcore Catholics don't actually differentiate them...

So, the second idea is, you introduce a single 'thing' (variously called an event, a situation, a circumstance, a happening, a history, a process, a proposition, a fact, depending on who you think invented it) that all these 'arguments' are related to by binary relations (sometimes called facets or aspects or cases, if you come to this from linguistics). For example, this idea was developed for use by military intelligence applications, where the core 'aspects' are the five W's: Who, What, When, Where and Why. This approach gets you some nice side benefits, apart from its flexibility, because these 'things' can have other properties. In the military intelligence case, for example, they can be classified into various categories of interest or relevance to some strategic goal. If we are interested in legal issues, they can be related to whatever regulations they violate, and so on.

An n-ary relationship is just ONE way to represent more complex and time- and location-dependent facts. I think that n-ary relationships have their own share of issues and limits:

I agree. I believe, if you read my previous message carefully, that I listed three ways and that was one of them. The above paragraph describes the second way.


1) You have to invent a pseudo-entity which becomes the hub of many binary relationships, proliferating the number of entities and classes we create exactly because the model is too simple. Just for persons, you must invent birthEvents, deathEvents, weddingEvents, divorceEvents, jobState, schoolingState, etc.

I do not regard events as PSEUDO entities, myself. In fact, the world is pretty much comprised of events, in a suitably broad sense. But in any case, you have to have all these things as binary relations each with preferred collections of attached metadata expressed by other binary relations. It's a mess for both of us.


2) You lose the direct connection between the original subject and the original object. You switch from

:RichardBurton :marriedTo :LizTaylor

to

:m1 a :Marriage;
 :groom :RichardBurton;
 :bride :LizTaylor;
 :start "1963"^^xsd:gYear;
 :end "1974"^^xsd:gYear.

and suddenly there is no longer a direct connection between :RichardBurton and :LizTaylor:

SELECT ?p WHERE {
:RichardBurton ?p :LizTaylor .
}

returns empty. Not nice.

Well, true. But if you are expecting that second kind of data format, then you would write a different query

SELECT ?e WHERE {
?e ?p :RichardBurton .
?e ?q :LizTaylor .
}

And then you might discover even more things about the ways their lives intersected, by the way.
Or, if you were interested in who Maria Callas had married and when, for example, you could query

SELECT ?who ?time WHERE {
?e ?p :MariaCallas .
?e a :Marriage .
?e ?q ?who .
?e :when ?time .
}



3) You have to decide whether to represent the temporally-bound states (a marriageState, an employmentState, a political term, a life) or the boundary events around them (weddingEvent, divorceEvent, hiringEvent, firingEvent, birthEvent, the deathEvent, etc.) and there are no guidelines and in fact we would often use both with no homogeneity or justification (why do we often use terms or reigns, which are States, for politicians and kings, but use births and deaths, which are Events, for human lives, and should we use weddings/divorces or marriages?)

I agree that this (indeed any) more complicated way of expressing truth needs to have a certain accepted discipline about it, in order to be usefully deployed in a Web (indeed, any large) setting. And that this is a major issue for the semantic web (or whatever the currently accepted term is). But this is going to be an issue however we do it. It seems to me that accepting a basic (forgive the word) ontology of time is the best way to do this. It does not need to be complicated. (There are time intervals and time points. Each interval has, and is uniquely determined by, the points at its ends, called respectively the start or beginning, and the finish or end. The time of an event is either an interval or a point. We could invent a datatype for these things. It may already have been done. Material things can be treated as intervals (their 'lifetime') so their start is variously called their creation, manufacture or birth. And so on, fairly obvious stuff.)
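A minimal sketch of the interval-and-points idea described above, assuming the W3C Time Ontology (OWL-Time) vocabulary is an acceptable stand-in for the "basic ontology of time". The example.org namespace and the :Marriage class are illustrative, and the years are the ones used in the marriage example earlier in this thread:

```turtle
@prefix :     <http://example.org/> .
@prefix time: <http://www.w3.org/2006/time#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# The marriage is a thing that has a time; that time is an interval,
# and the interval is determined by the two points at its ends.
:m1 a :Marriage ;
    time:hasTime [
        a time:Interval ;
        time:hasBeginning [ a time:Instant ; time:inXSDgYear "1963"^^xsd:gYear ] ;
        time:hasEnd       [ a time:Instant ; time:inXSDgYear "1974"^^xsd:gYear ]
    ] .
```

Nothing here requires a change to RDF itself: the interval and its end points are ordinary (blank) nodes related by ordinary binary properties.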


4) Sometimes we create classes, sometimes we create n-ary relationships, and it is not clear why. For instance, Wikidata uses classes for geographical constraints, and n-ary relationships for temporal constraints: for instance, in order to say that Gustav III was the king of Sweden between 1771 and 1792, https://www.wikidata.org/wiki/Q52930 defines a class "Monarch of Sweden", which is a subclass of "Monarch" limited to the country "Sweden", and creates a Statement (an n-ary relationship) whereby the position held by Gustav III as Monarch of Sweden is limited between the dates 1771 and 1792. Why the invention of the subclass "Monarch of Sweden" when we already had the temporally limited Statement?

I will entirely agree that Wikidata is a bit of a mess about this issue, and indeed more generally. But as to your last question, I could ask it the other way around: why introduce a temporal modification to the statement when there is an entity already present which would naturally have temporal information attached? No modification needed to RDF, just some more linked data.

BTW, I think I know the answer to both questions: because the data was added by two different people (or by software written by two…) who had different ideas of how to represent temporal qualifications to data. Sigh.


But there is a third idea, which is to treat some smallish subset of the possible arguments as actual arguments, forming a kind of core fact, and the others as meta-assertions ABOUT this core fact. The simplest possible version of this is the core being a single RDF triple, and anything else is about this triple. There are many issues and problems with this approach, though.


First, it is basically wrong: these extra arguments or modifications are not 'about' the triple (unlike, say, provenance information). (They might be 'about' the fact asserted by the triple, but that 'fact' is not the triple itself, which is a syntactic entity. In fact, the thing it is about is probably one of those event-things.)

You are assuming that the n-ary relationship exists in some abstract sense, while the triple does not

No, I am not assuming that. Of course the triple exists.

: that the temporal boundaries can only be a property of the Marriage, and never a constraint on the truth of the statement.

I am not assuming this, I am suggesting it. And my reason is that most frameworks (logics, databases, RDF graphs, etc.) are built on the assumption that a fully expressed assertion is true, a simple fact, and that any metadata is information ABOUT that record, such as who recorded it, when, where it came from, what evidence there is for it, and so on. But when asserted, it is thereby claimed to be true, and metadata ABOUT it does not change that assertion. (It might alter confidence in it, etc., but that is a whole other layer of reasoning on top of the basic representation of data.)

I think that the opposite view holds just as well: the triple exists absolutely in an indeterminate state, neither true nor false, and becomes true within a certain temporal or geographical constraint.

So truth itself is time- and location-dependent? But no, it isn't. That is an illusion created by the way that human language is often used, and probably evolved, to make indexical assertions in a 'presentist' sense, to talk about the immediate circumstance of the utterance: the here and now. Yes, natural languages are like this, which is why they use tenses to speak of the past and future. But data 'languages', the stuff in database tables and Wikidata, are not. Or at any rate, should not be.

If you insist on thinking indexically, and require data to be recorded in notations which reflect natural language to this extent, then you really should give this idea some flesh by writing a more precise (I will not say formal) semantics based on this idea, before trying to impose it on the world of linked data. You will not finish up with RDF, though, and it will not be trivial to do. To do this you will, at a minimum, need to get very precise about what times and locations actually are. And you will have to get very tricky indeed about how to treat assertions about times and places. (If a temporally 'located' sentence is asserted 'at' a time, but another assertion is made about that time at a different time, what does their conjunction mean? What if something is true at a time but is queried at a time /inside/ that time? Who does the part-of reasoning?) How will you represent the truth-conditions for such things as "X happened last year in Marienbad"? And why stop at times and locations? Some (many? all?) things that happen have other qualities that may be important. Was it legal? Unusual? Singular? Did it happen in a /manner/ worthy of note? What caused it, and what consequences did it have? Were other agents involved, and if so, how? Etc. Either these are incorporated into this new logic or not. But if not, how can such other aspects of truth be encoded, if there are no events to predicate them of?

You said 'the triple', I note, not 'the fact'. But if we insist that each triple expresses a simple fact, so it can be asserted without qualification or decoration (and as required by the RDF semantics, which is, of course, normative…), then we /must/ introduce these other entities. There is simply no other way to remove all the implicit indexicality. If marriage is time-dependent, then :RichardBurton :marriedTo :LizTaylor . is just a mistake: it is not a complete statement, and cannot be said to be either true or false by itself. But if it gets asserted, it is required to be true.

It has nothing to do with the entity Marriage (which does not exist outside of our minds

I profoundly disagree with that claim. Being married has legal consequences, for example.

), but with the condition for the truth of the statement (which are just as arbitrary and abstract as the concept of Marriage).

That does not shock me.

It does not shock me, but I think it is a mis-use of the notion of 'truth'.

Remarkably, you do accept that provenance should be about the truth of the triple and not about the marriage.

No, provenance is about the triple as a syntactic object (actually about the assertion, which might span several triples), not about its truth. It might have consequences for our decisions as to its truth, but it is not directly about that.

So you are fine with three different modes to express various types of annotations about simple binary relationships: converting it to an n-ary relationship for some annotations, inventing a hierarchy of variously constrained classes for other ones, quoting it for yet other ones. That confuses me. I

I did not say I was fine with them all, just noting that they have all been suggested at various times as ways of dealing with the issue. I do have a preference, which should be fairly clear by now.

On the contrary, truth conditions about the triple provide a uniform model for temporal, geographical, provenance, confidence constraints, that all use a similar pattern to provide truth conditions for a very minimal binary relationship:

<<  :RichardBurton :marriedTo :LizTaylor  >>
 :startDate "1963"^^xsd:gYear;
 :endDate "1974"^^xsd:gYear;
 :accordingTo :wikipedia;
 :confidence 1.0.

<<  :GustavIII :positionHeld :monarch  >>
 :startDate "1771"^^xsd:gYear;
 :endDate "1792"^^xsd:gYear;
 :for :Sweden;
 :accordingTo :wikipedia;
 :confidence 1.0.

What does  :X :for :Sweden .  mean? (Suppose :X is a triple about a marriage, for example?) And whatever it meant, was THAT also according to Wikipedia? This illustrates the muddle between data and metadata that I mentioned in my last email.

But in any case, as others have noted, this does not allow for repeated or interrupted states of affairs, such as the several marriages of Dick and Liz, or a judge's status while in recusal.
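To make the repeated-state problem concrete, here is a hedged sketch of how the event-style pattern would keep two marriages of the same couple apart. It reuses the :groom, :bride, :start and :end names from the earlier example; the example.org namespace is an assumption, the first pair of years is simply the one quoted earlier in this thread, and the second pair is illustrative:

```turtle
@prefix :    <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# First marriage: years as given in the example earlier in this thread.
:m1 a :Marriage ;
    :groom :RichardBurton ;
    :bride :LizTaylor ;
    :start "1963"^^xsd:gYear ;
    :end   "1974"^^xsd:gYear .

# Second marriage: a separate node, so its start and end stay paired
# with each other and cannot be confused with those of :m1.
:m2 a :Marriage ;
    :groom :RichardBurton ;
    :bride :LizTaylor ;
    :start "1975"^^xsd:gYear ;
    :end   "1976"^^xsd:gYear .
```

With annotations hung directly on << :RichardBurton :marriedTo :LizTaylor >>, all four years would attach to the same quoted triple, and nothing would record which start goes with which end.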


No invented pseudo-entities, no need to choose between states (Marriage) and events (Weddings), no difference of approach between different types of annotations, a direct connection still exists between your original subject and your original object and can be queried in a reasonably simple manner:

SELECT ?p WHERE {
{
:RichardBurton ?p :LizTaylor .
}
UNION
{
<< :RichardBurton ?p :LizTaylor>> ?constraintType ?constraintSource  .
}
}

which correctly binds ?p to :marriedTo, as wished.

But suppose that the querier is intending to ask, are they married NOW? Will they assume that the simple triple means that they are? (What else could it mean?)


As a final note, in many cases we have incomplete or partial temporal records. For instance, we do not know exactly when Leonardo actually painted the Mona Lisa: we only know that it was already painted when he moved to Amboise, in 1516. If we used events, then the creationEvent for Mona Lisa is incomplete and lacking an actual date, which somehow misses the actual point for Events.

Not at all. Of course we may have partial information about anything, indeed this is the normal case. That does not make something "incomplete". But…

If we use temporal constraints for a simple triple, this can be made correct and true even in the presence of incomplete information:

<< :monaLisa dc:creator :leonardo >>
:startDate "1516"^^xsd:gYear.

The temporal constraint is about the truth of the triple, and not about a creationEvent which I did not use, and is therefore true.

… is it? What does :startDate mean? Suppose we discover that Leonardo actually painted it in 1513. Surely, in that case, this assertion about start date would be /wrong/. But this discovery should be consistent with the facts we currently have (and which your RDF should express). The right way to say this is, the start date (of the triple's being true, or of the start time of the life of the painting) is some date /earlier/ than 1516, which requires a bnode and some explicit temporal relations. As often with issues of describing time, simple tricks (usually trying to use a non-tensed language as though it was a tensed, presentist language) just don't work. You have to get it right.
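A rough sketch of the "bnode plus explicit temporal relation" formulation just described. The properties :creationTime and :before are hypothetical names invented for this illustration (they are not taken from any vocabulary mentioned in the thread), and the example.org namespace is likewise assumed:

```turtle
@prefix :    <http://example.org/> .
@prefix dc:  <http://purl.org/dc/elements/1.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# The painting was created by Leonardo at some time, not otherwise identified.
:monaLisa dc:creator :leonardo ;
    :creationTime _:t .

# All that is currently known about that time: it precedes 1516.
_:t :before "1516"^^xsd:gYear .
```

A later discovery that the painting dates from 1513 would then add information about _:t rather than contradict anything already asserted.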

But ignoring metaphysics, it is awkward because the 'meta' information changes or modifies the truthvalue of the basic assertion (the plain meaning of the triple), which fucks with the logic.

But why should it fuck with the logic? These triples are NOT TRUE outside their temporal and geographical boundaries. Pretending they are true fucks with the logic.

Because the semantics of truth they are required to conform to (and which is used by virtually all reasoners ever created, not just for RDF) does not recognize this kind of 'indexed truth' where a sentence is true in some times/places and false in others. And it is not simple to invent a semantic theory and/or a reasoning system based on it which does.

I have actually tried this, by the way, so I speak from experience. See – but please do not cite – https://www.ihmc.us/users/phayes/Trickledown2004.pdf


Pat


I will let y'all draw your own conclusions about using RDF-star in this way. But please, please, do not think that RDF 1.1 has any kind of temporal indexicality built into its semantics. It doesn't.

I agree with your comments on indexicality, of course, and will not object to that.

Ciao

Fabio


--

Fabio Vitali                            Tiger got to hunt, bird got to fly,
Dept. of Computer Science        Man got to sit and wonder "Why, why, why?'
Univ. of Bologna  ITALY               Tiger got to sleep, bird got to land,
phone:  +39 051 2094872              Man got to tell himself he understand.
e-mail: fabio@cs.unibo.it         Kurt Vonnegut (1922-2007), "Cat's cradle"
http://vitali.web.cs.unibo.it/

Received on Monday, 3 January 2022 10:56:52 UTC