Re: multisets everywhere from Anthony Moretti on 2021-12-26 (public-rdf-star@w3.org from December 2021)

From: Anthony Moretti <anthony.moretti@gmail.com>
Date: Mon, 27 Dec 2021 00:12:49 +1030
To: "Patrick J. Hayes" <phayes@ihmc.org>
Cc: Ted Thibodeau Jr <tthibodeau@openlinksw.com>, "public-rdf-star@w3.org" <public-rdf-star@w3.org>
Message-ID: <CACusdfS0b7_H7bf1aWDJXuD0yWBiEx4FH=ank7rx9NtrELzTTQ@mail.gmail.com>
Earlier Fabio wrote:
>
> In particular, expressing statements as non-absolute (neither true nor
> false, but subject to external constraints such as temporal and/or
> geographical data) is extremely important, and extremely liberating. I like
> RDF-star exactly for this. I wish it was possible to do the same for named
> graphs.


If you think of graphs as just another type of statement, a compound
statement, the extra-positions idea could be applied in the same way at the
graph level:

*Simple statement:*
Subject Relation Object T1 T2 SpatialBound Certainty

*Compound statement:*
{
    Statement1 AND
    Statement2 AND
    Statement3
}.
    T1 T2 SpatialBound Certainty

For any graph to be valid the temporal and spatial bounds for its
constituent statements would need to be within the bounds of the compound
statement.
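
A minimal sketch of that validity check, assuming each statement carries a
closed time interval [T1, T2] and ignoring the spatial bound and certainty.
The names Bounds, Statement and is_valid_graph are hypothetical, chosen only
for illustration:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Bounds:
    """Closed temporal interval [t1, t2]; plain years keep the sketch simple."""
    t1: int
    t2: int

    def contains(self, other: "Bounds") -> bool:
        # True when `other` lies entirely inside this interval.
        return self.t1 <= other.t1 and other.t2 <= self.t2


@dataclass(frozen=True)
class Statement:
    subject: str
    relation: str
    obj: str
    bounds: Bounds


def is_valid_graph(statements: List[Statement], graph_bounds: Bounds) -> bool:
    # A compound statement is valid only if every constituent statement
    # falls within the bounds of the compound statement.
    return all(graph_bounds.contains(s.bounds) for s in statements)


marriage = Statement(":RichardB", ":marriedTo", ":LizT", Bounds(1964, 1974))
print(is_valid_graph([marriage], Bounds(1960, 1980)))  # bounds enclose 1964-1974
print(is_valid_graph([marriage], Bounds(1970, 1980)))  # 1964 falls outside
```

A spatial bound could be checked the same way with a containment test on
regions instead of intervals.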

Earlier Pat wrote:

> So, the second idea is, you introduce a single 'thing' (variously called
> an event, a situation, a circumstance, a happening, a history, a process, a
> proposition, a fact, depending on who you think invented it) that all these
> 'arguments' are related to by binary relations (sometimes called facets or
> aspects or cases, if you come to this from linguistics). For example, this
> idea was developed for use by military intelligence applications, where the
> core 'aspects' are the five W's: Who, What, When, Where and Why. This
> approach gets you some nice side benefits, apart from its flexibility,
> because these 'things' can have other properties. In the military
> intelligence case, for example, they can be classified into various
> categories of interest or relevance to some strategic goal. If we are
> interested in legal issues, they can be related to whatever regulations
> they violate, and so on.
>
> To make the discussion more muddled, the new-thing-plus-binary-links trick
> is also widely known as a way to reduce n-ary relations to combinations of
> binary relations, so some people think of the second way as 'really' being
> the same as the first way, just in a different notation.
>
>
> But there is a third idea, which is to treat some smallish subset of the
> possible arguments as actual arguments, forming a kind of core fact, and
> the others as meta-assertions ABOUT this core fact. The simplest possible
> version of this is the core being a single RDF triple, and anything else is
> about this triple. There are many issues and problems with this approach,
> though. First, it is basically wrong: these extra arguments or
> modifications are not 'about' the triple (unlike, say, provenance
> information). (They might be 'about' the fact asserted by the triple, but
> that 'fact' is not the triple itself, which is a syntactic entity. In fact,
> the thing it is about is probably one of those event-things.) But ignoring
> metaphysics, it is awkward because the 'meta' information changes or
> modifies the truth value of the basic assertion (the plain meaning of the
> triple), which fucks with the logic.
>

Hi Pat. I agree with basically your entire email. On the point above I've
thought about this previously and I feel like the second and third ideas
are actually the same, it's just that in the second idea the "core fact" is
implicit. IMO the implicit "core fact" in the second idea is a bounding
event that could be used as an explicit "core fact" in the third idea and
then both ideas would be saying the same thing.
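
As a rough sketch of what I mean, and only under assumptions of my own: the
second idea stores an event node with binary links, and a core triple can be
read back off that node. The vocabulary (:relation, :agent, :patient) and the
helper core_fact are hypothetical, not taken from any RDF or RDF-star
specification:

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Second idea: one event node, with every argument attached by a binary link.
event_triples: List[Triple] = [
    (":e1", ":relation", ":marriedTo"),
    (":e1", ":agent", ":RichardB"),
    (":e1", ":patient", ":LizT"),
    (":e1", ":startTime", "1964"),
    (":e1", ":endTime", "1974"),
]


def core_fact(triples: List[Triple], event: str) -> Tuple[Triple, Dict[str, str]]:
    """Split an event description into a core triple plus the remaining facets."""
    facets = {p: o for s, p, o in triples if s == event}
    # Make the implicit core fact explicit; whatever is left qualifies it.
    core = (facets.pop(":agent"), facets.pop(":relation"), facets.pop(":patient"))
    return core, facets


core, rest = core_fact(event_triples, ":e1")
print(core)
print(rest)
```

The split into "core" and "rest" is a choice made by the helper, which is the
sense in which the core fact is implicit in the second idea.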

> And as a practical matter, it makes it hard to keep data organized, by
> muddling up different kinds of information. Temporal database theory, for
> example, makes a sharp distinction between valid time (the time of the
> facts being true or of the event happening) and transaction time (the time
> when the data was entered or written), and has invented an entire
> methodology of not getting these muddled. But treating valid time as a meta
> assertion about the data is exactly this muddle.
>

I feel like the separation of referentially transparent and referentially
opaque fragments is a solution to this, so something like:

:RichardB :marriedTo :LizT
  [
    :startTime 1964
    :endTime 1974
  ]
  [
    :statedBy :Bob
    :statedIn :Wikipedia
    :recorded "2021-07-07"^^xsd:date
  ]

1964–1974 is the valid time, 2021-07-07 is the transaction time, and
they're in different parts of the statement.
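
The same separation can be sketched as a data structure. This is only an
illustration under my own assumptions: the class names ValidTime, Provenance
and QualifiedStatement, and the method holds_in, are hypothetical:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ValidTime:
    """When the fact holds in the world (the referentially transparent part)."""
    start: int
    end: int


@dataclass(frozen=True)
class Provenance:
    """Who stated it and when it was recorded (the referentially opaque part)."""
    stated_by: str
    stated_in: str
    recorded: date  # transaction time


@dataclass(frozen=True)
class QualifiedStatement:
    subject: str
    relation: str
    obj: str
    valid: ValidTime
    provenance: Provenance

    def holds_in(self, year: int) -> bool:
        # Only valid time decides whether the fact holds in a given year;
        # the transaction time is deliberately not consulted.
        return self.valid.start <= year <= self.valid.end


stmt = QualifiedStatement(
    ":RichardB", ":marriedTo", ":LizT",
    ValidTime(1964, 1974),
    Provenance(":Bob", ":Wikipedia", date(2021, 7, 7)),
)
print(stmt.holds_in(1970))  # inside the valid-time interval
print(stmt.holds_in(2021))  # the year it was recorded, not a year it held
```

Keeping the two kinds of time in separate types means a query about the
marriage cannot accidentally be answered from the recording date.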

And if time and space are given special treatment:

:RichardB :marriedTo :LizT 1964 1974
  []
  [
    :statedBy :Bob
    :statedIn :Wikipedia
    :recorded "2021-07-07"^^xsd:date
  ]

> One is to treat 'marriedTo' (etc.) as having more relational arguments, so
> the time-period it is asserted 'about' is part of the relationship. One
> problem with this was noted early on: how many arguments do you need?
> Consider (not my example): it happened in the kitchen, after midnight, on
> the ides of March, John did it, with a knife, quickly, with passion…where
> do you stop needing to add extra arguments? (He made a cheese sandwich, by
> the way.)
>

Theoretically you don't need to add more arguments; my argument is just a
pragmatic one based on our existence in time and space.

Regards
Anthony


On Sun, Dec 26, 2021 at 6:16 PM Patrick J. Hayes <phayes@ihmc.org> wrote:

> I have been trying to not get involved in this discussion, but some things
> just have to be corrected.
>
> On Dec 23, 2021, at 9:58 AM, Ted Thibodeau Jr <tthibodeau@openlinksw.com>
> wrote:
>
> On Dec 21, 2021, at 03:23 PM, Pierre-Antoine Champin <
> pierre-antoine.champin@ercim.eu> wrote:
>
>
> In RDF semantics (both the current standard and the proposed RDF-star), a
> triple is either true or false.
>
>
> Right. The semantics treats each triple S P O . as an atomic statement,
> with the logical form P(S, O), ie it asserts that the relationship P holds
> between the two things S and O. No mention of times or contexts or any way
> that truth is modified or made temporary.
>
>
> I believe this is the first time I've known anyone to suggest
> that an RDF triple could be (semantically known to be) false.
>
>
> Of course triples CAN be known to be false. In fact the RDF 1.1 semantics
> explicitly requires some triples to be false, for example
>
> :S :P "notAnInteger"^^xsd:integer .
>
> is false when XSD Integer is a recognized datatype. (
> https://www.w3.org/TR/rdf11-mt/#D_interpretations)
>
>
> How do you know whether a given triple is false?  Or, true?
>
>
> You know that when asserted (ie published as part of some data) then it is
> being CLAIMED to be true. That is what 'asserted' means.
>
>
> My understanding has been that the original conception of RDF
> was that it would only be used to record universal and eternal
> facts; in other words, everything encoded in RDF was universal
> and eternal truth.
>
>
> I think I know what you mean, but I (and other logicians and philosophers
> of language) would prefer to say "simply true", or "true *simpliciter*"
> if you want to sound fancy. Which means, just true (not, say, necessarily
> true or mathematically true or scientifically true, etc..) but true without
> some qualification or modification (possibly true, true now but maybe not
> tomorrow, somewhat true, conditionally true, etc..) The kind of 'true' that
> you swear to tell when you take an oath in a court of law.
>
>
> (This was an immediate problem, because we all hopefully know
> that description accuracy requires that those descriptions be
> changeable over time
>
>
> No, we did (and do) not know that. I do not believe this to be the case.
>
> , but it was hard enough for many to grasp
> the simplicity of describing everything with SPO triples that
> it took years for many to realize that few descriptions were
> eternally accurate.)
>
>
> In what sense? Yes, if you mean that we can discover that we were wrong
> and that the historical record may need to be corrected, updated (though
> this is a pretty rare occurrence for most data). No, if you mean that all
> assertions are somehow time-dependent in the way that tensed language is.
>
>
> On this basis, even though RDF officially and explicitly operates
> under the "Open World" assumption (where anything that is not
> stated is implied and should be inferred to be unknown)
>
>
> Not INFERRED to be unknown. Just not known, so no inferences should be
> drawn from such lack of information. But yes, this is an ideal that is
> often ignored in practice.
>
> , *some*
> unasserted values were in practice treated as if they had been
> asserted -- i.e., that once inscribed, a triple was now, had
> always been, and would always be, accurate.
>
>
> What triple? You are muddling two issues here: the timeless quality
> claimed for RDF assertion, and the open world assumption. These are not the
> same issue.
>
> Operating on this universal and eternal truth assumption, all
> graphs in the universe could be combined, and there would be no
> contradictions, and all queries should deliver results that are
> likewise universally and eternally true.
>
>
> Well, as true as the assertions were when they were made. RDF, like any
> logic, cannot guarantee the truth of what it is given as input. It can
> however guarantee validity, ie that it does not itself insert falsity into
> inferences.
>
> This belief has been problematic since RDF began, and it is
> likely to continue to be so for many years if not forever.
>
> In RDF 1.1, it was explicitly stated that any given graph must
> be treated as a snapshot of a universe, just a moment in time
>
>
> NO! I have no idea where you got this idea from, but it is completely and
> absolutely WRONG. There is no such notion of a 'snapshot' anywhere in RDF.
>
> (though still treated as if entirely true about that moment),
> and should only be blended (merged, unionized) with other graphs
> that described the same moment in time.
>
>
> The semantics does not support the idea of a graph describing a "moment in
> time".
>
> The only way to *know* whether any two Named Graphs were about
> the same moment in time is for those two Named Graphs to be
> explicitly described as such.  Often enough, even with this
> improvement, two observers who inscribed descriptions that
> were accurate from their perspective, included too few details
> about what made up their perspective for others to accurately
> determine which graphs were from that same perspective, and
> which were different.  (Just for discussion's sake, consider
> two people, one to the north and one to the south of a fire,
> describing that fire.  The wind was blowing west-to-east, so
> smoke could accurately be described as drifting east -- but
> the observers described it instead as drifting to the right
> in one case and to the left in the other -- and both were
> indeed accurate, but neither was *fully* accurate….)
>
>
> Good example. 'Right' and 'left' (at least used geographically), like
> 'now' and 'here', 'me' and 'you', are indexicals: their meaning depends on
> their context of use. Putting indexicals into data that is intended to be
> transmitted to another place, or stored for later re-use - in fact, into
> pretty much any data -  is a BAD IDEA. This has nothing particularly to do
> with RDF and triples: it is just a basic rule about how to record
> information so other people can use it. If you call 911 and they ask you
> where you are, it is unhelpful to say "here", though it is of course true.
>
> Unfortunately, apparently simple facts about the world can often have an
> implicit indexical (usually 'now', sometimes some form of 'here')
> incorporated into them by accident, as it were. As much discussed on this
> thread, any two-place relation (like 'is married to') which can change with
> time (or location) is in fact not the simple two-place relation it appears
> to be, so it should be encoded as something more complicated when properly
> used to store real-world data. All of this is kind of data engineering 101
> and has been known and discussed for at least a century.
>
> All of which is to say, "This is far more complex than it
> appears when we say 'S P O [G]' is all you need to describe
> anything!"
>
>
> What has also been known since before the Internet (actually since around
> 1890) is that these more complicated things that must be used to encode
> data can always be built up from a suitably woven graph of binary
> relational assertions, ie triples. So RDF is universal, in a sense, but not
> in a trivial sense.
>
> The ways of doing it are also well-known.
>
> One is to treat 'marriedTo' (etc.) as having more relational arguments, so
> the time-period it is asserted 'about' is part of the relationship. One
> problem with this was noted early on: how many arguments do you need?
> Consider (not my example): it happened in the kitchen, after midnight, on
> the ides of March, John did it, with a knife, quickly, with passion…where
> do you stop needing to add extra arguments? (He made a cheese sandwich, by
> the way.)
>
> So, the second idea is, you introduce a single 'thing' (variously called
> an event, a situation, a circumstance, a happening, a history, a process, a
> proposition, a fact, depending on who you think invented it) that all these
> 'arguments' are related to by binary relations (sometimes called facets or
> aspects or cases, if you come to this from linguistics). For example, this
> idea was developed for use by military intelligence applications, where the
> core 'aspects' are the five W's: Who, What, When, Where and Why. This
> approach gets you some nice side benefits, apart from its flexibility,
> because these 'things' can have other properties. In the military
> intelligence case, for example, they can be classified into various
> categories of interest or relevance to some strategic goal. If we are
> interested in legal issues, they can be related to whatever regulations
> they violate, and so on.
>
> To make the discussion more muddled, the new-thing-plus-binary-links trick
> is also widely known as a way to reduce n-ary relations to combinations of
> binary relations, so some people think of the second way as 'really' being
> the same as the first way, just in a different notation.
>
>
> But there is a third idea, which is to treat some smallish subset of the
> possible arguments as actual arguments, forming a kind of core fact, and
> the others as meta-assertions ABOUT this core fact. The simplest possible
> version of this is the core being a single RDF triple, and anything else is
> about this triple. There are many issues and problems with this approach,
> though. First, it is basically wrong: these extra arguments or
> modifications are not 'about' the triple (unlike, say, provenance
> information). (They might be 'about' the fact asserted by the triple, but
> that 'fact' is not the triple itself, which is a syntactic entity. In fact,
> the thing it is about is probably one of those event-things.) But ignoring
> metaphysics, it is awkward because the 'meta' information changes or
> modifies the truth value of the basic assertion (the plain meaning of the
> triple), which fucks with the logic. And as a practical matter, it makes it
> hard to keep data organized, by muddling up different kinds of information.
> Temporal database theory, for example, makes a sharp distinction between
> valid time (the time of the facts being true or of the event happening) and
> transaction time (the time when the data was entered or written), and has
> invented an entire methodology of not getting these muddled. But treating
> valid time as a meta assertion about the data is exactly this muddle.
>
> I will let y'all draw your own conclusions about using RDF-star in this
> way. But please, please, do not think that RDF 1.1 has any kind of temporal
> indexicality built into its semantics. It doesn't.
>
> Pat Hayes
>
>
> Be seeing you,
>
> Ted
>
>
>
>
>
Received on Sunday, 26 December 2021 13:43:17 UTC