Re: multisets everywhere from thomas lörtsch on 2021-12-21 (public-rdf-star@w3.org from December 2021)

From: thomas lörtsch <tl@rat.io>
Date: Tue, 21 Dec 2021 17:53:54 +0100
To: Fabio Vitali <fabio.vitali@unibo.it>
Cc: Anthony Moretti <anthony.moretti@gmail.com>, Laufer <carlos.laufer@gmail.com>, "public-rdf-star@w3.org" <public-rdf-star@w3.org>
Message-Id: <B01C79D9-74B0-4230-9511-D0FEE8CD77AF@rat.io>

Hi Fabio,

I wonder if your interpretation of quoted triples as, in your words, "non-absolute statements" is in line with the community report. Dörthe has expressed a similar view if I understood her right (but so far she hasn’t replied to my request for clarification). I always understood quoted triples as non-asserted, period. On the other hand I understood annotated statements as valid in accordance with their annotations. 

This is a problem with more than one dimension:
- between asserted and not asserted there are degrees of "asserted under certain conditions", e.g. the assertion of the marriage between Taylor and Burton being valid only in certain periods
- annotations may be understood as constraints or as additional information. For example, it might be my understanding that in life nothing is necessarily forever, including marriages. Under such a precondition, start and end dates are not a constraint but simply supplementary detail.

In many ways this seems to me like the problem of deciding if a glass of water is half-full or half-empty. It depends on expectations. 

The general assumption under which the semantic web operates is that information is not complete. One always has to account for the possibility of additional detail changing the meaning of what one knows already. I might have thought Taylor and Burton are still married, and suddenly an end date surfaces. Next thing is that I realize that they are both dead already. In another case I might even overlook that I’m dealing with imaginary persons from a novel. Never can I assume full knowledge, nor full detail. And never should I be forced to change the way I modelled something because of new knowledge. That is actually more an argument meant for Dörthe than for you but hopefully it helps clarify my view on this whole topic.

The modelling that you propose makes it harder to realize another goal, the one that I thought fuels the demand for "unasserted assertions": it makes it harder to state something that we don’t endorse. For example a few decades ago it was still not uncommon in Germany to say that "Hitler wasn’t all bad as he had for example built the Autobahn". How do I model that in RDF? Under no circumstances do I want to have a statement saying "Hitler wasn’t all bad" in my triple store. If quoted triples are strictly unasserted, then this is easy:

    << :Hitler :not :AllBad >> :because :Autobahn .

I might go even further, like:

   << << :Hitler :not :AllBad >> :because :Autobahn >> 
       :accordingTo :SomeEternallyYesterday .             [0]

The thing is: I don’t want the central statement - "Hitler not all bad" - ever to pop up in some unassuming query! How can I do that if, as in your interpretation, quoted statements are not unasserted but asserted under the condition of their annotation?
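
To spell out the strictly unasserted reading as I understand it from the community report, here is a minimal sketch (the empty prefix is bound to a made-up example namespace):

    PREFIX : <http://example.org/>

    # The only asserted triple; its subject is a quoted triple.
    << :Hitler :not :AllBad >> :because :Autobahn .

    # Not in the graph under this reading:
    #   :Hitler :not :AllBad .
    # A plain query pattern like  ?s :not ?o  therefore finds nothing.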

I think that outside of such extreme examples there is a gradual shift between asserted and unasserted: assertions may be conditionalized explicitly through annotations but also implicitly through context or through additional statements or through replacing one node in the triple with a blank node with its own annotations. But the general direction of the semantic web surely is that we offer and ingest data that we believe in, that we find useful, that we want to operate on. So the default assumption is: "this is true (hopefully) (check the small print)". I think that this would also be a useful default assumption for annotated statements. The statement

    :RichardB :marriedTo :ElizabethT .

says that there exists a :marriedTo relation between those two persons - nothing more, nothing less - and that is indeed true. That relation exists. History is real too. It has some properties, among them that it isn’t in effect today. That statement would be false if the relation had never existed, everything else is fair game. Additional detail

    :RichardB :marriedTo :ElizabethT .
    <<:RichardB :marriedTo :ElizabethT>> :start 1966 .

is of course always welcome :-)
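
If I read the community report correctly, the same two lines can also be written with its annotation shorthand, which asserts the triple and annotates it in one go. A sketch, with the empty prefix bound to a made-up example namespace:

    PREFIX : <http://example.org/>

    # Asserts the triple and attaches :start to the corresponding quoted triple.
    :RichardB :marriedTo :ElizabethT {| :start 1966 |} .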

Your style of modelling however says: by default no information can be trusted, we need to know more. Where does that stop? Why are the annotations not quoted themselves? Why is 

    :dihydrogen-monoxide rdfs:label "ice" 

a non-absolute statement, only conditionally asserted?

How do you _not_ say something?

And how is this compatible with all the data out there already? Do you expect everybody to transform their statements into quoted statements?


W.r.t. graphs: Pierre-Antoine modelled graphs as lists of quoted statements. OTOH: what holds you back from defining your own named graph semantics, annotating your named graphs accordingly and being done with it?
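
A rough sketch of the list idea, not Pierre-Antoine's actual vocabulary: the property :statements and the namespace IRIs for : and physical: are made up for illustration, the rest is taken from your example.

    PREFIX :         <http://example.org/>
    PREFIX physical: <http://example.org/physical#>
    PREFIX rdfs:     <http://www.w3.org/2000/01/rdf-schema#>

    # One node groups the quoted (unasserted) statements ...
    :water :statements (
            << :dihydrogen-monoxide :form :liquid >>
            << :dihydrogen-monoxide rdfs:label "water" >>
        ) ;
        # ... and carries the shared conditions only once.
        physical:lowTemp  :0Centigrade ;
        physical:highTemp :100Centigrade ;
        physical:pressure :1atm .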


Best,
Thomas


[0] That’s actually hard to translate. In German we say "ewig Gestrige" (literally "the eternally yesterday ones"). If :SomeEternallyYesterday doesn’t make sense then just replace it with :idiots.



> On 21.12.2021 at 11:12, Fabio Vitali <fabio.vitali@unibo.it> wrote:
> 
> Hello, 
> 
>> - Start time (assumption if blank: unbounded)
>> - End time (assumption if blank: unbounded)
>> - Location (assumption if blank: unbounded)
>> - Certainty (assumption if blank: 1.0)
>> 
> 
> In my research team we call them contexts, i.e., conditions that make true a non-absolute statement. We have identified at least seven contexts: 
> 
> - Temporal context (a temporal interval within which the statement is true);
> - Spatial context (or, better, jurisdictional context, which allows us to distinguish between, say, the Roman Empire, the Church State, the Italian Kingdom and the current Italian Republic, all of which share at least in part the same location);
> - Part-whole context (e.g. when recording facts about individual pages of an ancient manuscript, then creation date, author and ownership apply to the whole book, and not individual pages);
> - Object-subject context: e.g. when recording facts about a depiction, i.e. a painting or a photograph, being able to distinguish facts about the painting vs. about the subject of the painting (pretty tricky when you have a painting of a painting, or even a photograph of a painting of a painting, etc.);
> - Provenance context (when you have competing and reciprocally incompatible statements from different sources);
> - Confidence context (when you yourself are considering different and reciprocally incompatible statements with different degrees of confidence about their truth);
> - Physical context (see below for an example).
> 
> All these contexts are used to create assertions that have the non-absolute statement as subject, and express conditions for their truth. Thus for instance: 
> 
> <<:napoleon :role :emperor>>
>  temporal:start "1804-05-18"^^xsd:date;
>  temporal:end "1814-04-06"^^xsd:date;
>  jurisdiction:country :FirstFrenchEmpire; 
>  confidence:confidence "1.0". 
> 
> << :dihydrogen-monoxide :form :solid >>
>  physical:highTemp :0Centigrade. 
> 
> << :dihydrogen-monoxide rdfs:label "ice" >>
>  physical:highTemp :0Centigrade. 
> 
> << :dihydrogen-monoxide :form :liquid >>
>  physical:lowTemp  :0Centigrade; 
>  physical:highTemp :100Centigrade;
>  physical:pressure :1atm. 
> 
> << :dihydrogen-monoxide rdfs:label "water" >>
>  physical:lowTemp  :0Centigrade; 
>  physical:highTemp :100Centigrade; 
>  physical:pressure :1atm. 
> 
> << :dihydrogen-monoxide :form :gas >>
>  physical:lowTemp  :100Centigrade. 
> 
> << :dihydrogen-monoxide rdfs:label "steam" >>
>  physical:lowTemp  :100Centigrade.  
> 
> I find this approach much cleaner and easier to explain to domain experts than requiring them to create an instance of an n-ary relationship relying on some abstract concept, or to invent a new OWL class which is a subclass of some other class, etc. The list can be further and easily expanded to other contexts, if and when we find out we need them.
> 
> Having rdf-star statements available is extremely important because it allows us to clearly separate non-absolute statements (that are only true within a given context) from absolute statements (that do not need contexts to be true). And for this purpose, rdf-star is simply perfect: rdf-star triples are non-absolute statements, and plain RDF triples are absolute statements. 
> 
> My only problem, as you can see, is that sometimes we need to collect multiple individual statements and associate them to the same context. 
> 
> For instance, I want to associate both the form :liquid and the label "water" for the compound :dihydrogen-monoxide with the conditions "physical temperature between 0 and 100 degrees Centigrade and pressure 1 atmosphere". Right now I had to duplicate the conditions for each of the two non-absolute statements. 
> 
> I wish there was a construct in RDF that acts sort of like a... like a container of individual triples! This container could then become the subject of our contexts. Ideally such a container would allow us to distinguish non-absolute statements (that are only true within a given context) from absolute statements (that do not need contexts to be true). 
> 
> Oh wait: but one such structure exists in RDF 1.1, it is called named graph, and it provides everything that I need except the distinction between non-absolute and absolute statements! 
> 
> GRAPH :ice {
>  :dihydrogen-monoxide :form :solid .
>  :dihydrogen-monoxide rdfs:label "ice" .
> }
> 
> GRAPH :water {
>  :dihydrogen-monoxide :form :liquid.
>  :dihydrogen-monoxide rdfs:label "water".
> }
> 
> GRAPH :steam {
>  :dihydrogen-monoxide :form :gas.
>  :dihydrogen-monoxide rdfs:label "steam".
> }
> 
> :ice   physical:highTemp :0Centigrade. 
> :water physical:lowTemp  :0Centigrade; 
>        physical:highTemp :100Centigrade; 
>        physical:pressure :1atm. 
> :steam physical:lowTemp  :100Centigrade. 
> 
> The problem is that named graphs give me no distinction between containers of absolute statements and containers of non-absolute ones. How I wish there was a symmetry between individual triples (rdf-star vs. rdf triples) and named graphs...
> 
> Ciao
> 
> Fabio
> 
> --
> 
>> More generally, basic temporal logic says that the bounds on any event are the bounds for its subevents and the subevents can be explicitly bounded further. If statements represent relationships and relationships are events then the statement is a subevent of the existence event of both the subject and object, therefore any statement can leave those positions blank but still have bounds, temporal and spatial.
>> 
>> 
>>> The positions can be left blank if
>>> current assumptions are maintained so that would probably mean most
>>> statements can be left untouched, and if the assumptions are different for
>>> the entire graph they could be stated at the graph level.
>> 
>> I don't understand. If they can be left blank and consequently not asserted, how are they defaults? 
>> 
>> Any reasoner would assume "unbounded" if no values are provided.
>> 
>> 
>>> Better to start from a principled approach and then see how hard it has to
>>>> be tweaked to arrive at a practical solution, accommodate corner cases etc.
>>>> 
>>> 
>>> Feel like that's what I'm doing, haha.
>> 
>> If you propose to solve a problem that I describe as a very general one by some examples of seemingly common cases you narrow the scope. That narrowing has to be well understood. Maybe my perspective clouds my judgement but my feeling is that your proposal narrows the scope in quite ad hoc ways that might solve the problem for some special cases (and even there I have my doubts as mentioned above) but leaves a lot or most of them (even equally general ones like authorship) unresolved. 
>> 
>> If I'm understanding you correctly, I agree that a referentially opaque relation such as "statementOf" is still needed for provenance use cases etc., is that what you mean when you're talking about authorship? But the need for a referentially transparent relation, and the subsequent confusion that ensues, would be greatly reduced if statements could have start and end time positions.
>> 
>> It also addresses the multiset problem because statements with the same subject, object and relation but different start and end times etc. are different statements.
>> 
>> Regards
>> Anthony
>> 
> 
>
Received on Tuesday, 21 December 2021 16:54:14 UTC