Re: summary un/asserted from Niklas Lindström on 2024-07-10 (public-rdf-star-wg@w3.org from July 2024)

From: Niklas Lindström <lindstream@gmail.com>
Date: Wed, 10 Jul 2024 21:15:18 +0200
To: Thomas Lörtsch <tl@rat.io>
Cc: RDF-star Working Group <public-rdf-star-wg@w3.org>, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Message-ID: <CADjV5jcOHcoXmN4HWRY7kweNcha_op1J1B8qF_+KkNVJPwKzkQ@mail.gmail.com>
On Wed, Jul 10, 2024 at 2:56 PM Thomas Lörtsch <tl@rat.io> wrote:
>
>
>
> > On 9. Jul 2024, at 20:59, Niklas Lindström <lindstream@gmail.com> wrote:
> >
> > Hi Thomas,
> >
> > First, I agree with everything Peter wrote in his (first) reply. (Though, regarding names and definitions, I might put it as: triples as objects (triple terms) simply denote relationships (so we can refer to them directly), whereas triples in graphs simply assert that they hold.)
> >
> > Again, you talk about "unasserted assertions", as if referring to a triple (term) is somehow affected if that is also asserted (or is not).
>
> That is the way the current proposal implements "unasserted assertions". Ask around in the WG if we support "unasserted assertions" and you will hear from many participants that "yes, we do", because the triple term occurrence, e.g. '<< :id | :s :p :o >>', is not asserted in the graph, i.e. does not entail the triple ':s :p :o'.

I'm really just noting that, while calling triple terms that happen to
be absent from the graph "unasserted" is informally fine, calling
assertions such as `_:x rdf:reifies <<(:s :p :o)>>` "unasserted
assertions" sounds contradictory.  It becomes hard to determine what
the "unasserted assertion" is: the triple using the triple term or the
triple term itself? If it is the triple term, is it "otherwise
asserted" if that triple is in the graph?

> > There is no such interaction a priori, so I don't see the problem with the reifier in "1/4". It appears totally abstract and does not relate to any notion of truth; it is not "about an unasserted assertion". It is simply relating to what `:s :p :o` means, i.e. the abstract relationship.
>
> So in your opinion we do not provide any support for unasserted assertions whatsoever? That would be a problem too, since CG and WG discussions about this topic always concluded in a desire/need to support such a construct.

I'm not opposed to what we have arrived at with triple terms as
objects. If a triple is asserted, i.e. in the graph, and it has a
triple term, i.e. another triple in the object position, that does not
imply the assertion of the latter triple. It may or may not also be
asserted in the graph. You might call that an unasserted triple if it
is actually not asserted in the graph, but that is an observation of
it in relation to the entire graph.
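
To make that observation concrete, here is a minimal Python sketch. It
assumes a graph is modeled as a plain set of 3-tuples and a triple term
as a nested 3-tuple in the object position; the helper names
`triple_terms` and `is_asserted` are hypothetical and not taken from any
spec or library.

```python
# Hypothetical model (an assumption for illustration only): a graph is a set
# of (s, p, o) tuples, and a triple term is a nested (s, p, o) tuple that
# appears in the object position of another triple.

def triple_terms(graph):
    """Return every triple term used as an object in the graph."""
    return {o for (_s, _p, o) in graph if isinstance(o, tuple)}

def is_asserted(graph, triple):
    """A triple is asserted only if it is itself an element of the graph."""
    return triple in graph

term = (":s", ":p", ":o")
g1 = {("_:x", "rdf:reifies", term)}   # refers to the triple, does not assert it
g2 = g1 | {term}                      # the same graph with the triple also asserted

print(is_asserted(g1, term))                  # False
print(is_asserted(g2, term))                  # True
print(triple_terms(g1) == triple_terms(g2))   # True
```

In this sketch, "unasserted" is only ever an answer to a question put to
the whole graph; the `rdf:reifies` triple is identical in both graphs.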

> Maybe the concept needs more definitional refinement, but I think the kinda freewheeling notion of "a statement that is described via its constituents subject, predicate and object, but not contained as a triple in the graph" is good enough (THE graph because we always speak only about one graph, not a combination thereof which is largely undefined in RDF)

I'm fairly sure I agree with the gist of what you say. But more
precision is needed for a spec. I'd call that a triple term. (Here you
call the triple a statement, but from the specs, "An RDF statement
expresses a simple logical relationship", so we're in the
interpretation with that wording. It's a delicate balance (and I've
erred here on occasion, for which I am sorry).)


> > If what you want is an opaque context, just use a literal with an RDF serialization.
>
> Referential transparency or opacity is an orthogonal issue and I’ve been careful to mention it only as such, if at all, in the discussion about unasserted assertions.

OK, good. It just seemed that you wanted to isolate the reified
relationship from its possible validity (holding in the world) when
you wrote: "we might want to document different viewpoints or
versions, graphs might be merged or updated". If you didn't, I think
we're on the same page here. And I just saw your reply to Enrico that
you don't think that `_:x rdf12:reifies <<(:s :p :o)>>` causes an
inconsistency in the graph if `:s :p :o` is also asserted. So I think
we're converging on meaning.


> > If it is about an entity in your domain which reifies falsehoods (like a Myth), it would itself be proven at odds with a world in which those relationships actually hold. That's pretty much the point of referring to relationships, so that you then have something to check if it's true or not. (And monotonicity is not violated here.)
>
> This is very abstract and you would have to show me how that relates to practical uses. I’m arguing about the brittleness of a very specific construct and some metaphysical considerations don’t take away from that brittleness.

It's about a difference such as between a :MarriageCertificate and a
:Marriage, both reifying an abstract :marriedTo relationship. Do you
consider one of them reifying and the other instantiating
(qualifying/mediating/implying)? Is the absence of this distinction in
rdf:reifies brittle?


> > Still, I may understand parts of your discomfort, in *some* cases.
>
> Of course only _some_ cases: saying that a construct is brittle and inherently ambiguous means exactly that it works _sometimes_, but not always. Otherwise it would just be broken. But working sometimes is sometimes an even more dangerous arrangement. We might content ourselves with warning users about the degree to which the current proposal supports "unasserted assertions", and where it doesn’t. Not easy, but possible. I wonder however if that is good enough, and if the more expressive approach that I propose doesn’t reap enough additional benefit, especially w.r.t. qualified relations, that it is worth the effort.

That is worth evaluating. I tried to with my example, but we can check
more cases and intuitions thereof.

> > I think you might sense a distinction in "nature" between kinds of reifiers which mediate what needs to be true for the reifier to "make sense" (such as a Purchase), and others which *only* refer to relationships, true or not (such as a Claim).
>
> No. It’s much less complicated: some statements I just don’t want to have as facts in the data, but I still want to annotate them. But I also want to be sure that adding that fact (because of/by some other reason, or circumstance, or context, or user) doesn’t overwrite the implication of my initial annotation to not endorse/state/assert/"fact-ualize" that statement.

That seems to correspond to what I meant. The rdf:reifies as a
predicate is strictly neutral, so no implication could be
"overwritten" (as in contradicted). And the type of the subject
reifier is enough to clarify what kind of reification is intended
(such as whether it makes sense (or "is real in the world") or not
depending on the truth of the reified relationship(s)).

I suspect that you are worried that if someone describes a
"Qualification" using the annotation shorthand, and a modification of
that RDF Source subsequently yields a graph without that triple being
asserted, then that makes the description of the reifier incomplete
from the original point of view. Or conversely, that an "Objection" is
placed as an annotation on something that is also asserted.

This *may* be an expectation of annotations, but I haven't had that
expectation. Nor do e.g. Wikidata pages "look" like that (see e.g.
[1]). They (a bit like my proposed revision of the syntax) nest also
untrusted/controversial statements, marked in red, under the
description of the subject, and have a variable notion of "preferred"
and "normal". Whether a triple is asserted or not is a prerogative of
who or what manages the graphs. Reifiers can *inform* that decision,
and the decision itself can be described as a reifier too. But while
you could automate that (e.g. through OWL), I don't think such
entailments belong on the fundamental, simple RDF level. An annotation
syntax is enough to cater for fundamental maintenance.

(See my example at the end of this mail for how we might explore this
more concretely.)

> > But there is no need to add this distinction in the reifying relationship. It is enough to specify the appropriate `rdf:type` of the reifier.
> >
> > If you have a simple relationship:
> >
> >     <Alice> :bought <SomeComputer> .
> >
> > And a reifier which `rdf:reifies` that:
> >
> >     <purchase1> a :Purchase ;
> >         :date "2014-12-15"^^xsd:date ;
> >         :seller <ComputerStore> ;
> >         :cost 2500 ;
> >         :currency :USD ;
> >         rdf:reifies <<( <Alice> :bought <SomeComputer> )>> .
> >
> > Then the only additional thing needed in RDF 1.2 is to ensure that a triple term entails something understandable by a reasoner, like a classic reification token:
> >
> >     <purchase1> rdf:reifies [ rdf:subject <Alice> ; rdf:predicate :bought ; rdf:object <SomeComputer> ] .
> >
> > (This has been suggested (as "T-entailment"?). I'm convinced that it is a critical requirement for interoperability (or some variant thereof, such as an rdf:Relationship subclass of rdf:Statement).)
>
> I think that is a variation of my proposal for a second property rdf12:instantiates, modulo that I define it as instantiation, not reification. Specifically, you need to express somewhere, somehow that "the only additional thing needed in RDF 1.2 is to ensure that a triple term entails something understandable by a reasoner, like a classic reification token" is actually triggered. That is what in my proposal differentiates rdf12:instantiates from rdf12:reifies, apart from the different semantics.

I believe the concept of reification has been overloaded by the old,
non-normative notion of it in RDF. In general (ER modeling), a
reification is a wider notion than e.g. relators (which specifically
mediate rather than generally reference). It includes them, N-ary
relationships, and qualifiers in Wikidata. It's somewhat different
from the notion of qualification if taken to mean subproperties (all
the way down to singleton properties, if one is inclined to use
those); which is a strict narrowing, a subset of the extension of a
property. But a reifier could imply such a more precise subproperty
too.

So even IF we do define something like rdf12:instantiates, I think it
is conceptually a subproperty of rdf:reifies. (Then I'd suggest
calling the narrower one something like rdf:implies or rdf:relatorOf,
to distance it from the LPG relationship instance notion (which like
singleton properties can't fully cover N-ary relationships, which
mediate more than one relationship, not placing one in the center).
But still, I think the type of the reifier can express any such intent
if needed, informally, and as shown using OWL, also formally.)

> However, the whole approach of your argumentation is somehow backwards: you start with an n-ary relation, and then derive the main relation from it. But I don’t see an embedded occurrence term anywhere. So I’m not sure I get what you want to prove, and how.

It starts by deriving the n-ary relation *from a reifier* having a
rdf:reifies relation to an "embedded occurrence". But since RDF
doesn't have triple terms yet, it starts by assuming the "T-entailed"
form (of a classic reification). Then the simple relationship is
derived back again from that, to go full circle. This shows that they
all work as (very) different starting points that can still converge
semantically, which is a major feature for practical integration and
interoperability. So what is shown is that an N-ary and a reifier can
be *the same*, and you can even entail the former from a typed "triple
annotation" (i.e. a reifier of one "too simple" relationship).
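
The round trip can be sketched in a few lines of Python. This is only an
illustration under stated assumptions, not the OWL axioms from my earlier
mail: the graph is a set of 3-tuples, a triple term is a nested tuple,
the token names (`_:t0`, ...) are made up, and `t_entail` and
`purchase_rules` are hypothetical hand-written rules standing in for
"T-entailment" and for the Purchase-specific chain axioms.

```python
from itertools import count

def t_entail(graph):
    """Expand each `rdf:reifies <triple term>` into a classic reification
    token (hypothetical stand-in for the suggested "T-entailment")."""
    tokens = {}          # one token per distinct triple term
    fresh = count()
    out = set()
    # Sort by repr so token numbering is deterministic.
    for s, p, o in sorted(graph, key=repr):
        if p == "rdf:reifies" and isinstance(o, tuple):
            if o not in tokens:
                tokens[o] = f"_:t{next(fresh)}"
            tok = tokens[o]
            out |= {
                (s, "rdf:reifies", tok),
                (tok, "rdf:subject", o[0]),
                (tok, "rdf:predicate", o[1]),
                (tok, "rdf:object", o[2]),
            }
        else:
            out.add((s, p, o))
    return out

def purchase_rules(graph):
    """Hand-written rule: a :Purchase reifying a :bought relationship gets
    :buyer and :item, and the simple :bought triple is derived from those."""
    out = set(graph)
    for r, p, tok in graph:
        if p != "rdf:reifies" or (r, "rdf:type", ":Purchase") not in graph:
            continue
        if (tok, "rdf:predicate", ":bought") not in graph:
            continue
        buyer = next(o for s, q, o in graph if s == tok and q == "rdf:subject")
        item = next(o for s, q, o in graph if s == tok and q == "rdf:object")
        out |= {(r, ":buyer", buyer), (r, ":item", item), (buyer, ":bought", item)}
    return out

g = {
    ("<purchase1>", "rdf:type", ":Purchase"),
    ("<purchase1>", "rdf:reifies", ("<Alice>", ":bought", "<SomeComputer>")),
}
closure = purchase_rules(t_entail(g))
print(("<purchase1>", ":buyer", "<Alice>") in closure)       # True
print(("<Alice>", ":bought", "<SomeComputer>") in closure)   # True
```

The rule fires only because the reifier is typed as a :Purchase; a
reifier of another type over the same triple term would leave the
`:bought` triple unasserted in this sketch.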


> > I think what bothers you is that it would reasonably make sense to say that in this case, given the intuitive meaning of a Purchase, the simple `:bought` relationship should also hold (i.e. that triple should be entailed). And I agree, for this specific case.
>
> That case is specific, but very common. It is at the heart of many use cases - by rule of thumb all use cases that are not concerned with administrative provenance. It is often called qualification and as such omnipresent in Wikidata, customary in LPG, and inherent in many blank node-ish forks of an RDF graph. Essentially everything that refines and adds more detail to the meaning of a statement.

Of course. Making it more concrete, in one of many ways.

> > Not for the general use of `rdf:reifies`.
>
> That case is not more general, it is different. It addresses the statement as a whole, as an entity in its own right. This is useful to express that this statement came from there, that you have fact checked it, that it should be published on Saturday, etc. Essentially everything that is external/orthogonal to the meaning of the statement.

Yes, they are very different. But I claim that they all still fit
within the notion of general reification.

> > Specifically then, we want to say that IF something is a :Purchase which rdf:reifies a :bought relationship between ?x and ?y, THEN ?x :bought ?y.
> >
> > And we don't need anything more for that. This is already possible using OWL.
>
> All those variants of "everything is already possible":
> - just use n-ary relations
> - just mint more specialized properties
> - just use OWL
> - just use named graphs
> - just use singleton properties
>
> But this WG is tasked with specifying a simpler approach, and specifically base it on RDF-star because that was regarded to be simple. So all those other ways don’t count. That’s why I propose a second property. I see no easier way to support two so distinct modes of expression as "asserted" and "unasserted" assertion, and I see everything else as even more involved.

Everything else is certainly more involved. I use OWL to see if I can
formally express these "intuitions" I might harbor for classes and
properties, and shorthands thereof. But with RDF-star we are certainly
about to add triple terms as a simple *starting point*--for describing
something more concrete behind simple, flat relationships, since those
sometimes prove too simple somewhere down the road. I want to ensure
that these roads can converge.

It is of course valuable to examine if the simple start is too simple or not.

> Your OWL construct below, if it withstands more scrutiny than I’m able to apply, may be very useful to precisely define the meaning of these two simple properties, but it can’t replace them.

I hope it can be useful either way. It might be enough, unless there
is a use case that clearly fails if these nuances aren't built into
core RDF. But building that in may also be a bit *too* expressive for
a simple start.

> On a more fundamental note,
>
> I’m not sure where all this expressivity that I’m championing leads to. The vision I support is one of primary relations with a lot of added expressivity (qualifying and reifying). That is very much in line with the spirit of Property Graphs and a refinement of the original graph metaphor of RDF, albeit pushing its limits considerably. It’s also in line with how Wikidata evolved on the simple graph model of RDF. I’m however not sure if this leads to easier interoperability.

These seem like important considerations, from both angles (too
inexpressive if left out, or harder to use right if added).

> The model you champion in your example above is increasingly common wherever RDF is used in expert settings, especially those that make use of OWL. There constructs get pretty complicated pretty quickly, some of them are a far cry from the original graph vision (basically just RDBMS with the added graph baggage) and "shortcut relations" are introduced to help users navigate the ensuing deeply entangled mess. Those shortcut relations essentially repeat the most essential relations as bare triples, without all the detail that the n-ary relations provide.

Yes, there is a distance between simple relationship reifiers with
some added sources or dates, and using OWL to entail a complete N-ary
relator design. I do think we all would like for those two ends not to
be cul-de-sacs though, model-wise (as they mostly are today). I am
hopeful that we're at a stage where the baseline already covers the
distance, and we're just debating details of the most minimal form
required, to check if that covers the basics. If so, we're still
converging on the whole (transparent, many-to-many).

> As I said in one of the last meetings, the only thing that provides me some kind of emotional comfort w.r.t. this WG is that RDF-star supports such shortcut relations well enough, no matter how hard we mess up, as they have very few requirements. Basically annotate the shortcut with where and what it is derived from, and you’re done. But I maintain that we can do much better. I fear that only practice will tell if it was worth it, because no one can really foresee how any additional modelling power will be used (and ease of use, easily accessible statement annotation without the need to indulge in cryptic OWL constructs, is a modelling power too). Good intuitions will certainly help, but who would have known for sure in 2014 that LPG would be as successful as they are. Or Wikidata, for that matter.

I sympathize with your concerns. I don't claim that you have to use
OWL to make your cases work (which can indeed be quite a puzzle). I
just try to show that the expressive power is there, if you need it,
based on something far simpler. (I think we're seeking something that
follows the rule of least power [2].)

As a test case, consider these four reifiers of an also true relationship:

    <Mary> :marriedTo <Mario> .

    _:r1 a :Marriage ;
        :startDate "2014-07-13" ;
        :endDate "2014-07-14" ;
        :at <LasVegasWeddingShack972> ;
        rdf:reifies <<( <Mary> :marriedTo <Mario> )>> .

    _:r2 a :MarriageCertificate ;
        :status :Annulled ;
        :dateIssued "2014-07-13" ;
        :dateAnulled "2014-07-14" ;
        :issuer <PastorPresley972> ;
        rdf:reifies <<( <Mary> :marriedTo <Mario> )>> .

    _:r3 a :Marriage ;
        :startDate "2014-10-01" ;
        :at <SanFransiscoCityHall> ;
        rdf:reifies <<( <Mary> :marriedTo <Mario> )>> .

    _:r4 a :MarriageCertificate ;
        :status :Valid ;
        :dateIssued "2014-10-01" ;
        :issuer <SanFransiscoCountyOffice> ;
        rdf:reifies <<( <Mary> :marriedTo <Mario> )>> .

Is it a problem that these could all syntactically shorten down to
multiple annotations on the one triple? Do you need different
predicates for the triple term in some of them for querying and
display to work? Would those have to be replaced if the marriage ended
(again), to consistently model the truth?
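
To explore the querying question, here is a small Python sketch. It
assumes the same simplified model as elsewhere in this thread (a graph
as a set of 3-tuples, a triple term as a nested tuple), keeps only the
types and statuses from the example, and uses a hypothetical helper
`reifiers_by_type`. It only shows that reifiers sharing one triple term
and one predicate can still be told apart by their type.

```python
from collections import defaultdict

TERM = ("<Mary>", ":marriedTo", "<Mario>")

# Reduced version of the four reifiers above (dates, places, issuers omitted).
graph = {
    TERM,  # the relationship is also asserted
    ("_:r1", "rdf:type", ":Marriage"),
    ("_:r1", "rdf:reifies", TERM),
    ("_:r2", "rdf:type", ":MarriageCertificate"),
    ("_:r2", ":status", ":Annulled"),
    ("_:r2", "rdf:reifies", TERM),
    ("_:r3", "rdf:type", ":Marriage"),
    ("_:r3", "rdf:reifies", TERM),
    ("_:r4", "rdf:type", ":MarriageCertificate"),
    ("_:r4", ":status", ":Valid"),
    ("_:r4", "rdf:reifies", TERM),
}

def reifiers_by_type(graph, term):
    """Group all reifiers of `term` by their rdf:type."""
    reifiers = {s for (s, p, o) in graph if p == "rdf:reifies" and o == term}
    grouped = defaultdict(set)
    for s, p, o in graph:
        if p == "rdf:type" and s in reifiers:
            grouped[o].add(s)
    return dict(grouped)

grouped = reifiers_by_type(graph, TERM)
for rdf_type in sorted(grouped):
    print(rdf_type, sorted(grouped[rdf_type]))
# :Marriage ['_:r1', '_:r3']
# :MarriageCertificate ['_:r2', '_:r4']
```

Whether that grouping is enough for display and maintenance, or whether
distinct predicates are needed, is exactly the open question.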

Best regards,
Niklas

[1]: <https://www.wikidata.org/wiki/Q30>
[2]: <https://en.wikipedia.org/wiki/Rule_of_least_power>





>
> Best,
> Thomas
>
>
> > We first need a "rolification" property for the :Purchase (that's just an existing "OWL trick", to be able to use class memberships in property chains):
> >
> >     :Purchase rdfs:subClassOf [ owl:onProperty _:RolifiedPurchase ; owl:hasSelf true ] .
> >
> > And a class for the :bought relationship, which is additionally tied to a "rolified" property:
> >
> >     _:BoughtRelationship owl:equivalentClass [ owl:onProperty rdf:predicate ;
> >                                                owl:hasValue :bought ] ;
> >         rdfs:subClassOf [ owl:onProperty _:RolifiedBoughtRelationship ; owl:hasSelf true ] .
> >
> > Then we can define subproperty chain axioms as needed:
> >
> >     _:boughtRelation rdfs:subPropertyOf rdf:reifies ;
> >         owl:propertyChainAxiom (_:RolifiedPurchase rdf:reifies _:RolifiedBoughtRelationship) .
> >
> >     :buyer rdfs:domain :Purchase ;
> >         owl:propertyChainAxiom (_:boughtRelation rdf:subject) .
> >
> >     :item rdfs:domain :Purchase ;
> >         owl:propertyChainAxiom (_:boughtRelation rdf:object) .
> >
> > From this it is entailed that `<purchase1>` is a full-fledged N-ary relationship:
> >
> >     <purchase1> a :Purchase ;
> >         :date "2014-12-15"^^xsd:date ;
> >         :buyer <Alice> ;
> >         :seller <ComputerStore> ;
> >         :item <SomeComputer> ;
> >         :cost 2500 ;
> >         :currency :USD .
> >
> > And from that, it is easy to define a chain axiom to the simple relationship:
> >
> >     :bought owl:propertyChainAxiom ( [ owl:inverseOf :buyer ] :item ) .
> >
> > So that this is entailed (asserted):
> >
> >     <Alice> :bought <SomeComputer> .
> >
> > This is something the used ontology can define, to clarify the "intuitions" on the Purchase class, and also mechanically aid in data interoperability. Not all reifiers (such as data provenance / editorial management) have this "nature", so it should really be up to how the type of the reifier is defined. Which the above proves is already possible.
> >
> > To see that this works, try it out in an OWL tool such as the OWL-RL reasoner, which is also online courtesy of the Finnish National Library at [1]. That example link uses the example data above, which I put in a gist at [2]. (In that gist there is also a version where the `:seller` is not directly asserted, but `<purchase1>`  instead also reifies `<Alice> :shoppedAt <ComputerStore>`, and rules are added so that the `:seller` and `:shoppedAt` triples are entailed.)
> >
> > Aside: I think it is imperative that reifiers like `<purchase1>` are recognized, as shown above, as standard N-ary relationships. A lot of the *seemingly* simple triple annotations we've seen appear to be of that kind, in "disguise" (just a piecemeal description of). And N-aries are a common kind of relationship reification *not* limited to the rather narrow "triple token" notion informally defined in RDF. Specifically, the Purchase class is a kind of reifier known as a relator [3], which mediates the buyer, item, and seller, date, cost, currency. Which I think at least Enrico has been saying all along.
> >
> > So with `rdf:reifies`, transparent triple terms as objects, and the addition of a simple "T-entailment", we appear to already have all pieces required to cater for the use cases we have deemed reasonable, including ones where reified relationships should also be entailed to hold. With ontology-powered systems, they might not need triple terms to begin with. But the annotation syntax is a practical convenience for lots of real world systems, where the above OWL usage is far from prevalent today, or where other concerns are more pressing. As shown above, semantically, OWL has what is needed to make all approaches interoperable. (And a lot more is available, such as cardinality restrictions).
> >
> > Best regards,
> > Niklas
> >
> > [1]: <https://www.ldf.fi/service/owl-rl-reasoner?text=&source_1=https%3A%2F%2Fgist.githubusercontent.com%2Fniklasl%2F69428b043be6f1d33fd45f89cbe52632%2Fraw%2Fb3352d9648dc7710d7128af6ac731464f54ba836%2Fpurchase-entailments.ttl&iformat=turtle&format=turtle&fullClosure=no&owlClosure=yes&rdfsClosure=yes&owlExtras=yes&axioms=no&daxioms=no>
> > [2]: <https://gist.github.com/niklasl/69428b043be6f1d33fd45f89cbe52632>
> > [3]: <https://ontouml.readthedocs.io/en/latest/classes/sortals/relator/index.html>
> >
> >
> >
> > On Tue, Jul 9, 2024 at 4:50 PM Thomas Lörtsch <tl@rat.io> wrote:
> > >
> > >
> > >
> > > Am 9. Juli 2024 16:07:25 MESZ schrieb "Peter F. Patel-Schneider" <pfpschneider@gmail.com>:
> > > >There are much better names than "Unasserted assertions", largely because it is unclear just what "assertion" means and later examples bias its meaning. I see two choices for "assertion" - "embedded triple" or "triple occurrence", where the difference is that there is precisely one embedded triple for a combination of subject, predicate, and object but there are multiple triple occurrences for a combination.
> > >
> > > Assertion, the way I try to define it, refers to a triple, or fact, in a graph. The unasserted assertion, as I said, equals the subject of an RDF standard reification quad. However, just speaking of "facts/triples" and "statements" would most probably not convey the difference in meaning that this discussion centers around.
> > >
> > > >PROBLEM 1:   This is just yet another example of the problem with the "seminal example".   If you want to be able to have separate views of a triple then you have to somehow be able to create separate resources that all somehow refer to the triple.  The working group has already what appears to be a widely-acceptable proposal in this area with both a special predicate - rdf:reifies - and shorthand syntax for Turtle.   So this problem is already solved.
> > >
> > > This problem is not solved, as IMO my extremely concise example below clearly shows. Please provide a counter example to prove your point.
> > >
> > > >PROBLEM 2:  This is just yet another complaint about the verbosity of RDF. RDF is indeed verbose (or lacking and thus requiring verbose encoding) for many things - node names, n-ary predicates, literals, lists, sets, alternatives, universals, disjunctions, modals, etc.  Adding some sort of special facility to the core of RDF to overcome this verbosity makes the core of RDF more complex, harder to implement, and probably harder to understand. So, yes, the verbosity of RDF is a problem but reducing verbosity comes at the decided cost of increased complexity.
> > >
> > > This is a discussion long resolved and the resolution was to set up this working group and have it define an easy way to make statements about statements. I think everybody is aware of the complexity that adds under the hood.
> > >
> > > >PROBLEM 2 and 3:  A way of getting around verbosity is shorthand syntax.  This has been in Turtle for quite some time, reducing the user-perceived verbosity of several aspects of RDF, particularly lists.   N-triples, of course, does not include shorthands but the fundamental design principle of N-triples is simplicity.  The working group has what appears to be a widely-acceptable proposed shorthand syntax for the common case of an asserted triple with qualifiers, covering by-and-large the LPG and Wikidata.   The shorthand syntax can also be used when querying, in many cases completely eliminating the need to see the expanded version.   There are some aspects of this shorthand that could be improved and there is a proposal already put forward to the working group on this.
> > >
> > > Problem 3 is not so much about syntax but about how reification is not the right approach to all those use cases that qualify assertions.
> > >
> > > >PROBLEM 4:  Turtle should be the only syntax that most users, particularly naive users, see.  Tools designed to produce RDF that users see should produce Turtle output and Turtle output that uses the shorthand syntax wherever possible.  Tools that perform differently are to be avoided.  But this is only a small part of the problem of producing readable output for any graph-based structure.  There are few tools that do this in a way that is even somewhat close to acceptable and no tools that I am aware of that do a great job for graphs of any size.  The problem is just very, very, very hard (and probably impossible due to the limited sensory and cognitive capabilities of humans). I would love to be proved wrong and have access to a tool that would provide a usable view on Wikidata for me, even just the ontology part.  In any case, lines of characters is the wrong format for just about all people to see RDF graphs of any size.
> > >
> > > The importance of N-Triples as the only syntax that really counts in implementation has been stressed in WG meetings many times. Streaming contexts were mentioned, as was exchange of data over the wire. I was even asked to provide examples for the Nested Named Graphs proposal in N-Triples instead of TriG. The different syntaxes should be equally expressive, or what seemed to be conveyed in a shorthand notation gets lost when transmitted over the wire.
> > >
> > > >In summary, I don't see anything in these problems that have either not been already addressed or are not a true problem at all.  That being the case, why should I bother to examine a proposal that doesn't address anything that has not already been addressed?   New proposals should build on already-existing work of the working group, using the deliberations that have already been performed and the nomenclature and solutions that have already been proposed, and comparing the new solutions to them.
> > >
> > > In summary you make claims that are not substantiated ("solved") and you argue that if only the world was organized differently, the problems I describe would go away. The latter is not going to happen, the former would need a concrete example to be plausible.
> > >
> > > I maintain that with the current proposal it is impossible to annotate an unasserted assertion (equal to the subject of an RDF reification quad) AND have that same statement as a fact in the same graph. However, that has repeatedly been claimed to be a use case that we need to address. So, imo we clearly have a problem. And basing the occurrence semantics on  reification instead of instantiation is another problem and my proposed solution solves both in combination.
> > >
> > > Best,
> > > Thomas
> > >
> > > >peter
> > > >
> > > >
> > > >On 7/9/24 06:43, Thomas Lörtsch wrote:
> > > >> Hi all,
> > > >>
> > > >> as promised in the last WG meeting almost two weeks ago this is a summary of the issues I see with "unasserted assertions" and a proposal of how to resolve them and, in the same stroke, some other problems as well.
> > > >>
> > > >>
> > > >>
> > > >> DEFINITION
> > > >>
> > > >> "Unasserted assertions" - for lack of a better name - means statements that are described (and probably annotated), but are not contained as facts in the graph. I.e. those statements are talked about, but not endorsed. RDF standard reification is a construct to such effect: the reification quad describes a statement, but it doesn’t entail it.
> > > >>
> > > >>
> > > >> PROBLEM 1/4
> > > >>
> > > >> Both CG and WG repeatedly and explicitly maintained that RDF-star needs to support "unasserted assertions". However, the way that RDF-star currently implements them is ambiguous, lossy and non-monotonic even. It relies completely on the absence of a fact from the graph. If however that fact is present, it is no longer possible to talk about it as referring to something unasserted.
> > > >>
> > > >> For example, we may want to document and comment on a statement without endorsing it. We write
> > > >>
> > > >>      << :s :p :o >> :a :b .
> > > >>
> > > >> If however that same statement is also to be part of the graph, for whatever reason, like so:
> > > >>
> > > >>      << :s :p :o >> :a :b .
> > > >>      :s :p :o .
> > > >>
> > > >> there is no way to express that the annotation is meant to refer to an unasserted statement.
> > > >> There are many situations in which this problem might occur: we might want to document different viewpoints or versions, graphs might be merged or updated, adding the fact, etc.
> > > >>
> > > >>
> > > >> PROBLEM 2/4
> > > >>
> > > >> As a way out of this problem I discussed to require another statement describing if an annotation is meant to annotate an unasserted assertion, like so:
> > > >>
> > > >>      << :s :p :o >> :a :b ;
> > > >>                     a rdf12:UnAssertedAssertion .
> > > >>      :s :p :o .
> > > >>
> > > >> However, this is not a valid solution, because a second problem runs even deeper: the meaning of triple terms is defined as reification. Because of those reification semantics a triple term is always unasserted. By consequence we would rather need to add a statement whenever we intend to annotate a statement that we actually assert, like so:
> > > >>
> > > >>      << :s :p :o >> :c :d ;
> > > >>                     a rdf12:AssertedAssertion .
> > > >>      :s :p :o .
> > > >>
> > > >> In practice this would add considerable load as most annotations will aim to annotate facts in the graph, not e.g. some unendorsed viewpoints. This would also require more effort when querying.
> > > >>
> > > >>
> > > >> PROBLEM 3/4
> > > >>
> > > >> On the plus side the approach in 2/4 could be considered to be very expressive as the actual fact would now be completely independent from any annotations, on asserted and unasserted assertions alike, adding a new degree of expressivity. However, for many use cases this separation is a problem rather than a feature, because they ask for a clear and solid connection between a fact and "its" annotation. All use cases of qualification fall in this category, e.g. Wikidata, LPG, and many more. OTOH, for use cases that aim to annotate statements with rather orthogonal aspects like provenance, reification is the right choice. So just changing the underlying semantics from one to the other is not a solution. We rather need both.
> > > >>
> > > >>
> > > >> PROBLEM 4/4
> > > >>
> > > >> The shorthand syntax by intuition provides a solid link between a stated fact and its annotation. However, no other syntax does that, so an unassuming user's intuition is betrayed when the data is serialized to e.g. N-Triples. This is a serious usability problem.
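> > > >>
> > > >> A sketch of what happens on serialization, assuming the expansion of the annotation shorthand as currently drafted (the blank node label _:r is arbitrary, and prefixed names stand in for the full IRIs that N-Triples requires):
> > > >>
> > > >>      # Turtle shorthand: fact and annotation look like one unit
> > > >>      :s :p :o {| :a :b |} .
> > > >>
> > > >>      # expanded form: three independent triples
> > > >>      :s :p :o .
> > > >>      _:r rdf:reifies <<( :s :p :o )>> .
> > > >>      _:r :a :b .
> > > >>
> > > >> Nothing in the expanded form ties the reifier _:r to the asserted triple other than the coincidence that both are present in the same graph.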
> > > >>
> > > >>
> > > >>
> > > >> PROPOSAL
> > > >>
> > > >> To properly support unasserted assertions, and to solve the semantics problems in the same stroke, let's bite the bullet and define two primitives instead of one:
> > > >> - an unasserted triple term occurrence with a semantics of reification
> > > >> - an asserted triple term occurrence with a semantics of instantiation
> > > >> Define those primitives and their semantics not in the abstract syntax, but via the two properties rdf12:reifies and rdf12:instantiates.
> > > >>
> > > >>
> > > >> Abstract Syntax:
> > > >>
> > > >> graph      ::= triple*
> > > >> triple     ::= subject predicate object
> > > >> subject    ::= iri | BlankNode
> > > >> predicate  ::= iri
> > > >> object     ::= iri | BlankNode | literal | tripleTerm
> > > >> tripleTerm ::= triple
> > > >>
> > > >>
> > > >> Properties:
> > > >>
> > > >> To provide the user facing triple term occurrences with their respective semantics (un-/asserted etc) we explicitly (and normatively) define two properties, with different semantics, to be used with abstract triple terms as their rdfs:range:
> > > >> - rdf12:reifies defines a reification of an abstract triple term
> > > >> - rdf12:instantiates defines an instantiation of an abstract triple term
> > > >>
> > > >> A reification via rdf12:reifies doesn’t assert the statement described by the triple term, it merely provides an identifier to refer to an occurrence (wherever, whenever) of it. IIUC that is exactly what we have now. A mapping to RDF 1.1 clarifies that:
> > > >>
> > > >>      :r_1 rdf12:reifies <<( :s :p :o )>> .
> > > >>
> > > >> in RDF 1.1 would be expressed as
> > > >>
> > > >>      :r_1 rdf12:reifies [
> > > >>          rdf:subject :s ;
> > > >>          rdf:predicate :p ;
> > > >>          rdf:object :o
> > > >>      ] .
> > > >>      rdf12:reifies rdfs:range rdf:Statement .    # axiomatic triple
> > > >>
> > > >> An instantiation via rdf12:instantiates OTOH does indeed assert the statement it annotates, in addition to providing an identifier to annotate that assertion. To make the connection between statement and annotation direct and solid, but not break the set semantics of RDF, the model and semantics mimic the singleton property approach. A mapping to RDF 1.1 clarifies that:
> > > >>
> > > >>     :i_1 rdf12:instantiates <<( :s :p :o )>> .
> > > >>
> > > >> in RDF 1.1 would be expressed as
> > > >>
> > > >>      :i_1 rdf12:instantiates [
> > > >>          :s :p_1 :o .
> > > >>          :p_1 rdf12:instantiatesProperty :p .
> > > >>      ]
> > > >>      :s :p :o .
> > > >>      rdf12:instantiatesProperty
> > > >>          rdfs:subPropertyOf rdf:type .           # axiomatic triple
> > > >>
> > > >>
> > > >> Macro:
> > > >>
> > > >> An instantiation always entails the triple term, ':s :p :o' in the above example. This is defined as a macro when mapping between concrete syntaxes.
> > > >>
> > > >>
> > > >> Concrete syntaxes:
> > > >>
> > > >> I see two possible approaches. One is to not change the currently defined syntaxes, but let the shorthand syntax express instantiation, and let the standard syntax express reification. That would align syntaxes as they are defined right now with the intuitions they support, but all modifications would happen "under the hood" - e.g.:
> > > >>
> > > >>      :s :p :o  {| :a :b |}      # rdf12:instantiates, asserted and annotated
> > > >>      << :s :p :o >> :d :e .     # rdf12:reifies, not asserted but annotated
> > > >>
> > > >> Another approach would be to let instantiation be expressed with double chevrons and reification with triple chevrons, e.g.:
> > > >>
> > > >>      << :s :p :o >> :a :b.      # rdf12:instantiates, asserted and annotated
> > > >>      <<< :s :p :o >>> :d :e .   # rdf12:reifies, not asserted but annotated
> > > >>
> > > >> The first approach, based on the current syntaxes, does more clearly disambiguate the two modes of expression, but it also adds more "unrest". The second approach stays closer to the original RDF* proposal and is more uniform. The second approach might enable a more usable query interface, e.g.:
> > > >>
> > > >>      ?s ?p ?o                   # asserted
> > > >>      << ?s ?p ?o >> ?a ?b       # asserted and annotated
> > > >>      <? ?s ?p ?o ?> ?a ?b       # asserted and optionally annotated
> > > >>      <<< ?s ?p ?o >>> ?a ?b     # unasserted and annotated
> > > >>      <<? ?s ?p ?o ?>> ?a ?b     # (asserted or unasserted) and annotated
> > > >>      <?? ?s ?p ?o ??> ?a ?b     # (asserted or unasserted) and optionally annotated
> > > >>
> > > >> Occurrence identifiers are omitted in all examples, as they don’t differ from the current proposal. Likewise the mapping to N-Triples doesn’t change, except for the introduction of a new property, rdf12:instantiates.
> > > >>
> > > >>
> > > >>
> > > >> DISCUSSION
> > > >>
> > > >>
> > > >> In principle:
> > > >> RDF is an Open World technology, designed to facilitate decentralized authoring and integration of data, and we can’t rely on the absence of statements to convey meaning (e.g. unassertedness or non-endorsement). Out-of-band arrangements in concrete applications may be more specific, but we cannot make any claims based on that (and "RDF-star supports unasserted assertions" would be such a claim).
> > > >> I’d like us to go the extra mile and adopt the above proposal: implement two different kinds of annotation primitives. This also opens the road to a future with more elaborate constructs like quoted versioning.
> > > >> I could also live with a scaled down reference to the support of unasserted assertions, along the lines of "you can emulate a surrogate support for unasserted assertions like this, but be aware that the construct easily breaks in practice if not tightly controlled".
> > > >> However, that would still not solve the other problem, namely that reification is not the right formalisation for most of our use cases.
> > > >>
> > > >> Properties:
> > > >> defining the semantics of triple term occurrences via the properties rdf12:reifies and rdf12:instantiates is a modification of the mechanism introduced by the TEP proposal in the RDF-star CG report.
> > > >>
> > > >> Instantiation:
> > > >> There are different names for the underlying concept. It can also be understood as a form of n-ary relation where the instance-type relationship is modeled via a blank node. So the term "instantiation" refers to its most theoretical aspect (which best mirrors "reification", and that’s why I chose it in this summary), "n-ary relation" would refer to the way of modelling it, and yet another term "qualification" would emphasize the meaning of the construct. "Singleton properties" is a term that I try to avoid because the proposal has been met with so much resistance, but it is a concrete implementation of the same concept. I provide a very singleton-property-like mapping to RDF 1.1 above, but slightly different mappings could provide better computational properties, e.g. letting the object refer to the singleton, resulting in better indexing and join performance.
> > > >> However, what is most important to me is that the qualifying annotation, by being attached to the instantiation, is unmistakably annotating a statement that is actually asserted in the graph. The only metaphysical baggage involved is the definition of what a type-instance relation is, and that should be uncontroversial.
> > > >> Defining the relation between annotated thing and annotating thing not as a subproperty-relation but as a type-instance-relation follows the idea that those annotated relations are not meant to be annotated any further (that too is possible, but not the norm). So they represent leaves of an inheritance tree rather than inner nodes. In OO tradition leaves are understood as instances, whereas inner nodes would be defined as sub-properties.
> > > >>
> > > >> Mappings:
> > > >> The mappings to RDF 1.1 are provided for two reasons:
> > > >> - clarify the meaning of new constructs in terms of well-known ones
> > > >> - provide a path to backwards compatibility.
> > > >> RDF-star implementations should not be required to support them on the implementation level.
> > > >>
> > > >> Many-to-many:
> > > >> The astute reader will have noticed that both mappings to RDF 1.1 are many-to-many. However, owing to the semantics inherent in instantiation, this is a kind of many-to-many relationship that only properly supports co-denoting statements. A common instantiator for :Car and :MotorizedVehicle makes sense, whereas for conceptually very different entities like :Car and :Person it mostly does not. Such a semantics remains true to the integration focus of RDF, but also caters to the more focused approach of LPG. Reification on the other hand rather refers to the statement(s) as a whole, as an entity in their own right. This caters well to use cases that explicitly don’t want to qualify statements but that try to keep a safe distance between statement and annotation, e.g. provenance and other orthogonal concerns.
> > > >>
> > > >> Querying:
> > > >> Do we currently allow querying for the abstract triple term <<( … )>> or do we plan/have to do so? If yes, that might be a better replacement for the last option above: <?? … ??>. In any case the above proposal is just a sketch and may not even be helpful. I’m not good at querying.
> > > >>
> > > >> Abstract triple:
> > > >> People can use the abstract triple term as object of whatever statements they like. They are on their own with such freewheeling usage, as no other semantics are defined in the spec than those of rdf12:reifies and rdf12:instantiates. However, this may lead to some fruitful experiments, e.g. with referentially opaque triple terms, and it stays in the spirit of RDF being a pretty open technology.
> > > >>
> > > >> Fragment identifiers:
> > > >> Reification defines a handle to address the whole triple as an object, instantiation rather defines a handle to address the predicate of the triple. Therefore an annotation on a reifier annotates the whole triple as an object, an annotation on an instantiator rather qualifies the relation itself. Neither arrangement is set in stone, and more specific properties to explicitly annotate and qualify the subject, predicate, object or whole triple (or a set thereof) may be defined. The RDF reification vocabulary might be reused to that end, but defining a new set of properties seems to be the safer approach. Such properties may be applied to reifications as well as instantiations (and there are use cases for both).
> > > >>
> > > >> Merging and temporal aspects:
> > > >> It was argued that the problems outlined above are not actual problems for us but related to issues outside of RDF-star, i.e. merging is not an issue for RDF-star and temporal aspects are not considered in RDF at all. I disagree on both counts: RDF is a technology focused on decentralized data integration. Such integration requires merges, and it leads to the addition of statements in existing graphs. In both cases the situation may arise that a statement that was annotated but not endorsed is added as an actual fact. Then what was meant to be unendorsed suddenly is endorsed. Problems with other use cases, like representing different viewpoints, notwithstanding.
> > > >>
> > > >>
> > > >>
> > > >> Best,
> > > >> Thomas
> > > >
> > >
>
Received on Wednesday, 10 July 2024 19:15:51 UTC