- From: Laufer <carlos.laufer@gmail.com>
- Date: Mon, 20 Dec 2021 13:46:52 -0300
- To: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
- Cc: "public-rdf-star@w3.org" <public-rdf-star@w3.org>
- Message-ID: <CAFg8H7i9Nkx0jmJZASN=y_invDV3Hx_zV0eK9Wxs3WhgbR8VXQ@mail.gmail.com>
Thank you all for the responses.

Cheers

On Monday, 20 December 2021, Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu> wrote:

>
> On 20/12/2021 11:48, thomas lörtsch wrote:
>
>> Hi Laufer,
>>
>> singleton properties and RDF-star are both approaches to statement
>> annotation in RDF. There are more approaches, like RDF standard
>> reification, named graphs etc.
>>
>
> What Thomas said.
>
> Note also that a comparison of those approaches is given in the Lotico
> presentation on RDF-star that Olaf and I gave in March:
>
> https://www.youtube.com/watch?v=ZNfq12mdnsM&t=445s
>
> best
>
>> If you want to discuss the topic of how they (or some of them) compare I
>> suggest you open a new thread with that topic.
>>
>> Best,
>> Thomas
>>
>>
>> On 20.12.2021 at 04:17, Laufer <carlos.laufer@gmail.com> wrote:
>>>
>>> Hello, All,
>>>
>>> I was wondering how this discussion is related to the Singleton Property
>>> proposal [1].
>>>
>>> Cheers,
>>> Laufer
>>>
>>> [1] - Vinh Nguyen, Olivier Bodenreider, and Amit Sheth; "Don't Like
>>> Reification? Making Statements about Statements Using Singleton Property";
>>> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4350149/
>>>
>>> On Sunday, 19 December 2021, thomas lörtsch <tl@rat.io> wrote:
>>> tl;dr
>>> RDF semantics is based on sets and RDF-star builds on that. However
>>> RDF-star triple annotation has to deal with the practice of RDF, not its
>>> theoretical ideal. In RDF as practically employed, multisets, although not
>>> the norm, can appear almost everywhere. A design that ignores them by
>>> default but requires rewriting data and queries when they appear will not
>>> fare well in practice. The problem is inherent in the verbosity of the
>>> quoted triple identifier: it favors a syntax that is in almost all cases at
>>> least risky, if not outright wrong. The shortcut syntax might provide a way
>>> out of this dilemma.
>>>
>>>
>>> The following examples should illustrate that multisets have to be
>>> expected almost everywhere in RDF data. From now on I’m always assuming the
>>> standard use case where an actual assertion is annotated:
>>>
>>> #0 :Bob :bought :Car .
>>>    :RichardB :marriedTo :LizT .
>>>    :Alice :plays :Guitar .
>>>
>>>
>>> The CG report says that 'Alice said that Bob bought a car' should be
>>> modeled not as
>>>
>>> #1 <<:Bob :bought :Car>> :said :Alice .
>>>
>>> but as
>>>
>>> #2 [] :occurrenceOf <<:Bob :bought :Car>> ;
>>>       :said :Alice .
>>>
>>> because there might be other sources for the same statement. That’s
>>> always possible so it seems reasonable to always require the indirection of
>>> creating a proper occurrence identifier when annotating a statement with
>>> provenance.
>>>
>>>
>>> Likewise it was recently discussed that marriages between Richard Burton
>>> and Elizabeth Taylor should not be modeled as
>>>
>>> #3 <<:RichardB :marriedTo :LizT>> :start 1966 .
>>>
>>> but rather as
>>>
>>> #4 [] :occurrenceOf <<:RichardB :marriedTo :LizT>> ;
>>>       :start 1966 .
>>>
>>> because we know of that second marriage.
>>>
>>> But what if we didn’t? What if we had authored this in 1967, assuming
>>> that this marriage would last forever? Would we have chosen the more
>>> involved modelling style nonetheless? And if we did go with the succinct #3
>>> version - very probably, at least according to current thinking I assume -
>>> will we later, after their second marriage, have to change that to #4 style?
>>>
>>> What about querying? Say we are not sure if some statement occurs only
>>> once or multiple times: will we have to query for both modelling styles?
>>> Probably.
>>>
>>>
>>> While the first example could be categorized as describing a speech act
>>> and the second example might be considered instantiation, there’s also the
>>> case of subclassing.
>>> For example we might want to describe that Alice happily plays guitar:
>>>
>>> #5 <<:Alice :plays :Guitar>> :mood :Happy .
>>>
>>> The other day however she plays guitar because she's sad:
>>>
>>> #6 <<:Alice :plays :Guitar>> :mood :Gloomy .
>>>
>>> "So which one is it?" the unsuspecting data consumer might complain. It
>>> turns out that indeed we should have chosen the more involved style right
>>> away.
>>> And that is precisely my concern: the succinct modelling style as in #1,
>>> #3, #5 and #6 only works if we can be _sure_ that we are dealing with
>>> triples as types - not occurrences, not instances, not subtypes, not
>>> whatever other (not so) special cases there might exist.
>>>
>>> The succinct triple-as-type style only works for use cases that the
>>> proposed semantics was optimized for, when working on the very low levels
>>> of RDF machinery. In any other case the succinct style can be used first
>>> but might need to be changed later, and it requires queries to account for
>>> both modelling styles. Both prospects are bad enough to warrant a general
>>> rule that says: don’t use the succinct style, use the indirection via
>>> creating a statement identifier if you are not really sure that your use
>>> case is Explainable AI, versioning or similarly close to the metal.
>>>
>>>
>>> In my understanding the problem stems from the very core of RDF-star’s
>>> design: RDF-star quoted triples are verbose in that they quote in full what
>>> they identify. That leads to moral hazard: it’s all too easy to take the
>>> shortest path and use the type as an identifier where one should mint a
>>> proper identifier first. The proposed semantics take advantage of that
>>> verbosity and put it to good use for those special use cases that
>>> require a carbon copy of their subject. But it is not well suited for
>>> annotations that influence the meaning of the annotated triple.
>>> Maybe it helps to think about the problem this way: property graph style
>>> modelling allows one to keep the simple triple and yet enrich it with
>>> additional detail. But one must admit that the simple triple annotated in
>>> two different ways is then not the same triple anymore.
>>>
>>>
>>> I have been arguing all along (since summer of 2020 IIRC) for proper
>>> statement identifiers as RDF/XML provides them, and I still think they are
>>> the right solution for mainstream use cases as they are much closer to the
>>> reality of RDF data and therefore better positioned to capture deviations
>>> from the abstract RDF core. Maybe there is a middle ground in the shortcut
>>> syntax which could be defined as expanding to identifiers by default - e.g.:
>>>
>>> :Alice :plays :Guitar {| :mood :Happy |} .
>>> :Alice :plays :Guitar {| :mood :Moody |} .
>>>
>>> expanding to
>>>
>>> :Alice :plays :Guitar .
>>> [] :occurrenceOf <<:Alice :plays :Guitar>> ;
>>>    :mood :Happy .
>>> [] :occurrenceOf <<:Alice :plays :Guitar>> ;
>>>    :mood :Moody .
>>>
>>> This is guaranteed to be correct for single _and_ multiple occurrences
>>> alike, it is easy to author with the shorthand syntax and it is unambiguous
>>> to query.
>>> All more involved use cases - explainable AI, unasserted assertions etc
>>> - work as before, as intended, using the quoted triple syntax.
>>> I’d very much favor that default expansion to use a transparency-enabling
>>> version of :occurrenceOf, in which case the shorthand syntax would
>>> really be the syntactic sugar for RDF standard reification that RDF-star was
>>> - and, I guess, outside these specialist circles still is - expected to be.
>>> That wouldn’t hurt the specialist use cases in any way.
>>>
>>>
>>> Best,
>>> Thomas
>>>
>>>
>>> P.S. w.r.t. "a can of worms": Knowledge representation is indeed a can
>>> of worms, and always has been, at least since the ancient Greeks.
>>> Statement annotation in RDF is a topic well known to be situated right in
>>> the heart of the worm hole. There’s no simple, ingenious way around that.
>>>
>>>
>>> --
>>>
>>> 劳费尔
>>> . . . .. . .
>>> . . . ..
>>> . .. .
>>>
>>>
>>
--
劳费尔
. . . .. . .
. . . ..
. .. .
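
A minimal Python sketch of the default expansion suggested in the quoted message, assuming triples are modelled as plain tuples and a quoted triple is simply a nested tuple. The helper names `expand_annotations` and `moods_of` and the `_:occN` blank node labels are hypothetical and not part of any RDF-star library; only `:occurrenceOf` and the example terms come from the thread.

```python
from itertools import count


def expand_annotations(statements):
    """Expand (triple, annotations) pairs into a set of triples.

    Each annotated statement yields the asserted triple itself plus one
    fresh occurrence node that carries the annotations, mirroring the
    `[] :occurrenceOf << ... >>` pattern from the thread.
    """
    fresh = count(1)
    graph = set()
    for triple, annotations in statements:
        graph.add(triple)                     # the annotated triple stays asserted
        node = f"_:occ{next(fresh)}"          # hypothetical blank node label
        graph.add((node, ":occurrenceOf", triple))
        for predicate, obj in annotations:
            graph.add((node, predicate, obj))
    return graph


def moods_of(graph, triple):
    """Collect the :mood values attached to occurrences of `triple`."""
    occurrences = {s for (s, p, o) in graph
                   if p == ":occurrenceOf" and o == triple}
    return sorted(o for (s, p, o) in graph
                  if s in occurrences and p == ":mood")


plays = (":Alice", ":plays", ":Guitar")
graph = expand_annotations([
    (plays, [(":mood", ":Happy")]),
    (plays, [(":mood", ":Moody")]),
])
print(moods_of(graph, plays))  # [':Happy', ':Moody']
```

Because the graph is a set, the asserted triple appears only once, while each annotation block keeps its own occurrence node, so the two moods are never attached to one and the same subject.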
Received on Monday, 20 December 2021 16:48:07 UTC