- From: Laufer <carlos.laufer@gmail.com>
- Date: Mon, 20 Dec 2021 00:17:36 -0300
- To: thomas lörtsch <tl@rat.io>
- Cc: "public-rdf-star@w3.org" <public-rdf-star@w3.org>
- Message-ID: <CAFg8H7j5D65-qMyG1-pJCCxfdGZg3KijE0fiitvahUS0j08-EA@mail.gmail.com>
Hello, All,

I was wondering how this discussion is related to the Singleton Property proposal [1].

Cheers,
Laufer

[1] Vinh Nguyen, Olivier Bodenreider, and Amit Sheth, "Don't Like Reification? Making Statements about Statements Using Singleton Property", https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4350149/

On Sunday, 19 December 2021, thomas lörtsch <tl@rat.io> wrote:

> tl;dr
> RDF semantics is based on sets and RDF-star builds on that. However
> RDF-star triple annotation has to deal with the practice of RDF, not its
> theoretical ideal. In RDF as practically employed, multisets, although not
> the norm, can appear almost everywhere. A design that ignores them by
> default but requires rewriting data and queries when they appear will not
> fare well in practice. The problem is inherent in the verbosity of the
> quoted triple identifier: it favors a syntax that is in almost all cases at
> least risky, if not outright wrong. The shortcut syntax might provide a way
> out of this dilemma.
>
>
> The following examples should illustrate that multisets have to be
> expected almost everywhere in RDF data. From now on I’m always assuming the
> standard use case where an actual assertion is annotated:
>
>     #0 :Bob :bought :Car .
>        :RichardB :marriedTo :LizT .
>        :Alice :plays :Guitar .
>
>
> The CG report says that 'Alice said that Bob bought a car' should be
> modeled not as
>
>     #1 <<:Bob :bought :Car>> :said :Alice .
>
> but as
>
>     #2 [] :occurrenceOf <<:Bob :bought :Car>> ;
>           :said :Alice .
>
> because there might be other sources for the same statement. That’s always
> possible, so it seems reasonable to always require the indirection of
> creating a proper occurrence identifier when annotating a statement with
> provenance.
>
>
> Likewise it was recently discussed that marriages between Richard Burton
> and Elizabeth Taylor should not be modeled as
>
>     #3 <<:RichardB :marriedTo :LizT>> :start 1966 .
>
> but rather as
>
>     #4 [] :occurrenceOf <<:RichardB :marriedTo :LizT>> ;
>           :start 1966 .
>
> because we know of that second marriage.
>
> But what if we didn’t? What if we had authored this in 1967, assuming that
> this marriage would last forever? Would we have chosen the more involved
> modelling style nonetheless? And if we did go with the succinct #3 version
> - very probably, at least according to current thinking I assume - will we
> later, after their second marriage, have to change that to #4 style?
>
> What about querying? Say we are not sure if some statement occurs only
> once or multiple times: will we have to query for both modelling styles?
> Probably.
>
>
> While the first example could be categorized as describing a speech act
> and the second example might be considered instantiation, there’s also the
> case of subclassing. For example we might want to describe that Alice
> happily plays guitar:
>
>     #5 <<:Alice :plays :Guitar>> :mood :Happy .
>
> The other day, however, she plays guitar because she's sad:
>
>     #6 <<:Alice :plays :Guitar>> :mood :Gloomy .
>
> "So which one is it?" the unsuspecting data consumer might complain. It
> turns out that indeed we should have chosen the more involved style right
> away.
> And that is precisely my concern: the succinct modelling style as in #1,
> #3, #5 and #6 only works if we can be _sure_ that we are dealing with
> triples as types - not occurrences, not instances, not subtypes, not
> whatever other (not so) special cases there might exist.
>
> The succinct triple-as-type style only works for use cases that the
> proposed semantics was optimized for, when working on the very low levels
> of RDF machinery. In any other case the succinct style can be used first
> but might need to be changed later, and it requires queries to account for
> both modelling styles.
> Both prospects are bad enough to warrant a general
> rule that says: don’t use the succinct style, use the indirection via
> creating a statement identifier if you are not really sure that your use
> case is Explainable AI, versioning or similarly close to the metal.
>
>
> In my understanding the problem stems from the very core of RDF-star’s
> design: RDF-star quoted triples are verbose in that they quote in full what
> they identify. That leads to moral hazard: it’s all too easy to take the
> shortest path and use the type as an identifier where one should mint a
> proper identifier first. The proposed semantics take advantage of that
> verbosity and put it to good use for those special use cases that
> require a carbon copy of their subject. But it is not well suited for
> annotations that influence the meaning of the annotated triple. Maybe it
> helps to think about the problem this way: property graph style modelling
> allows one to keep the simple triple and yet enrich it with additional detail.
> But one must admit that the simple triple annotated in two different ways
> is then not the same triple anymore.
>
>
> I was all along (summer of 2020 IIRC) arguing for proper statement
> identifiers like RDF/XML provides them and I still think they are the right
> solution for mainstream use cases as they are much closer to the reality of
> RDF data and therefore better positioned to capture deviations from the
> abstract RDF core. Maybe there is a middle ground in the shortcut syntax
> which could be defined as expanding to identifiers by default - e.g.:
>
>     :Alice :plays :Guitar {| :mood :Happy |} .
>     :Alice :plays :Guitar {| :mood :Moody |} .
>
> expanding to
>
>     :Alice :plays :Guitar .
>     [] :occurrenceOf <<:Alice :plays :Guitar>> ;
>        :mood :Happy .
>     [] :occurrenceOf <<:Alice :plays :Guitar>> ;
>        :mood :Moody .
>
> This is guaranteed to be correct for single _and_ multiple occurrences
> alike, it is easy to author via the shorthand syntax and it is unambiguous
> to query.
> All more involved use cases - explainable AI, unasserted assertions etc. -
> work as before, as intended, using the quoted triple syntax.
> I’d very much favor that default expansion to use a transparency-enabling
> version of :occurrenceOf, in which case the shorthand syntax would really be
> the syntactic sugar for RDF standard reification that RDF-star was - and, I
> guess, outside these specialist circles still is - expected to be. That
> wouldn’t hurt the specialist use cases in any way.
>
>
> Best,
> Thomas
>
>
> P.S. w.r.t. "a can of worms": Knowledge representation is indeed a can of
> worms, and always has been, at least since the ancient Greeks. Statement
> annotation in RDF is a topic well known to be situated right in the heart
> of the worm hole. There’s no simple, ingenious way around that.
>

--
劳费尔
. . . .. . . . . . .. . .. .
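
For readers who want to experiment with the default expansion proposed in the quoted message, here is a minimal sketch in Python. It is only an illustration under stated assumptions: triples are plain tuples, a quoted triple is a nested tuple, and the function name expand_annotations and the blank node labels _:occ1, _:occ2 are hypothetical, not part of RDF-star or of any library.

```python
from itertools import count


def expand_annotations(annotated, occurrence_of=":occurrenceOf"):
    """Expand (triple, annotations) pairs into occurrence-style triples.

    Hypothetical helper: every annotation block gets its own fresh blank
    node, so two annotations of the same triple never merge.
    """
    fresh = count(1)
    asserted = []  # plain asserted triples, kept once each (set semantics)
    expanded = []  # triples about the occurrence nodes
    for triple, annotations in annotated:
        if triple not in asserted:
            asserted.append(triple)
        node = f"_:occ{next(fresh)}"  # assumed blank node labelling scheme
        expanded.append((node, occurrence_of, triple))
        for predicate, obj in annotations:
            expanded.append((node, predicate, obj))
    return asserted, expanded


plays = (":Alice", ":plays", ":Guitar")
asserted, expanded = expand_annotations([
    (plays, [(":mood", ":Happy")]),
    (plays, [(":mood", ":Moody")]),
])
for t in asserted + expanded:
    print(t)
```

The asserted triple is kept once, while each annotation hangs off its own occurrence node, which is the property the message relies on for single and multiple occurrences alike.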
Received on Monday, 20 December 2021 03:18:51 UTC