- From: Miel Vander Sande <miel.vandersande@meemoo.be>
- Date: Mon, 20 Dec 2021 15:48:43 +0100
- To: Doerthe Arndt <doerthe.arndt@tu-dresden.de>
- Cc: thomas lörtsch <tl@rat.io>, "public-rdf-star@w3.org" <public-rdf-star@w3.org>
- Message-ID: <CAHeRLWsWFEC3V_v8Uk-HKhTmL8HxAB4D-qC0GG9nHN8jtS8MAA@mail.gmail.com>
Hi Thomas, all,

I do agree that some extra usability on this aspect would definitely not hurt. It would have quite some gains in practice, just like RDF-star does over reification.

Having this syntactical shorthand as a middle ground has popped up in my head a couple of times, but I hesitated to ask about it because:

- it seems very likely that this idea has come up before in the CG. Probably I missed it
- this only affects the syntaxes, not the RDF-star semantics?
- it would open up a can of blank nodes, unless you have the ability to also add an identifier
- I would not overload the annotation syntax; that has its own reasons to exist (and it raises questions about assertion ;))
- in the end, it's how you can query it that matters most. Can you make such shorthand work for SPARQL-star?
- technically, this is a syntax enhancement that can be defined in a separate specification that extends Turtle-star, among others, but probably you want to stay away from Turtle-star-star

Best,
Miel

On Mon, 20 Dec 2021 at 15:20, Doerthe Arndt <doerthe.arndt@tu-dresden.de> wrote:

> Dear Thomas,
>
> > On 20.12.2021 at 14:32, thomas lörtsch <tl@rat.io> wrote:
> >
> > On 20 December 2021 11:47:48 CET, Doerthe Arndt <doerthe.arndt@tu-dresden.de> wrote:
> >> Dear Thomas,
> >>
> >> Before going into full discussion mode again :), I would like to fully understand your proposal, so please allow me one question:
> >>
> >> Why do you go for
> >>
> >>> :Alice :plays :Guitar .
> >>> [] :occurrenceOf <<:Alice :plays :Guitar>> ;
> >>>     :mood :Happy .
> >>> [] :occurrenceOf <<:Alice :plays :Guitar>> ;
> >>>     :mood :Moody .
> >>
> >> instead of
> >>
> >>> [] :occurrenceOf <<:Alice :plays :Guitar>> ;
> >>>     :mood :Happy .
> >>> [] :occurrenceOf <<:Alice :plays :Guitar>> ;
> >>>     :mood :Moody .
> >>
> >> with your shortcut?
> >> I am asking because especially the marriedTo example looks to me like a case where the statement changes its truth value over time (i.e. the triple becomes false if the marriage ends, or could at least become false depending on what ":marriedTo" means).
> >>
> >> Maybe I simply missed that point in your previous explanations, so is there a short answer why you personally would model it that way?
> >
> > It is my understanding of the (informal) property :occurrenceOf that it doesn't assert that statement, just points to it. Isn't that the assumption everybody is working under?
>
> Yes, it is. My question was more about why you want to assert the triple you are talking about even in cases where you know that it is not true at the time you state it. But I guess the answer to that is that you would like to be close to property graphs, and there all triples you refer to are also asserted. So I got my answer (if I understood correctly). Of course, I disagree that this is a good way to model your examples ;) but I think that has already been discussed in depth on this list.
>
> Kind regards,
> Dörthe
>
> > Best,
> > Thomas
> >
> >> Kind regards,
> >> Dörthe
> >>
> >>> On 20.12.2021 at 01:31, thomas lörtsch <tl@rat.io> wrote:
> >>>
> >>> tl;dr
> >>> RDF semantics is based on sets and RDF-star builds on that. However, RDF-star triple annotation has to deal with the practice of RDF, not its theoretical ideal. In RDF as practically employed, multisets, although not the norm, can appear almost everywhere. A design that ignores them by default but requires rewriting data and queries when they appear will not fare well in practice. The problem is inherent in the verbosity of the quoted triple identifier: it favors a syntax that is in almost all cases at least risky, if not outright wrong. The shortcut syntax might provide a way out of this dilemma.
> >>>
> >>>
> >>> The following examples should illustrate that multisets have to be expected almost everywhere in RDF data. From now on I'm always assuming the standard use case where an actual assertion is annotated:
> >>>
> >>> #0 :Bob :bought :Car .
> >>>    :RichardB :marriedTo :LizT .
> >>>    :Alice :plays :Guitar .
> >>>
> >>>
> >>> The CG report says that 'Alice said that Bob bought a car' should be modeled not as
> >>>
> >>> #1 <<:Bob :bought :Car>> :said :Alice .
> >>>
> >>> but as
> >>>
> >>> #2 [] :occurrenceOf <<:Bob :bought :Car>> ;
> >>>       :said :Alice .
> >>>
> >>> because there might be other sources for the same statement. That's always possible, so it seems reasonable to always require the indirection of creating a proper occurrence identifier when annotating a statement with provenance.
> >>>
> >>>
> >>> Likewise it was recently discussed that marriages between Richard Burton and Elizabeth Taylor should not be modeled as
> >>>
> >>> #3 <<:RichardB :marriedTo :LizT>> :start 1966 .
> >>>
> >>> but rather as
> >>>
> >>> #4 [] :occurrenceOf <<:RichardB :marriedTo :LizT>> ;
> >>>       :start 1966 .
> >>>
> >>> because we know of that second marriage.
> >>>
> >>> But what if we didn't? What if we had authored this in 1967, assuming that this marriage would last forever? Would we have chosen the more involved modelling style nonetheless? And if we did go with the succinct #3 version - very probably, at least according to current thinking I assume - will we later, after their second marriage, have to change that to #4 style?
> >>>
> >>> What about querying? Say we are not sure if some statement occurs only once or multiple times: will we have to query for both modelling styles? Probably.
> >>>
> >>>
> >>> While the first example could be categorized as describing a speech act and the second example might be considered instantiation, there's also the case of subclassing. For example we might want to describe that Alice happily plays guitar:
> >>>
> >>> #5 <<:Alice :plays :Guitar>> :mood :Happy .
> >>>
> >>> The other day however she plays guitar because she's sad:
> >>>
> >>> #6 <<:Alice :plays :Guitar>> :mood :Gloomy .
> >>>
> >>> "So which one is it?" the unsuspecting data consumer might complain. It turns out that indeed we should have chosen the more involved style right away.
> >>> And that is precisely my concern: the succinct modelling style as in #1, #3, #5 and #6 only works if we can be _sure_ that we are dealing with triples as types - not occurrences, not instances, not subtypes, not whatever other (not so) special cases there might exist.
> >>>
> >>> The succinct triple-as-type style only works for use cases that the proposed semantics was optimized for, when working on the very low levels of RDF machinery. In any other case the succinct style can be used first but might need to be changed later, and it requires queries to account for both modelling styles. Both prospects are bad enough to warrant a general rule that says: don't use the succinct style, use the indirection via creating a statement identifier if you are not really sure that your use case is Explainable AI, versioning or similarly close to the metal.
> >>>
> >>>
> >>> In my understanding the problem stems from the very core of RDF-star's design: RDF-star quoted triples are verbose in that they quote in full what they identify. That leads to moral hazard: it's all too easy to take the shortest path and use the type as an identifier where one should mint a proper identifier first. The proposed semantics takes advantage of that verbosity and puts it to good use for those special use cases that require a carbon copy of their subject. But it is not well suited for annotations that influence the meaning of the annotated triple. Maybe it helps to think about the problem this way: property graph style modelling allows one to keep the simple triple and yet enrich it with additional detail. But one must admit that the simple triple annotated in two different ways is then not the same triple anymore.
> >>>
> >>>
> >>> I was all along (summer of 2020 IIRC) arguing for proper statement identifiers like RDF/XML provides them, and I still think they are the right solution for mainstream use cases as they are much closer to the reality of RDF data and therefore better positioned to capture deviations from the abstract RDF core. Maybe there is a middle ground in the shortcut syntax, which could be defined as expanding to identifiers by default - e.g.:
> >>>
> >>> :Alice :plays :Guitar {| :mood :Happy |} .
> >>> :Alice :plays :Guitar {| :mood :Moody |} .
> >>>
> >>> expanding to
> >>>
> >>> :Alice :plays :Guitar .
> >>> [] :occurrenceOf <<:Alice :plays :Guitar>> ;
> >>>     :mood :Happy .
> >>> [] :occurrenceOf <<:Alice :plays :Guitar>> ;
> >>>     :mood :Moody .
> >>>
> >>> This is guaranteed to be correct for single _and_ multiple occurrences alike, it is easy to author per the shorthand syntax and it is unambiguous to query.
> >>> All more involved use cases - explainable AI, unasserted assertions etc. - work as before, as intended, using the quoted triple syntax.
> >>> I'd very much favor that default expansion to use a transparency-enabling version of :occurrenceOf, in which case the shorthand syntax would really be the syntactic sugar for RDF standard reification that RDF-star was - and, I guess, outside these specialist circles still is - expected to be. That wouldn't hurt the specialist use cases in any way.
> >>>
> >>>
> >>> Best,
> >>> Thomas
> >>>
> >>>
> >>> P.S. w.r.t. "a can of worms": Knowledge representation is indeed a can of worms, and always has been, at least since the ancient Greeks. Statement annotation in RDF is a topic well known to be situated right in the heart of the wormhole. There's no simple, ingenious way around that.
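As an illustration of the default expansion Thomas proposes in the quoted message (annotation shorthand expanding to one `:occurrenceOf` node per annotation), here is a minimal Python sketch. It is not part of any RDF-star specification or tool: the function name `expand` and the regular expression are hypothetical, and the sketch only handles one-line statements carrying a single annotation, with one fresh blank node `[]` minted per occurrence.

```python
import re

# Hypothetical pattern for one-line statements of the form
#   s p o {| ap ao |} .
# (an assumption for illustration; real Turtle-star needs a full parser)
ANNOTATED = re.compile(
    r"^\s*(?P<s>\S+)\s+(?P<p>\S+)\s+(?P<o>\S+)\s*"
    r"\{\|\s*(?P<ap>\S+)\s+(?P<ao>\S+)\s*\|\}\s*\.?\s*$"
)


def expand(lines):
    """Expand annotated statements into the occurrence-based form."""
    asserted = []     # base triples, each asserted once (set semantics)
    occurrences = []  # one blank-node occurrence per annotation
    for line in lines:
        m = ANNOTATED.match(line)
        if m is None:
            raise ValueError(f"unsupported line: {line!r}")
        triple = f"{m['s']} {m['p']} {m['o']}"
        if triple not in asserted:
            asserted.append(triple)
        occurrences.append(
            f"[] :occurrenceOf <<{triple}>> ;\n    {m['ap']} {m['ao']} ."
        )
    return "\n".join([f"{t} ." for t in asserted] + occurrences)


if __name__ == "__main__":
    print(expand([
        ":Alice :plays :Guitar {| :mood :Happy |}",
        ":Alice :plays :Guitar {| :mood :Moody |}",
    ]))
```

The base triple is emitted once while each annotation gets its own occurrence node, so the two moods stay attached to distinct occurrences instead of both landing on the same quoted triple.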
Received on Monday, 20 December 2021 14:49:24 UTC