- From: Kurt Cagle <kurt.cagle@gmail.com>
- Date: Fri, 12 Apr 2024 12:42:36 -0700
- To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
- Cc: Niklas Lindström <lindstream@gmail.com>, public-rdf-star-wg@w3.org
- Message-ID: <CALm0LSEh2Suzhp5YLm_tO7ejx=+r=kTChgUN-A3UbiPqDDfk6w@mail.gmail.com>
Sorry for the odd blank space at the bottom of the email. System fart.

*Kurt Cagle*
Editor in Chief
The Cagle Report
kurt.cagle@gmail.com
443-837-8725 <http://voice.google.com/calls?a=nc,%2B14438378725>

On Fri, Apr 12, 2024 at 12:40 PM Kurt Cagle <kurt.cagle@gmail.com> wrote:

> > It seems that the WG is at an impasse.
> >
> > How about reverting to an old situation where there are no reifiers at
> > all, just quoted triples, and require users to stand off from the triple
> > as required?
>
> I was at an IA conference yesterday, and the question of reification was
> raised in several different contexts. I think it's important to remember
> that reification is significant primarily because it provides (syntactic)
> parity with a Neo4j construct.
>
> That is to say:
>
> :s :p :o .
> :s a rdf:type .
> << :s :p :o >> :p1 :o1; :p2 :o2 .
>
> is the equivalent of a Neo4j assertion with two properties on its "edge".
>
> What I see here is that we're also attempting to create an assignment
> statement with reifiers in Turtle:
>
> <<(:r | :s :p :o )>>
>
> when this is an operation that is normally done in SPARQL:
>
> bind (<<:s :p :o>> as ?r)
>
> What we're arguing about then, to me, is a deeper question: should we have
> assignment statements in Turtle?
>
> My gut feeling is no, for all the reasons that have become evident:
>
> - Inconsistency in IRI assignment for a given resource from multiple
>   sources
> - The need to police the edge cases to negotiate cardinality
> - It encourages poor modeling practices, as a reification is often a
>   shortcut for modeling entities that should be formally defined.
> - It requires a significant set of additional semantics (predicate
>   additions) in RDF itself.
>
> Taking Peter's example:
>
> :Liz :married-to :Dick .
> :Liz :married-on "1964-03-15"^^xsd:date .
> :Liz :married-to :Eddie .
> :Liz :married-on "1959-05-12"^^xsd:date .
>
> This should be modelled as:
>
> :m1 a :Marriage ;
>     :firstSpouse :Liz ;
>     :secondSpouse :Dick ;
>     :startDate "1964-03-15" ;
>     :endDate "1959-02-17" . # Or some appropriate end-date prior to the
>                             # second marriage
> :m2 a :Marriage ;
>     :firstSpouse :Liz ;
>     :secondSpouse :Eddie ;
>     :startDate "1959-05-12" .
>
> We're trying to turn <<:Liz :married-to :Dick>> into a semantics-carrying
> vehicle, when we're better off declaring a formal structure.
>
> Relate this back to the Neo4j modelling. When I say:
>
> Liz -- marriedTo --> Dick [startDate "1964-03-15",endDate "1959-02-17"] .
> Liz -- marriedTo --> Eddie [startDate "1959-05-12"] .
>
> What we're actually doing is hiding a lot of implicit semantics:
>
> Liz -- marriedTo --> Dick
>
> is actually multiple assertions:
>
> - There exists an implicit marriage M1.
> - M1 is a marriage entity between Liz and Dick.
> - It is the (implicit) marriage M1 that is being annotated, not Liz, even
>   if that is what it appears to be on the surface.
> - There is a directional implication (which is why I have :firstSpouse,
>   :secondSpouse in the RDF example, even though the non-directed :spouse
>   would be more appropriate).
>
> You can argue that the RDF is uglier and more verbose, but that's because
> it is also more precise. There are a lot of unstated assumptions made in
> Neo4j, which is one reason that models created that way usually get very
> ugly conceptually; the hidden semantics come back to bite you.
>
> RDF to a certain extent does this with blank nodes. Syntactically, the
> above statements could be rendered:
>
> [ a :Marriage ;
>   :firstSpouse :Liz ;
>   :secondSpouse :Dick ;
>   :startDate "1964-03-15" ;
>   :endDate "1959-02-17" ;
> ] .
> [ a :Marriage ;
>   :firstSpouse :Liz ;
>   :secondSpouse :Eddie ;
>   :startDate "1959-05-12" ;
> ] .
>
> but in most cases we forget the type association.
>
> Contrast this:
>
> [ :firstSpouse :Liz ; :secondSpouse :Dick ; :startDate "1964-03-15" ;
>   :endDate "1959-02-17" ; a :Marriage ]
>
> with the Neo4j-esque
>
> Liz -- marriedTo --> Dick [startDate "1964-03-15",endDate "1959-02-17"] .
>
> It is a little more verbose, but that's only because Neo4j is actually
> treating what should be an object (:Liz, via the :firstSpouse predicate) as
> a subject with no explicit semantics. It is using the property marriedTo as
> a carrier of type or class, without formally making that assertion anywhere.
>
> Put another way, Neo4j works because it makes naive assumptions, and it
> gets into trouble because those naive assumptions don't survive complex
> modeling.
>
> This is why we have to be careful about reification, because it does hide
> those semantics.
>
> Put another way: << :Liz :married :Dick >> :startDate "1964" ; etc. is
> explicitly:
>
> [ a rdfs:Class; rdf:subject :Liz; rdf:object :Dick; rdf:predicate
>   :married] :startDate "1964" ; etc.
>
> with the big assumption that the rdf:predicate value :married can be used
> to derive the fact that the class involved is a marriage. It is low-value
> semantics and a potentially dangerous shortcut for doing proper modelling,
> but it gets you to Neo4j equivalence.
>
> I believe we are arguing at this point because we're looking at Neo4j
> without recognizing that its semantics are ill-defined and somewhat
> deceptive, and we're trying to satisfy ease of use at the expense of that
> precision.
>
> So what about the general annotation space where I have two individuals
> trying to create annotations on a given statement? This, to me, is THE use
> case for reification.
>
> <<:m1 :startDate "1964">> rdf:annotate [
>     a :Annotation ;
>     Annotation:source "https://www.example.com/LizMarriageArticle.html";
>     Annotation:fromDate "2024-01-03" ;
>     Annotation:by janeDoe@gmail.com ;
> ], [...].
>
> In this case we ARE annotating an RDF statement (there is probably a
> different term used here than rdf:annotate, but the idea should be the
> same).
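
To make the "implicit marriage" point above concrete, here is a rough sketch in plain Python (no RDF library; triples are just tuples). The function name expand_edge and the argument layout are made up for illustration and are not part of any proposal. It spells out what a property-graph edge with properties would have to become once the hidden entity is named and typed:

```python
def expand_edge(entity_id, entity_type, source, target, props):
    """Expand one property-graph style edge into explicit triples.

    entity_id and entity_type name the entity that the edge only
    implies (for example ":m2" and ":Marriage"); both are supplied by
    the caller because the edge itself does not carry them.
    """
    triples = [
        (entity_id, "rdf:type", entity_type),
        (entity_id, ":firstSpouse", source),
        (entity_id, ":secondSpouse", target),
    ]
    # Each edge property becomes a statement about the entity,
    # not about the source node.
    for key, value in props.items():
        triples.append((entity_id, ":" + key, value))
    return triples


# Liz -- marriedTo --> Eddie [startDate "1959-05-12"]
m2 = expand_edge(":m2", ":Marriage", ":Liz", ":Eddie",
                 {"startDate": "1959-05-12"})

for triple in m2:
    print(" ".join(triple), ".")
```

The caller has to supply the identifier and the type, which is exactly the information the edge notation leaves unstated.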
>
> If you want
>
> *Kurt Cagle*
> Editor in Chief
> The Cagle Report
> kurt.cagle@gmail.com
> 443-837-8725 <http://voice.google.com/calls?a=nc,%2B14438378725>
>
> On Fri, Apr 12, 2024 at 7:44 AM Peter F. Patel-Schneider
> <pfpschneider@gmail.com> wrote:
>
>> Yes, but there appears to be an irreconcilable difference here.
>>
>> The situation with quoted triples is actually no different from any other
>> case where some pieces of information about a resource need to be kept
>> together. For example:
>>
>> :Liz :married-to :Dick .
>> :Liz :married-on "1964-03-15"^^xsd:date.
>> :Liz :married-to :Eddie .
>> :Liz :married-on "1959-05-12"^^xsd:date.
>>
>> suffers from exactly the same problem as
>>
>> << :Liz :spouse :Dick >> :ceremony-location :Montreal.
>> << :Liz :spouse :Dick >> :ceremony-date "1964-03-15"^^xsd:date.
>> << :Liz :spouse :Dick >> :ceremony-location :Chobe.
>> << :Liz :spouse :Dick >> :ceremony-date "1975-10-10"^^xsd:date .
>>
>> In both cases there need to be extra resources added for accurate
>> modelling.
>>
>> peter
>>
>> On 4/12/24 10:00, Niklas Lindström wrote:
>> > On Fri, Apr 12, 2024 at 2:57 PM Peter F. Patel-Schneider
>> > <pfpschneider@gmail.com> wrote:
>> >>
>> >> It seems that the WG is at an impasse.
>> >
>> > I think we're "just" not in agreement about whether the cardinality of
>> > rdf:reifies should conceptually be one or many. Some claim it makes
>> > sense, others claim that it deviates from the notion of a reified
>> > statement, taken as a "direct relationship instance" (which I presume
>> > is what an LPG edge is taken to "denote" in the OneGraph
>> > harmonization).
>> >
>> > It is an important question, since the motivation is to not add
>> > something which is then unnecessarily (or by default) used in
>> > nonsensical ways, or opens up for accidental complexity.
>> > This avoids
>> > necessary remodeling if new details crop up, and/or B) integration
>> > with data from other sources.
>> >
>> >> How about reverting to an old situation where there are no reifiers at
>> >> all, just quoted triples, and require users to stand off from the
>> >> triple as required?
>> >
>> > That depends on whether or not the syntaxes allow them (or worse,
>> > encourage them) to be used as subjects (opening up for the seminal
>> > error). We came to the proposal of only using them with reifiers since
>> > that's when they work with use cases as-is. I.e., we have agreed that
>> > this (talking about bare triple terms) is not what use cases call for
>> > (not the least of which are the Amazon Neptune use cases with multiple
>> > edges [1]), and makes no sense if used as is in all but the most
>> > model-theoretical domains of discourse (including for token
>> > provenance; the most obvious kind of occurrences-not-types).
>> >
>> > /Niklas
>> >
>> > [1]:
>> > https://lists.w3.org/Archives/Public/public-rdf-star/2021Dec/att-0001/rdf-star-neptune-use-cases-20211202.pdf
>> >
>> >> peter
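
Peter's two-ceremony example in the quoted thread can be played through in a few lines of plain Python. This is only a sketch under my own assumptions: the identifiers :c1 and :c2, the "about" key, and both helper functions are invented here, and nothing below uses an RDF library. It shows why annotations hung directly on one quoted triple cannot be grouped, while annotations hung on separate resources can:

```python
TRIPLE = (":Liz", ":spouse", ":Dick")

# Case 1: all four annotations attach to the same quoted triple.
by_triple = {
    TRIPLE: [
        (":ceremony-location", ":Montreal"),
        (":ceremony-date", "1964-03-15"),
        (":ceremony-location", ":Chobe"),
        (":ceremony-date", "1975-10-10"),
    ]
}

# Case 2: each ceremony is its own resource that refers to the triple.
by_resource = {
    ":c1": {"about": TRIPLE,
            ":ceremony-location": ":Montreal",
            ":ceremony-date": "1964-03-15"},
    ":c2": {"about": TRIPLE,
            ":ceremony-location": ":Chobe",
            ":ceremony-date": "1975-10-10"},
}


def dates_flat(location):
    """Dates that could belong to a location when keyed by the triple."""
    props = by_triple[TRIPLE]
    if (":ceremony-location", location) not in props:
        return []
    # Nothing ties a date to a location, so every date is a candidate.
    return sorted(v for k, v in props if k == ":ceremony-date")


def dates_split(location):
    """Dates for a location when each ceremony is a separate resource."""
    return sorted(r[":ceremony-date"] for r in by_resource.values()
                  if r[":ceremony-location"] == location)


print(dates_flat(":Chobe"))   # both dates come back as candidates
print(dates_split(":Chobe"))  # only the Chobe ceremony's date
```

Whether the separate resources are reifiers or ordinary modelled entities is the question the thread is arguing about; the sketch only shows that some extra resource is needed either way.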
Received on Friday, 12 April 2024 19:43:14 UTC