Re: a modest proposal - eliminate reifiers completely from Kurt Cagle on 2024-04-12 (public-rdf-star-wg@w3.org from April 2024)

From: Kurt Cagle <kurt.cagle@gmail.com>
Date: Fri, 12 Apr 2024 12:40:08 -0700
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Cc: Niklas Lindström <lindstream@gmail.com>, public-rdf-star-wg@w3.org
Message-ID: <CALm0LSFt2kL6hMtDJ0ojwdD5yWrnTFmLopfsrDZsLu8AE0+mpQ@mail.gmail.com>
> It seems that the WG is at an impasse.

> How about reverting to an old situation where there are no reifiers at all,
> just quoted triples, and require users to stand off from the triple as required?

I was at an IA conference yesterday, and the question of reification was
raised in several different contexts. I think it's important to remember
that reification matters primarily because it gives us (syntactic) parity
with a Neo4j construct.

That is to say:

:s :p :o .
:s  a  rdf:type .
<< :s :p :o >> :p1 :o1; :p2 :o2 .

is the equivalent of a Neo4j assertion with two properties on its "edge".
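
To make that correspondence concrete, here is a minimal Python sketch (standard library only). The names Triple and edge_to_turtle_star are mine, purely for illustration, and the output just mirrors the quoted-triple notation used above rather than any finalized syntax:

```python
from typing import Dict, NamedTuple


class Triple(NamedTuple):
    """A plain (subject, predicate, object) edge, written as prefixed names."""
    s: str
    p: str
    o: str


def edge_to_turtle_star(edge: Triple, props: Dict[str, str]) -> str:
    """Render an LPG-style edge and its edge properties as an asserted
    triple followed by an annotated quoted triple."""
    asserted = f"{edge.s} {edge.p} {edge.o} ."
    pairs = " ; ".join(f"{key} {value}" for key, value in props.items())
    annotated = f"<< {edge.s} {edge.p} {edge.o} >> {pairs} ."
    return asserted + "\n" + annotated


rendered = edge_to_turtle_star(Triple(":s", ":p", ":o"),
                               {":p1": ":o1", ":p2": ":o2"})
print(rendered)
```

The edge itself becomes the asserted triple, and each edge property becomes a statement whose subject is the quoted triple.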


What I see here is that we're also attempting to create an assignment
statement with reifiers in Turtle:

<<(:r | :s :p :o )>>

when this is an operation that is normally done in SPARQL:

bind (<<:s :p :o>> as ?r)

What we're arguing about then, to me, is a deeper question: should we have
assignment statements in Turtle?

My gut feeling is no, for all the reasons that have become evident:

   - Inconsistency in IRI assignment for a given resource from multiple
   sources.
   - The need to police the edge cases to negotiate cardinality.
   - It encourages poor modeling practices, as a reification is often a
   shortcut for modeling entities that should be formally defined.
   - It requires a significant set of additional semantics (predicate
   additions) in RDF itself.

Taking Peter's example:

:Liz :married-to :Dick .
:Liz :married-on "1964-03-15"^^xsd:date.
:Liz :married-to :Eddie .
:Liz :married-on "1959-05-12"^^xsd:date.

This should be modelled as:
:m1 a :Marriage ;
     :firstSpouse :Liz ;
     :secondSpouse :Dick ;
     :startDate "1964-03-15" ;
     :endDate "1959-02-17" .  # Or some appropriate end-date prior to the second marriage
:m2 a :Marriage ;
      :firstSpouse :Liz ;
      :secondSpouse :Eddie ;
      :startDate "1959-05-12" .

We're trying to turn <<:Liz :married-to :Dick>> into a semantic carrying
vehicle, when we're better off declaring a formal structure.
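
As a rough illustration of why the formal structure helps, here is a small Python sketch that treats triples as plain tuples. The helper names objects and start_dates_for_spouse are hypothetical, and I have left the end date out to keep it short:

```python
# Flat modelling: four independent statements about :Liz.
flat = {
    (":Liz", ":married-to", ":Dick"),
    (":Liz", ":married-on", "1964-03-15"),
    (":Liz", ":married-to", ":Eddie"),
    (":Liz", ":married-on", "1959-05-12"),
}

# Entity modelling: each marriage is a resource in its own right.
entities = {
    (":m1", "a", ":Marriage"),
    (":m1", ":firstSpouse", ":Liz"),
    (":m1", ":secondSpouse", ":Dick"),
    (":m1", ":startDate", "1964-03-15"),
    (":m2", "a", ":Marriage"),
    (":m2", ":firstSpouse", ":Liz"),
    (":m2", ":secondSpouse", ":Eddie"),
    (":m2", ":startDate", "1959-05-12"),
}


def objects(triples, subject, predicate):
    """All objects for a given subject and predicate, sorted for stability."""
    return sorted(o for (s, p, o) in triples if s == subject and p == predicate)


def start_dates_for_spouse(triples, spouse):
    """Start dates of every marriage resource whose :secondSpouse is `spouse`."""
    marriages = [s for (s, p, o) in triples
                 if p == ":secondSpouse" and o == spouse]
    return sorted(o for m in marriages
                  for (s, p, o) in triples
                  if s == m and p == ":startDate")


# The flat graph yields two spouses and two dates, with nothing pairing them.
flat_spouses = objects(flat, ":Liz", ":married-to")
flat_dates = objects(flat, ":Liz", ":married-on")

# The entity graph answers the question directly.
dick_dates = start_dates_for_spouse(entities, ":Dick")
print(flat_spouses, flat_dates, dick_dates)
```

In the flat graph the only honest answer to "when did Liz marry Dick?" is both dates; in the entity graph the marriage resource carries the pairing.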

Relate this back to the Neo4j modelling. When I say:

Liz -- marriedTo --> Dick [startDate "1964-03-15",endDate "1959-02-17"] .
Liz -- marriedTo --> Eddie [startDate "1959-05-12"] .

What we're actually doing is hiding a lot of implicit semantics:
Liz -- marriedTo --> Dick
is actually multiple assertions:
There exists an implicit marriage M1
M1 is a marriage entity between Liz and Dick.
It is the (implicit) marriage M1 that is being annotated, not Liz, even if
that is what it appears to be on the surface.
There is a directional implication (which is why I have :firstSpouse,
:secondSpouse in the RDF example, even though the non-directed :spouse
would be more appropriate).

You can argue that the RDF is uglier and more verbose, but that's because
it is also more precise. There are a lot of unstated assumptions made in
Neo4j, which is one reason that models created that way usually get very
ugly conceptually; the hidden semantics come back to bite you.

RDF to a certain extent does this with blank nodes. Syntactically, the
above statements could be rendered:
[ a :Marriage ;
     :firstSpouse :Liz ;
     :secondSpouse :Dick ;
     :startDate "1964-03-15" ;
     :endDate "1959-02-17" ;
     ] .
[ a :Marriage ;
      :firstSpouse :Liz ;
      :secondSpouse :Eddie ;
      :startDate "1959-05-12" ;
      ] .

but in most cases we forget the type association.

Contrast this:
[ :firstSpouse :Liz ; :secondSpouse :Dick ; :startDate  "1964-03-15" ;
:endDate "1959-02-17" ; a :Marriage]

with the Neo4j-esque
Liz -- marriedTo --> Dick [startDate "1964-03-15",endDate "1959-02-17"] .

The RDF is a little more verbose, but that's only because Neo4j is actually
treating what should be an object (:Liz, via the :firstSpouse predicate) as
a subject with no explicit semantics. It is using the property marriedTo as
a carrier of type or class, without formally making that assertion anywhere.
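
A hedged Python sketch of what unfolding such an edge involves. The EDGE_LABEL_TO_CLASS table and the unfold_edge function are hypothetical, and the table is exactly the assertion the edge label never makes on its own:

```python
# Hypothetical mapping from an edge label to the class it silently implies.
# In the LPG form this knowledge lives only in the modeller's head.
EDGE_LABEL_TO_CLASS = {"marriedTo": ":Marriage"}


def unfold_edge(source, label, target, props, node_id):
    """Turn `source -- label --> target [props]` into explicit triples
    about a relationship resource identified by `node_id`."""
    cls = EDGE_LABEL_TO_CLASS[label]  # raises KeyError if nobody declared it
    triples = [
        (node_id, "a", cls),
        (node_id, ":firstSpouse", f":{source}"),
        (node_id, ":secondSpouse", f":{target}"),
    ]
    # Edge properties become statements about the relationship, not about Liz.
    triples += [(node_id, f":{key}", value) for key, value in props.items()]
    return triples


unfolded = unfold_edge("Liz", "marriedTo", "Dick",
                       {"startDate": "1964-03-15"}, ":m1")
for triple in unfolded:
    print(triple)
```

Without an entry in the table there is nothing to unfold the edge into, which is the point: the class has to be stated somewhere.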

Put another way, Neo4j works because it makes naive assumptions, and it
gets into trouble because those naive assumptions don't survive complex
modeling.

This is why we have to be careful about reification, because it does hide
those semantics.

Put another way: << :Liz :married :Dick >>  :startDate "1964" ; etc.  is
explicitly:

[ a rdfs:Class; rdf:subject :Liz; rdf:object :Dick; rdf:predicate :married]
:startDate "1964" ; etc.

with the big assumption that the rdf:predicate value :married can be used
to derive the fact that the class involved is a marriage. It is low-value
semantics and a potentially dangerous shortcut for doing proper modelling,
but it gets you to Neo4j equivalence.

I believe we are arguing at this point because we're looking at Neo4j
without recognizing that its semantics are ill-defined and somewhat
deceptive, and we're trying to satisfy ease of use at the expense of
precision.

So what about the general annotation space where I have two individuals
trying to create annotations on a given statement? This, to me, is THE use
case for reification.

<<:m1 :startDate "1964">> rdf:annotate [
     a :Annotation ;
     Annotation:source "https://www.example.com/LizMarriageArticle.html" ;
     Annotation:fromDate "2024-01-03" ;
     Annotation:by <mailto:janeDoe@gmail.com> ;
], [...].

In this case we ARE annotating an RDF statement (there is probably a
different term used here than rdf:annotate, but the idea should be the
same).
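
A small Python sketch of the two-annotator case, again with triples as tuples. The annotate helper, the second source URL and the second annotator address are all hypothetical placeholders:

```python
from collections import defaultdict

# Each statement maps to a list of independent annotation records.
annotations = defaultdict(list)


def annotate(statement, **fields):
    """Attach one annotation record to a statement without touching
    records left by anyone else."""
    annotations[statement].append(dict(fields))


statement = (":m1", ":startDate", "1964")

annotate(statement,
         source="https://www.example.com/LizMarriageArticle.html",
         fromDate="2024-01-03",
         by="janeDoe@gmail.com")

# Hypothetical second annotator commenting on the same statement.
annotate(statement,
         source="https://www.example.com/another-article.html",
         by="johnRoe@example.com")

for record in annotations[statement]:
    print(record["by"], record["source"])
```

Both records refer to the same statement, and neither has to agree with the other on an identifier for it.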

If you want

*Kurt Cagle*
Editor in Chief
The Cagle Report
kurt.cagle@gmail.com
443-837-8725 <http://voice.google.com/calls?a=nc,%2B14438378725>


On Fri, Apr 12, 2024 at 7:44 AM Peter F. Patel-Schneider <
pfpschneider@gmail.com> wrote:

> Yes, but there appears to be an irreconcilable difference here.
>
>
>
> The situation with quoted triples is actually no different from any other
> case where some pieces of information about a resource need to be kept
> together. For example:
>
> :Liz :married-to :Dick .
> :Liz :married-on "1964-03-15"^^xsd:date.
> :Liz :married-to :Eddie .
> :Liz :married-on "1959-05-12"^^xsd:date.
>
> suffers from exactly the same problem as
>
> << :Liz :spouse :Dick >> :ceremony-location :Montreal.
> << :Liz :spouse :Dick >> :ceremony-date "1964-03-15"^^xsd:date.
> << :Liz :spouse :Dick >> :ceremony-location :Chobe.
> << :Liz :spouse :Dick >> :ceremony-date "1975-10-10"^^xsd:date .
>
> In both cases there need to be extra resources added for accurate
> modelling.
>
> peter
>
>
> On 4/12/24 10:00, Niklas Lindström wrote:
> > On Fri, Apr 12, 2024 at 2:57 PM Peter F. Patel-Schneider
> > <pfpschneider@gmail.com> wrote:
> >>
> >> It seems that the WG is at an impasse.
> >
> > I think we're "just" not in agreement about whether the cardinality of
> > rdf:reifies should conceptually be one or many. Some claim it makes
> > sense, others claim that it deviates from the notion of a reified
> > statement, taken as a "direct relationship instance" (which I presume
> > is what an LPG edge is taken to "denote" in the OneGraph
> > harmonization).
> >
> > It is an important question, since the motivation is to not add
> > something which is then unnecessarily (or by default) used in
> > nonsensical ways, or opens up for accidental complexity. This avoids
> > necessary remodeling if new details crop up, and/or B) integration
> > with data from other sources.
> >
> >> How about reverting to an old situation where there are no reifiers at all,
> >> just quoted triples, and require users to stand off from the triple as required?
> >
> > That depends on whether or not the syntaxes allow them (or worse,
> > encourage them) to be used as subjects (opening up for the seminal
> > error). We came to the proposal of only using them with reifiers since
> > that's when they work with use cases as-is. I.e., we have agreed that
> > this (talking about bare triple terms) is not what use cases call for
> > (not the least of which are the Amazon Neptune use cases with multiple
> > edges [1]), and makes no sense if used as is in all but the most
> > model-theoretical domains of discourse (including for token
> > provenance; the most obvious kind of occurrences-not-types).
> >
> > /Niklas
> >
> > [1]: https://lists.w3.org/Archives/Public/public-rdf-star/2021Dec/att-0001/rdf-star-neptune-use-cases-20211202.pdf
> >
> >
> >> peter
> >>
> >>
> >>
>
>
Received on Friday, 12 April 2024 19:40:47 UTC