Just well-formed statement tokens? from Niklas Lindström on 2024-05-13 (public-rdf-star-wg@w3.org from May 2024)

From: Niklas Lindström <lindstream@gmail.com>
Date: Mon, 13 May 2024 23:48:39 +0200
To: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-ID: <CADjV5jeYHUbfLJs8gMA1oekhDZxQpJJG_Hxey-a3fY-_yXk=pA@mail.gmail.com>

Dear all,

I think we should compare two quite different choices.

Either we continue with real reification (i.e. reifiers/truth-makers,
being many-to-many), *or* build everything on a strong notion of
well-formedness. With the latter, we can use RDF properties to build
up composites. RDF lists are a precursor for that, so are the
statement tokens of classic reification.

Opaque quotation just came back on the table, requiring a syntactic
functional well-formedness for an rdf:statementOf property with a
range of rdf:TripleLiteral (or something to that effect). That does
not work for transparency-requiring use cases, nor for anything using
bnodes (unless some contortions are possible). It could be more useful
to go for a "fortified" option 1: a semantic functional
well-formedness for rdf:subject, rdf:predicate and rdf:object,
comprising a statement token resource. (Noting that RDF lists have
worked in OWL-aware environments through "good behaviour" alone; and
without any kind of opacity. It seems entailed triples aren't often
accidentally put back into the same graph?) Adding a strong enough
notion of well-formedness should pave the path for more efficient
implementations of both statements and lists.
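
To make the "fortified" option 1 a bit more concrete, here is a
minimal sketch of a functional check over rdf:subject, rdf:predicate
and rdf:object. The exact rules are not spelled out above, so the
reading "exactly one value for each of the three properties" is an
assumption, as are the function name and the example.org data. The
sketch only looks at asserted triples; a truly semantic check would
presumably also have to consider terms that denote the same thing.

```python
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
COMPONENTS = (RDF + "subject", RDF + "predicate", RDF + "object")


def check_statement_tokens(triples):
    """Return {token: [problems]} for tokens that are not well-formed.

    Assumed rule (not from any spec): a statement token must have
    exactly one value for each of rdf:subject, rdf:predicate, rdf:object.
    """
    values = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        if p in COMPONENTS:
            values[s][p].add(o)

    problems = {}
    for token, by_prop in values.items():
        issues = []
        for prop in COMPONENTS:
            count = len(by_prop.get(prop, ()))
            if count != 1:
                name = prop[len(RDF):]
                issues.append(f"{name}: expected 1 value, found {count}")
        if issues:
            problems[token] = issues
    return problems


# Hypothetical example data: t1 is well-formed, t2 is not.
EX = "http://example.org/"
graph = {
    (EX + "t1", RDF + "subject", EX + "alice"),
    (EX + "t1", RDF + "predicate", EX + "knows"),
    (EX + "t1", RDF + "object", EX + "bob"),
    (EX + "t2", RDF + "subject", EX + "alice"),
    (EX + "t2", RDF + "subject", EX + "carol"),  # second subject
    (EX + "t2", RDF + "predicate", EX + "knows"),  # and no object at all
}

if __name__ == "__main__":
    for token, issues in check_statement_tokens(graph).items():
        print(token, issues)
```

A store that can rely on such a check passing could, in principle,
keep each token as one compact internal record rather than three
separate triples.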

Of course, neither of these statement tokens can be our "reifiers".
Statement-only proponents argue that for those we should link from
statement tokens to concrete entities (marriages, publications, etc.).
It is important to recognize that this requires all qualification use
cases (those that talk about something more concrete behind the
abstract statement) to take into account that these are decidedly
statement token resources, nothing more. For LPG edge data, which is
decidedly flat, indirect properties or informal "punning" on a
conflation of statement and qualification would probably be the only
practical way. This is the drawback of any kind of "just about
statements" approach (including singleton properties).

But there is no need for triple terms if we choose this route.
Well-formedness is enough, and "data integrity and interoperability
expectations are off" if you break it. RDF-star implementations have a
lead in optimization for storage and querying, as long as they expose
their internal triple terms as token resources with rdf:subject,
rdf:predicate and rdf:object (as "virtual" predicates). It's
certainly true that the composites aren't atomic in triple streams,
but the names of the composites could be "tagged as owned" as part of
the well-formedness rules.
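
As a rough illustration of the "virtual" predicates idea, the sketch
below keeps statement tokens as internal (s, p, o) records and only
materializes rdf:subject, rdf:predicate and rdf:object triples when
queried. The class and method names are hypothetical, and treating
"tagged as owned" as "component triples for a token name may only be
set through the token itself" is one possible reading, not a settled
rule.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
COMPONENTS = (RDF + "subject", RDF + "predicate", RDF + "object")


class TokenStore:
    """Hypothetical store: ordinary triples plus compact statement tokens."""

    def __init__(self):
        self.triples = set()
        self.tokens = {}  # token name -> internal (s, p, o) record

    def add_token(self, name, s, p, o):
        # An owned name may only ever be bound to a single statement.
        if name in self.tokens and self.tokens[name] != (s, p, o):
            raise ValueError(f"{name} is already bound to another statement")
        self.tokens[name] = (s, p, o)

    def add(self, s, p, o):
        # Component triples are virtual; asserting them directly is
        # rejected so that well-formedness cannot be broken from outside.
        if p in COMPONENTS:
            raise ValueError("component triples must be added via add_token")
        self.triples.add((s, p, o))

    def _all(self):
        yield from self.triples
        for name, (s, p, o) in self.tokens.items():
            yield (name, RDF + "subject", s)
            yield (name, RDF + "predicate", p)
            yield (name, RDF + "object", o)

    def match(self, s=None, p=None, o=None):
        """Yield triples matching the pattern; None acts as a wildcard."""
        pattern = (s, p, o)
        for triple in self._all():
            if all(q is None or q == v for q, v in zip(pattern, triple)):
                yield triple


if __name__ == "__main__":
    store = TokenStore()
    store.add_token("ex:t1", "ex:alice", "ex:knows", "ex:bob")
    store.add("ex:t1", "ex:since", "2020")
    for triple in store.match(s="ex:t1"):
        print(triple)
```

From the outside this looks like plain RDF 1.1 triples about a token
resource, while the store is free to index the internal records
however it likes.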

At least this is much less of a change to the existing RDF 1.1
ecosystem, keeping its abstract syntax and requiring mostly surface
work up front, while paving the way for efficiency by specifying
well-formedness.

Best regards,
Niklas

Received on Monday, 13 May 2024 21:49:11 UTC