Re: Consolidating triple/edges from Andy Seaborne on 2023-12-15 (public-rdf-star-wg@w3.org from December 2023)

From: Andy Seaborne <andy@apache.org>
Date: Fri, 15 Dec 2023 12:57:43 +0000
To: Thomas Lörtsch <tl@rat.io>
Cc: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-ID: <d1a4f0f8-ca7e-4382-9e8b-441b5f3b3555@apache.org>
Thomas,

Responding to some more of your points:

On 14/12/2023 16:46, Thomas Lörtsch wrote:
> In principle I agree, although I have a few things to add and modify. Basically, if we decide to go the pragmatic route and standardize only an LPG-oriented subset of annotation functionality, we still have to be sure that we don’t block future extensions to a complete solution. That requires us to think many things through (like graphs, quotation). However, actually standardizing those other things is then not much extra work.
> 
> A few comments inline, a more coherent take at the end.
> 
>> On 12. Dec 2023, at 21:59, Andy Seaborne <andy@apache.org> wrote:
>>
>> Here is an attempt to write out the details of what I think has been said recently.
>>
>> It is addressing "publishing information about multi-edges".
>> (Ideas here are from WG members - the mistakes are mine.)
>>
>>
>> Multiple edges with the same label are handled as multiple occurrences - the predicate URI of the RDF triple is thought of as a conceptual relationship - with multiple sets of annotations.
>>
>> This preserves the uniqueness of triples in a graph, and allows independent collections of assertions about a relationship. Such collections of assertions do not get entangled on merge.
> 
> Just to emphasize that there is one more important differentiation to make: does the occurrence identify a specific triple or does it refer to the abstract statement.

It is the (potential) usage of the triple. I think this was called a 
"claim" as contrasted to an "assertion" (i.e. a fact) in early RDF 
(~1.0) discussions.

Triples (abstract/type) _occur_ or "are used" in graphs, and it is that 
usage/occurrence that is being annotated, separately from the universal 
concept the triple represents.
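
A minimal Python sketch of the type/occurrence split described above. The
class names, the prefixed names and the annotation keys are hypothetical,
chosen only for illustration; this is not a proposed data model.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class Triple:
    """An abstract triple (the type): two triples with equal parts are equal."""
    s: str
    p: str
    o: str


_ids = count(1)  # source of fresh identifiers for occurrences


@dataclass
class Occurrence:
    """One usage of a triple: a distinct resource with its own annotations."""
    triple: Triple
    annotations: dict = field(default_factory=dict)
    id: int = field(default_factory=lambda: next(_ids))


t = Triple(":alice", ":worksFor", ":acme")

# Set semantics: adding an equal triple again does not create a second one.
graph = {t, Triple(":alice", ":worksFor", ":acme")}

# Two occurrences of the same triple, each with an independent annotation set.
occ1 = Occurrence(t, {":since": "2019"})
occ2 = Occurrence(t, {":since": "2022", ":role": "CTO"})
```

The graph still holds one triple, while `occ1` and `occ2` stay separate, so
their annotation sets are not mixed together.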

There may be better names than "occurrence" for the concept. 
Previously, "usage" and "mention" have come up.

> That has ramifications in use cases for unasserted assertions (multiple annotations, some of them meant to refer to an unasserted statement), updates (delete the statement with the annotation, or is another annotation still referring to it) etc.

> In more abstract terms the question is if statements are understood as types already when authoring/storing or only later when querying/reasoning. It’s essentially a question of early vs late optimization, and RDF’s set semantics - while practical, fundamental to RDF and not to be impaired - is only a result of the former, nothing holy in itself. So working around it IS okay.

In the abstract data model of RDF, it's types.

Occurrences are a resource in the domain of discourse.
They are literal-like in that they self-describe.

> ## SPARQL sugar
> 
> You compare the occurrence-based shortcut relation to syntactic sugar for RDF lists, which is fine, except that querying those lists is a hardship. Same for RDF/XML’s syntactic support for RDF standard reification. Any kind of RDF syntactic sugar also needs proper support in SPARQL to be effective in practice.

Yes - SPARQL needs syntax support.

The Turtle syntax would be replicated in SPARQL. "Turtle with holes".

As per the CG SPARQL-star approach, the fixed form of a triple term 
means accessor functions can be defined.
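
As a rough illustration of why a fixed form allows accessors, here is a
minimal Python sketch. The function names mirror the SUBJECT, PREDICATE,
OBJECT and isTRIPLE functions of the CG report, but the `TripleTerm` class
and the example terms are assumptions made for this sketch only.

```python
from typing import Any, NamedTuple


class TripleTerm(NamedTuple):
    """A triple term with a fixed three-part shape; parts may be triple terms."""
    s: Any
    p: Any
    o: Any


def is_triple(term: Any) -> bool:
    """True if the term is a triple term rather than an IRI or literal."""
    return isinstance(term, TripleTerm)


def subject(term: TripleTerm) -> Any:
    return term.s


def predicate(term: TripleTerm) -> Any:
    return term.p


def object_(term: TripleTerm) -> Any:  # trailing underscore avoids the builtin
    return term.o


inner = TripleTerm(":alice", ":worksFor", ":acme")
outer = TripleTerm(inner, ":statedBy", ":bob")
```

Because every triple term has exactly three positions, the accessors compose:
`subject(subject(outer))` reaches `:alice` without any parsing step.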

> ## Graph Terms vs Named Graphs
> 
> I like Adrian’s example [0] of a complicated named graph based application and I’m taking that seriously. However it should also be clear that triple/graph terms in the end are always stored in a way very similar to named graphs. There is just no other way in a quad based system. Triple/graph terms can be represented as named graphs, named graphs can be represented as graph terms. It’s a practical question of how to encode belonging/membership: syntactically as nested graphs, via a new term type as in RDF-star that transforms a triple into a term at the surface (but NOT in the underlying storage layer, for obvious performance reasons)

FYI Jena stores triple terms in the term table, not in named graphs.
Other systems do this as well.
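
To make the idea concrete, a toy Python sketch of a term table that interns
triple terms alongside ordinary terms. This is NOT Jena's implementation;
the class, its methods and the example terms are all hypothetical.

```python
class TermTable:
    """Toy term table: every term, including a triple term, gets an integer id."""

    def __init__(self):
        self._ids = {}    # term key -> id
        self._terms = []  # id -> term key

    def intern(self, term):
        # A triple term is given as a 3-tuple; it is stored as a tuple of the
        # ids of its components, so nested triple terms work recursively.
        if isinstance(term, tuple):
            term = tuple(self.intern(part) for part in term)
        if term not in self._ids:
            self._ids[term] = len(self._terms)
            self._terms.append(term)
        return self._ids[term]

    def lookup(self, term_id):
        return self._terms[term_id]


table = TermTable()
alice = table.intern(":alice")
tt = table.intern((":alice", ":worksFor", ":acme"))

# An annotation triple can now use the id of the triple term as its subject.
annotation = (tt, table.intern(":since"), table.intern('"2019"'))
```

In this sketch the triple term is just another row in the table, and no named
graph is involved in storing it.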

>, via explicit binding relations as Niklas proposes [1] (and as Dydra implements nested graphs), etc. The main question is how to ensure that those binding relations don’t get lost in the process, but that IMHO is true for any solution. Nested graphs can be serialized to graph terms, which are just an extension of triple terms. That requires an additional en/de-coding step to fit them into an environment that reserves named graphs to its own purposes. That extra step is the price that those applications have to pay for being so particular about their use of named graphs. That’s only fair, and probably still economical for them.
> 
> 
> ## Term types vs Datatypes
> 
> The most fundamental grievance with RDF-star is the introduction of a new term type when a new datatype of type RDF/TTL would suffice.

A way in which triple terms are not literals+datatype is that a new 
datatype would make the triple term opaque, cf. ""^^xsd:anyURI.
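
A small Python sketch of the opacity point. The datatype name `ex:turtle` is
hypothetical, and both classes are simplified stand-ins: the literal is
compared by lexical form and datatype only, as RDF literal terms are.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A datatyped literal: the term is its lexical form plus a datatype IRI."""
    lexical: str
    datatype: str


@dataclass(frozen=True)
class TripleTerm:
    """A structured triple term: the three parts are themselves terms."""
    s: str
    p: str
    o: str


# The same triple written with different spacing gives two different literals,
# and the parts are reachable only by parsing the string.
a = Literal(":s :p :o", "ex:turtle")
b = Literal(":s  :p  :o", "ex:turtle")

# As triple terms the two are the same term, and the parts are directly visible.
x = TripleTerm(":s", ":p", ":o")
y = TripleTerm(":s", ":p", ":o")
```

Here `a != b` although they describe the same triple, while `x == y` and
`x.s` is available without any datatype-specific parsing.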

     Andy

> 
> 
> Best,
> Thomas
> 
> 
> 
> [0] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0019.html
> [1] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Nov/0032.html
> 
> 
>>     Andy
>>
>> [1]
>> https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0024.html
>>
>> [2] https://w3c.github.io/rdf-concepts/spec/#section-triples
>>     (as of 2023-12-10)
>>
Received on Friday, 15 December 2023 12:57:51 UTC