Re: An outline of RDFn -- RDF with (auto- and custom-) names from Thomas Lörtsch on 2023-11-28 (public-rdf-star-wg@w3.org from November 2023)

From: Thomas Lörtsch <tl@rat.io>
Date: Tue, 28 Nov 2023 12:42:20 +0100
To: Gregg Kellogg <gregg@greggkellogg.net>
Cc: Souripriya Das <SOURIPRIYA.DAS@oracle.com>, RDF-star WG <public-rdf-star-wg@w3.org>
Message-Id: <88724C37-7A44-47F6-AD26-6EDEA45DFDB7@rat.io>
> On 28. Nov 2023, at 00:00, Gregg Kellogg <gregg@greggkellogg.net> wrote:
> 
>> On Nov 26, 2023, at 7:08 PM, Souripriya Das <SOURIPRIYA.DAS@oracle.com> wrote:
>> 
>> Since I did not hear any comments on RDFn during the first half of our last meeting that I was able to attend (except, maybe, Gregg might have said something right at the beginning but I had audio issues on my side), I thought it may be helpful to mention below a few high-level points about RDFn and how it is related to RDF-star concepts and syntax: ("statement" here simply means "a triple or quad"):
> 
> 
> I did bring it up, at least conceptually, in that it defined a syntax similar to triple terms, where those triples can be identified individually. This, of course, is not RDFn, but it triggered my thoughts.
> 
> When looking at graph term solutions, there was the notion that graphs may also need to be identified (when distinct from named graphs), to distinguish one graph with a set of triples from another. In this sense, both the graph terms and triple terms could be considered to be tokens, rather than types, given the potential for different identifiers. In the case of graph terms, the thought was that a graph may have an internal identifier (similar to a blank node identifier) that allowed the same graph to be referred to elsewhere within a given serialization. The identifiers would have no meaning at the layer of the abstract syntax. My reasoning was that a triple might also have an identifier for use within a given serialization, but would not have one in the abstract syntax. I’ve suggested a hypothetical syntax for this, and it would seem to be necessary for N-Triples/Quads in the case of graph terms.

It seems to me that you are speaking about identifiers for the type, not a particular token, e.g. an identifier that could be a hash of the triple. That would be in line with RDF-star, but still one step short of the token identifier that RDFn provides. It’s that token identifier that is actually needed in almost all use cases, and it's important not to confuse the two.
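
To make the distinction concrete, here is a minimal sketch in Python. It is only an illustration, not how RDFn or RDF-star actually define identifiers: the names `type_id` and `token_id`, the `urn:sha256:` prefix and the `_:t` naming scheme are all assumptions made up for this example. A content-derived identifier names the triple type, so every occurrence gets the same one, whereas a token identifier is minted per occurrence.

```python
import hashlib
from itertools import count
from typing import Tuple

Triple = Tuple[str, str, str]


def type_id(triple: Triple) -> str:
    """Identifier for the triple *type*: derived purely from its content."""
    joined = "\x1f".join(triple).encode("utf-8")
    return "urn:sha256:" + hashlib.sha256(joined).hexdigest()[:16]


_counter = count(1)


def token_id() -> str:
    """Identifier for one *token* (occurrence): freshly minted on every call."""
    return "_:t" + str(next(_counter))


triple = (":alice", ":knows", ":bob")

# Two separate occurrences of the same triple, each with its own name.
first = {"name": token_id(), "triple": triple}
second = {"name": token_id(), "triple": triple}

# Same content, hence the same type identifier.
same_type = type_id(first["triple"]) == type_id(second["triple"])
# Different occurrences, hence different token identifiers.
same_token = first["name"] == second["name"]

print(same_type)   # True
print(same_token)  # False
```

An annotation attached to the hash would apply to every occurrence of the triple at once; only the per-occurrence name lets two annotations on the same triple be kept apart.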

>> On Nov 27, 2023, at 4:40 AM, Olaf Hartig <olaf.hartig@liu.se> wrote:
>> 
>> How do you know that RDFn is about tokens? I have not seen Souri making
>> any explicit statements in this direction.
>> 
>> Also, it is not correct to say that "both approaches add a fifth
>> element to the subject, predicate, object and graph that we already
>> have."  RDF-star does not add a fifth element. Strictly speaking, RDF-
>> star does not even have "graph" as a fourth element--there is no notion
>> of a quad in the abstract syntax of RDF-star (and neither is there any
>> such notion in the abstract syntax of RDF). Instead, RDF-star is about
>> i) triples (which may be nested),
>> ii) graphs as sets of such triples, and
>> iii) datasets as collections of (IRI/bnode, graph) pairs, with an
>> additional graph called the default graph.
>> That is all there is in RDF-star. Adding "a fifth element" (as RDFn
>> seems to do) requires extending the abstract syntax with additional
>> concepts, and that's why "RDFn = RDF-star" is not true.
> 
> I think adding a graph component to a tuple does set RDFn apart, and it’s not something that I’ve seen a proper motivation for in use cases. It’s certainly not something we’ve discussed adequately as a group, and I haven’t seen other proposals that would motivate this, other than potentially Thomas’s “Nested Graphs”. But, in my mind, the nesting comes from triples in a graph referring to another graph by sharing subject or object with the graph name, rather than some new terminology in the abstract syntax. I don’t think this warrants a quintuple concept, though.

CG and WG have indeed not talked enough about graphs, but (named) graphs are very much a reality on the semantic web and must be taken into account. As a fact of life, any solution that only addresses triples becomes a quin solution, because graphs aren’t going anywhere. RDF-star, too, can’t deny that it’s deployed in a world full of graphs. Any such triple-centric solution would be well advised to clarify its relation to graphs. Certainly an RDF WG, being responsible for the whole standard, has the obligation to clarify how the different parts are expected/assumed/designed to work together. Otherwise we are just polluting the standard with competing sub-standards. And make no mistake: even RDFn, although addressing quins, doesn’t provide much guidance on when to annotate the triple or the graph and how the two approaches are meant to interact.

I talked to a prominent ex-RDF-editor recently who said that in their company they use graphs for system management and quoted triples for user-space annotations. That is a possible answer. Of course it has its problems too: if we recommended such a practice we would invalidate (or "disrespect", as Ted harshly put it) any sound application of named graphs out there - and I’m sure the majority of them are rather sound, despite those free-wheeling or early-days applications that follow their own home-grown semantics. There’s also the practical problem that in user space there is a need for the grouping of triples too. It’s along such lines that I’ve come to the conclusion that we should focus on named graphs (and virtually nest them to make their usage flexible enough).

> For my part, though, I think we’re lost in examining multiple “solutions” and we’ll never converge at the rate we’re going. I think the time has come to settle on the minimum requirements of a system we can agree on and move forward; we can come back later (time and energy available) to layer on more components such as triple terms or graph terms in the abstract syntax. I think that means taking a straw poll to see who could live with RDF 1.1 reification plus syntactic sugar in N-Triples/Quads/Turtle/TriG. I’d love to explore solutions that involve graph terms/tokens, but we should probably do first things first.

I think the WG is just starting - finally - to talk about the fundamental issues: types or tokens, singletons or sets, quoted as default or extension. For years, many let themselves be fooled by the apparent simplicity of RDF* and hoped for a straightforward standardization of a straightforward proposal. Now that it dawns on some that things just aren’t that easy, it would be a pity if we gave up again, like with RDF 1.1.

That said, I would agree that it’s better to standardize some syntactic sugar for RDF standard reification than to standardize RDF-star triple terms. However, that would also require some syntactic sugar in SPARQL to actually be useful.
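
For reference, whatever surface syntax such sugar ends up with, it would have to expand to the four triples of RDF standard reification. A minimal sketch of that expansion in Python follows; the helper name `reify`, the blank node label `_:r1` and the `example.org` IRIs are hypothetical, and only the `rdf:Statement`, `rdf:subject`, `rdf:predicate` and `rdf:object` terms come from the RDF vocabulary itself.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(statement_id, triple):
    """Expand one (s, p, o) triple into the four standard reification triples."""
    s, p, o = triple
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
    ]


# Hypothetical example data.
EX = "http://example.org/"
expanded = reify("_:r1", (EX + "alice", EX + "knows", EX + "bob"))

for t in expanded:
    print(t)
```

Note that the expansion describes the statement without asserting it; whether the sugar should also assert the triple is one of the open questions.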

Also, I would favor graph terms over triple terms very much. Graph terms IMO can actually be useful. If they were accompanied by a syntactically straightforward way to assert them we would, through the backdoor (or backwardly), arrive at the nested graph proposal. I will illustrate that in more detail when the time comes.

Anyway, I would very much favor that we continue the discussion about what we actually need - referentially transparent tokens of graphs, IMHO - without that discussion always being tainted by the fact that it might not end well for the RDF-star proposal (as it has been for years now). Based on such a discussion we may be able to meaningfully decide if we want to standardize some syntactic sugar or something more elaborate, or defer the practical work to a future WG. In that context: if the most pressing issue is RDF-LPG interoperability (as it was when this work started) then IMO singleton properties are more promising than syntactic sugar for RDF standard reification, but that should be discussed in more detail.

Anyway, I appreciate very much that our discussions are finally no longer so focused on whether something is pro or contra RDF-star, but on what we need in a more general sense. However, if we misunderstand that discussion as "what can we live with" - i.e. if we reduce it to a purely tactical search for a lowest common denominator - we would again miss the mark.

Thomas

> Gregg
>
Received on Tuesday, 28 November 2023 11:42:37 UTC