Re: A proposal for basing quotation and annotation upon blank graphs

Dear Dörthe,

(And Pierre-Antoine: your proposals sound great and convergent to me.
I've added some details below touching upon that.)

(Also, Andy: I hope my thinking here converges with your reasoning.
See mentions of graph terms and "specially named graphs for RDF 1.1"
in my reply below...)

On Fri, Oct 6, 2023 at 6:56 PM Doerthe Arndt
<doerthe.arndt@tu-dresden.de> wrote:
>
> Dear Niklas,
>
> I now read your proposal and apologize for not having it done earlier (I only read the second part and thought that I have seen all).

I'm just happy you read it through!

> I now realize that you are so far only playing with the idea of going towards named graphs and I think that this is interesting.

Yes. One thing I need to clarify (which Andy pointed out I was fuzzy
about): I do see the distinction between "named graphs" as "(name,
graph) pairs" and "graphs themselves" as "graph terms", i.e.
mathematical (platonic) triple sets. I mostly focused on the "named
thing" in that pair, and explored using that for our potential
benefit. That's the "token" part, and I sometimes carelessly called
that a "graph" (perhaps "a diagram" is better, but that's just one
kind of occurrence...).

> I would like to go this direction, but I am at the same time afraid that we cannot fix a semantics for named graphs, but if we could, that would totally be the best way to go. So hopefully I am wrong? :)

I think we can define a minimal semantics, wherein named graphs *are*
pairs of *something* and a mathematical graph. And that *something* is
what we may also define quotation and annotation upon.
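
To make the "pair" reading concrete, here is a minimal sketch in
Python. The names (NamedGraph, same_graph) are hypothetical and only
illustrate the token/type split; terms are plain strings, which is an
assumption of the sketch, not part of any proposal.

```python
from typing import FrozenSet, NamedTuple, Tuple

Triple = Tuple[str, str, str]

class NamedGraph(NamedTuple):
    name: str                  # the "something" (the token side)
    graph: FrozenSet[Triple]   # the mathematical triple set (the type side)

def same_graph(a: NamedGraph, b: NamedGraph) -> bool:
    """True when two named graphs pair different names with one triple set."""
    return a.graph == b.graph
```

Two pairs with different names can thus share one and the same graph,
while still being distinct as pairs.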

> I do not think that it makes sense to come up with a special semantics for blank graphs as opposed to IRI graphs since it could be that we already have blank graphs „in the wild“ which are used differently.

Yes, it may be too brittle to gamble on that alone; we reasonably need
proper graph terms somewhere in the equation. I can imagine
provisional measures for existing tools and platforms though
(attempting to build on Pierre-Antoine's proposals, and hopefully
supporting Andy's forward-thinking).

The main challenge is basically to ensure that RDF 1.2 works with
"atomic updates of a graph with quotes", which proper graph terms
should be able to support. For RDF 1.1 systems, that can be
"unstarred" *either* to just blank nodes (unless too brittle), *or* to
"specially named graphs", reminiscent of skolemized bnodes, but built
from the unique triple set signature. (I outlined a recipe for that in
<https://github.com/w3c/rdf-ucr/issues/19#issuecomment-1704211540>. I
think Pierre-Antoine's rdf:tokenOf works with this (I was going for
:entails in my recent text), and a link using that may be manifested
in "quote occurrences".)
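
As a rough illustration of "specially named graphs", the following
Python sketch derives a graph name from a hash over the sorted triple
set. It is not the recipe from the linked comment. It assumes ground
triples only (no blank nodes, so no canonicalization is needed), terms
already written in N-Triples form, and a made-up urn:example:graph-sig:
prefix.

```python
import hashlib

# Hypothetical IRI prefix for "specially named graphs"; not from any spec.
SIG_PREFIX = "urn:example:graph-sig:"

def graph_signature_name(triples):
    """Derive an order-independent name from a set of ground triples."""
    # Deduplicate and sort the N-Triples-like lines before hashing.
    lines = sorted({f"{s} {p} {o} ." for (s, p, o) in triples})
    digest = hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()
    return f"<{SIG_PREFIX}{digest}>"

def unstar(quoted_triples, annotations):
    """Rewrite annotations on a quote into RDF 1.1 quads (s, p, o, graph)."""
    name = graph_signature_name(quoted_triples)
    quads = [(s, p, o, name) for (s, p, o) in sorted(set(quoted_triples))]
    # Annotations go in the default graph (graph slot None), about the name.
    quads += [(name, p, o, None) for (p, o) in annotations]
    return name, quads
```

The same triple set always yields the same name, regardless of order
or repetition, which is the property that lets two systems agree on
the name without coordination.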

To stay within the serialization and storage confines of quads, graph
terms can be "reference-counted constants" (which as per above can be
*implemented* as specially named graphs which are removed when nothing
speaks about them). I need to think about how this relates to Andy's
proposal for making both the "tokens" and "types" accessible over the
web.
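
A minimal sketch of the "removed when nothing speaks about them" part,
in Python. QuadStore and the urn:example:graph-sig: prefix are
hypothetical, and the cleanup is a naive sweep over all quads rather
than real reference counters; a graph term that is mentioned inside
its own graph would never be collected here.

```python
# Hypothetical prefix marking "specially named graphs" (graph terms).
GRAPH_TERM_PREFIX = "<urn:example:graph-sig:"

class QuadStore:
    def __init__(self):
        self.quads = set()  # (subject, predicate, object, graph-or-None)

    def add(self, s, p, o, g=None):
        self.quads.add((s, p, o, g))

    def remove(self, s, p, o, g=None):
        self.quads.discard((s, p, o, g))
        self._collect()

    def _collect(self):
        """Drop specially named graphs that no subject or object refers to."""
        while True:
            referenced = {term for (s, _, o, _) in self.quads for term in (s, o)}
            dead = {
                g for (_, _, _, g) in self.quads
                if g is not None
                and g.startswith(GRAPH_TERM_PREFIX)
                and g not in referenced
            }
            if not dead:
                return
            # Removing a graph may orphan graph terms it mentioned, so loop.
            self.quads = {q for q in self.quads if q[3] not in dead}
```

Ordinary named graphs are left alone; only names carrying the special
prefix are treated as constants that live and die with their mentions.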

> Maybe it is easier to define the semantics of singleton named graphs, and then go that path, but I am not convinced here.

I doubt that too. It *may* be simpler to add concrete syntax for
"singleton graphs" though, since making constant terms for them is
easier (implementation-wise) without relying on full RDF C14N. Once
that is a stable Rec, a future RDF spec could build on that for full
graph quotation support.

> Depending where we go, I could even imagine to align with N3, but in N3 we partly need the opacity because we have built-in functions acting on syntactical graph representations, but even there we could try to find a work around if we really want to go that path.

I think this could be done, at least in part. Full support for "just"
graph terms may be feasible, in 1.2 or in 1.3. And opacity may be a
matter of dataset semantics controlled by symbols (e.g. types), as I
speculated upon in my text. So instead of TEPs (being something
separate just for quoted triples) this kind of behaviour could be
defined for RDF datasets rather than being left to the "whim" of
different implementations with different controls thereof. (I'm not
personally convinced of all of N3 as RDF, as I'm unsure about
variables in RDF proper. But that is certainly beyond this WG, and
requires much more consideration.)

> So, as a summary: while I disagree with some of the details — I think for example that two occurrences of graph terms <<:s :p :o>> should have the same meaning and I do not follow Enrico’s argument that this is just like modal logic here because in modal logic we deal with logical operators while we only have predicates here, but I am looking forward to the discussion and Enrico convincing me of the opposite and I see how this could be done via named graphs

I do agree that << <s> :p <o> >> appears as a unique triple identity,
which is what I think is the "mathematical" point of view (making for
a wholesome abstract syntax, just as IRIs are references, not "proper
names", and literals denote themselves). I just can't see which
use-cases *need* to speak directly about them (to me they can be "in
the background").

> — I do agree that we should decide now whether we dare to go broader (and I see that Thomas gives a clear: yes? :) ).

I think RDF-star inadvertently has gone broader simply by introducing
"quotation", rather than just making simple syntax for reification. I
think that is powerful, but, evidently, more work to get right.

> I also think that we should take the decision to go or not to go for named graphs/N3 before discussing the details (which is difficult, because I can tell that we all enjoy detail discussions).

I'd phrase it as: let's go for graph terms, and define the relation
from "their names" in a backwards-compatible way, ensuring that we
cater for the use cases we need to support. That is not easy, but I
think we have it within reach, and I think both Pierre-Antoine and
Andy are building towards this in their recent proposals.

> Or do we only go that far that we say which of the proposed named graphs semantics would align with our RDF-star semantics following your blank graph rewriting while still defining a proper semantics for triple terms?

We may be able to work through "what named graphs are" in steps. And
RDF 1.2 need not be all steps at once. But we need to ensure that
these semantics converge and neither thwart existing usage nor block
future convergence.

So I think we need to continue to explore the graph option thoroughly,
and I think we're progressing here. Otherwise, we may have to let go
of (conditional) opacity altogether and accept occurrences only. Then all
of RDF-star is *just* syntactic sugar for rdf:Statement nodes, i.e.
plain old reification. This is the simplest option, but perhaps the
weakest, least satisfactory. Almost all use cases can be covered by
it, albeit arguably in a fashion as brittle as RDF lists (unless
handled e.g. using owl:hasKey, as I pointed out in my proposal). (The
thread at <https://lists.w3.org/Archives/Public/public-rdf-star/2020Oct/0054.html>
is a good read for revisiting that option.)
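
For reference, the "just syntactic sugar" reading can be sketched in
Python as below. The reify function and the string-based terms are
hypothetical; only the rdf:Statement, rdf:subject, rdf:predicate and
rdf:object vocabulary comes from RDF itself. Each call mints a fresh
blank node, so two occurrences of the same quoted triple share nothing
unless something else (such as a key) ties them together.

```python
import itertools

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
_fresh = itertools.count()  # source of fresh blank node labels

def reify(quoted, annotations):
    """Expand an annotated quoted triple into classic reification triples."""
    s, p, o = quoted
    node = f"_:r{next(_fresh)}"  # one new occurrence node per call
    triples = [
        (node, f"<{RDF}type>", f"<{RDF}Statement>"),
        (node, f"<{RDF}subject>", s),
        (node, f"<{RDF}predicate>", p),
        (node, f"<{RDF}object>", o),
    ]
    # The annotations are stated about the occurrence node, not the triple.
    triples += [(node, ap, ao) for (ap, ao) in annotations]
    return node, triples
```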

Note also my section on "Unify Singleton Graphs and Reification":
<https://gist.github.com/niklasl/4f52c32ef2d888c172c8584e36c24610#unify-singleton-graphs-and-reification>.
At some point I think we should consider if that is a feasible thing
to define, if we go this route. That *could* unify reification with
named graphs, as occurrences of singleton graph terms. (I'm just a bit
worried that defining something seemingly at odds with ZF set theory
may be trouble down the road?)

All the best,
Niklas

Received on Thursday, 12 October 2023 13:39:11 UTC