To See a Graph In a Grain of Triples

Dear all,

The following is a weekend musing, not a real proposal. If you have
the time, I hope it might interest you. In fairness, I wrote this to
explore triple terms, just as I've been exploring named graphs. It
began as an "anti-pattern" which just kept bugging me. It is about how
you can represent entire graphs using just one triple term, in a
convoluted but complete fashion. Or...

What if graph terms were actually defined *as* triple terms?

Given my proclaimed preference for named graphs over triple terms, it
may seem odd that I suggest this. Well, I *am* still unconvinced of
either triple or graph terms being *required* for the use cases we
have. But terms appear needed for the "maths" to play out; for a sound
foundation for semantics. And my motivation is foremost to close the
gap all the way from reification, through named graphs and to triple
terms. Unification of their concepts is an alluring puzzle.

The most serious motivation is that I am still a bit worried about
singleton graphs not *really* being triples. Since, if they're not, we
may be indirecting away from LPGs and triple provenance as done in
Wikidata (as Peter also expressed worry about). With the following,
they really are, but the cost is conceptually rather high.

So here is an exercise in thinking of graph terms as triple terms. It
*may* inform the abstract syntax in some way, though I'm more keen on
working with what is currently brewing for graph terms (the quad
structure, etc.).

(Below, the prefix `rdfx:` is used as the "extension namespace for
unstable suggestions"; as proposed by Andy in [1].)

Using triple terms:

    << <s> :p 1 >> a rdfx:Triple .

... we can make pairs of two:

    <<
      << <s> :p 1 >> rdfx:union << <s> :p 2 >>
    >> a rdfx:Graph .

An rdfx:Graph is an rdfx:Triple forming the union of two triples. The
rdfx:union predicate asserts that these have a combined meaning, in
the same graph, i.e. are "transparent to each other".

... or three:

    <<
      << <s> :p 1 >> rdfx:union <<
        << <s> :p 2 >> rdfx:union << <s> :p 3 >>
      >>
    >> a rdfx:Graph .

... four:

    <<
      << <s> :p 1 >> rdfx:union <<
        << <s> :p 2 >> rdfx:union <<
          << <s> :p 3 >> rdfx:union << <s> :p 4 >>
        >>
      >>
    >> a rdfx:Graph .

You see the pattern. *Any* graph can be encoded like that in *one* triple term:

    <<
      << <s> :p 1 >> rdfx:union <<
        << <s> :p 2 >> rdfx:union <<
          << <s> :p 3 >> rdfx:union <<
            << <s> :p 4 >> rdfx:union << <s> :p 5 >>
          >>
        >>
      >>
    >> a rdfx:Graph .
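
The pattern above amounts to a right fold over the triples. A minimal Python sketch of that fold follows; the `Triple` tuple, `graph_term` and `show` are hypothetical helpers made up for illustration, and returning a lone triple unchanged (and rejecting the empty graph) is an assumption, since the mail only defines the pairing for two or more triples.

```python
from typing import NamedTuple

UNION = "rdfx:union"  # predicate name from this mail, not a standardized IRI


class Triple(NamedTuple):
    """A triple (or triple term); s and o may themselves be Triples."""
    s: object
    p: object
    o: object


def graph_term(triples):
    """Fold a non-empty sequence of triples into one right-nested union term."""
    items = list(triples)
    if not items:
        raise ValueError("assumption: the empty graph has no encoding here")
    term = items[-1]  # innermost element of the chain
    for t in reversed(items[:-1]):
        term = Triple(t, UNION, term)  # wrap outwards, one pair at a time
    return term


def show(term):
    """Render a term in the << s p o >> notation used in this mail."""
    if isinstance(term, Triple):
        return f"<< {show(term.s)} {show(term.p)} {show(term.o)} >>"
    return str(term)


triples = [Triple("<s>", ":p", n) for n in (1, 2, 3)]
encoded = graph_term(triples)
print(show(encoded))
```

For three triples this prints the same nesting as the "three" example further up, on a single line.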

If rdfx:union is defined as not only symmetric (commutative) but also
associative, so that triple pairing is fully order-independent (since
graphs are unordered sets), any reordering should be entailed, e.g.:

    <<
      << <s> :p 3 >> rdfx:union <<
        << <s> :p 1 >> rdfx:union <<
          << <s> :p 5 >> rdfx:union <<
            << <s> :p 2 >> rdfx:union << <s> :p 4 >>
          >>
        >>
      >>
    >> a rdfx:Graph .

Now, the last two triple terms are *not* identical, although "as
graphs" they are. There is a structural difference, but it is
meaningless given a sufficiently precise definition of rdfx:union.
Perhaps this still invalidates the idea (and we just use graph terms
also for single triples), or perhaps a canonicalized order of the
"union sequence" can define *one* canonically structured triple term
for each possible graph.
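
One way to picture the canonicalization option is to flatten a union term into its set of member triples, sort them, and fold them back. The Python sketch below does that under loud assumptions: `Triple`, `leaves` and `canonical` are hypothetical names, sorting on `repr` is only a stand-in for a real term ordering, and blank nodes (which make real RDF canonicalization hard) are ignored entirely.

```python
from typing import NamedTuple

UNION = "rdfx:union"  # predicate name from this mail, not a standardized IRI


class Triple(NamedTuple):
    s: object
    p: object
    o: object


def leaves(term):
    """Yield the member triples of a union term, whatever its nesting."""
    if isinstance(term, Triple) and term.p == UNION:
        yield from leaves(term.s)
        yield from leaves(term.o)
    else:
        yield term


def canonical(term):
    """Rebuild the term as a right-nested chain in sorted member order."""
    # set() drops duplicates (graphs are sets); repr is a placeholder ordering
    ordered = sorted(set(leaves(term)), key=repr)
    result = ordered[-1]
    for t in reversed(ordered[:-1]):
        result = Triple(t, UNION, result)
    return result


def t(n):
    return Triple("<s>", ":p", n)


# The two five-triple terms from the examples above
a = Triple(t(1), UNION, Triple(t(2), UNION, Triple(t(3), UNION, Triple(t(4), UNION, t(5)))))
b = Triple(t(3), UNION, Triple(t(1), UNION, Triple(t(5), UNION, Triple(t(2), UNION, t(4)))))

print(a == b)                        # structurally different terms
print(canonical(a) == canonical(b))  # same term once canonicalized
```

Note that this sketch cannot tell a member triple that itself uses rdfx:union as its predicate from a pairing node; a real definition would have to settle that.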

There are two things that strike me personally as palatable here, but
which are *not* universally appreciated. First: I can now pick out the
subject, predicate and object of all of these terms, and by the
predicate immediately see if it's a graph or "just" a triple. And the
rest of the graph can be neatly traversed following the `rdfx:union`
chain. (Did I mention I actually like that RDF lists are cons cells?
On the theoretical level only of course; I regularly use arrays to
*handle* them (e.g. through JSON-LD).)
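
The cons-cell reading can be made concrete: the subject of each pairing plays the role of rdf:first and the object the role of rdf:rest. A small Python sketch, assuming a right-nested chain as in the examples above (`Triple`, `is_graph` and `walk` are hypothetical helpers, not part of any library):

```python
from typing import NamedTuple

UNION = "rdfx:union"  # predicate name from this mail, not a standardized IRI


class Triple(NamedTuple):
    s: object
    p: object
    o: object


def is_graph(term):
    """A term is a graph pairing if its predicate is rdfx:union."""
    return isinstance(term, Triple) and term.p == UNION


def walk(term):
    """Yield member triples of a right-nested union chain, head first."""
    while is_graph(term):
        yield term.s   # head of the cell, like rdf:first
        term = term.o  # tail of the cell, like rdf:rest
    yield term         # the last element is a plain triple


chain = Triple(Triple("<s>", ":p", 1), UNION,
               Triple(Triple("<s>", ":p", 2), UNION, Triple("<s>", ":p", 3)))
members = list(walk(chain))
for m in members:
    print(m.s, m.p, m.o)
```

In practice the result would be handled as the plain list `members`, much like arrays are used for RDF lists.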

Second: since triple terms *could* be represented by just reification
(i.e. three triples), all of RDF, named graphs and even multiple
datasets can be neatly constructed from just triples. That would be
quite absurd in practice (just as representing literals as typed nodes
with a lexical value list of Unicode glyphs would be a tad
meticulous), so again this just appeals to a desire for a simple,
theoretical underpinning (triples all the way down). It wouldn't be
less of an eyesore when written out:

    [] a rdfx:Graph ;
      rdf:subject [ rdf:subject <s> ; rdf:predicate :p ; rdf:object 1 ] ;
      rdf:predicate rdfx:union ;
      rdf:object [
          rdf:subject [ rdf:subject <s> ; rdf:predicate :p ; rdf:object 2 ] ;
          rdf:predicate rdfx:union ;
          rdf:object [ rdf:subject <s> ; rdf:predicate :p ; rdf:object 3 ] ] .
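
The expansion into reification triples is mechanical, which a short Python sketch can show. It assumes nested terms are modelled as a hypothetical `Triple` tuple and that each term gets a fresh blank node label; `reify` is an illustrative helper, not an existing API, and whether classic reification really captures triple terms is left open here.

```python
from itertools import count
from typing import NamedTuple


class Triple(NamedTuple):
    s: object
    p: object
    o: object


def reify(term, out, ids):
    """Return a node standing for `term`, appending flat triples to `out`."""
    if not isinstance(term, Triple):
        return term  # IRIs and literals stand for themselves
    node = f"_:b{next(ids)}"  # fresh blank node label for this term
    out.append((node, "rdf:subject", reify(term.s, out, ids)))
    out.append((node, "rdf:predicate", term.p))
    out.append((node, "rdf:object", reify(term.o, out, ids)))
    return node


UNION = "rdfx:union"  # predicate name from this mail, not a standardized IRI
term = Triple(Triple("<s>", ":p", 1), UNION, Triple("<s>", ":p", 2))

flat = []
root = reify(term, flat, count())
flat.append((root, "rdf:type", "rdfx:Graph"))
for row in flat:
    print(*row)
```

Every row in `flat` is an ordinary triple with no nesting left, which is the "triples all the way down" point.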

Aside: You can *almost* define lists using just triple terms; but of
course that's even more awkward and entangled.

    <<
      << [] rdf:first 1 >> rdf:next <<
        << [] rdf:first 2 >> rdf:next <<
          << [] rdf:first 3 >> rdf:next <<
            << [] rdf:first 4 >> rdf:next << [] rdf:first 5 >>
          >>
        >>
      >>
    >> a rdf:List .

(That was mostly to show what the recursive term definition enables,
for better or worse. This is more of the anti-pattern which made me
doubt the soundness of such terms, as it undermines the flatness of
N-Triples. It could be argued that it's odd for RDF to gain such
atomic terms before real list terms; but I won't.)

So of course, as initially mentioned, it is crucial to note that the
above way of representing graphs using triple terms is for conceptual
illustration only. In no way would this be a preferable way to
explicitly encode graph terms, for which we already have quads and
braces (albeit not easily in the subject position). And it may very
well be much too theoretically low-level for the abstract syntax.

I'm merely exploring the abstraction space, from named graphs being
resources linked to mathematical graph terms, to those actually being
triple terms of triple terms. The types are then "instantiated" as
occurrences (or, the types represent a mathematically formal
"meaning", just as an IRI as a reference stands for the referent). In
my work the platonic space of types is a fleeting mirage ("the light
hurts my eyes"), and I commonly deal with ("shadowy") tokens down in
the gritty space of particulars, syntaxes and code. (But a little
light is useful to shed some meaning on the landscape.)

When these graphs are related to tokens (and made ergonomic through
concrete syntaxes), we get the useful annotation and "quotation"
forms, and enable usable means to encode both detailed provenance and
"additional marginalia". That is, we can talk about these gritty
real-world occurrences, linking them, in the background, to "platonic"
terms using a predicate like rdfx:tokenOf or rdfx:entails (where the
rdfs:domain is rdfs:Resource).

Crucially, these tokens can be kept in the foreground: named graphs,
RDF-star annotations, and perhaps, I hope, the suggested quotation
dash shorthand [2] as the unasserted token companion to annotation.

(Side notes: Here is another one of Pat's thoughts from 12 years ago
in the "Graphs and Being and Time" mail thread [3], which resonates a
lot with this. I gather RDF 1.1 ended up incorporating some of it, but
we are now revisiting more of those ideas. Tangentially, I was further
inspired when I read up on Abelian groups [4]. The root of my thinking
probably stems from Typographical Number Theory by Douglas Hofstadter
[5].)

Cheers,
Niklas

[1]: https://lists.w3.org/Archives/Public/public-rdf-star/2021Mar/0049.html
[2]: https://gist.github.com/niklasl/4f52c32ef2d888c172c8584e36c24610#proposal-new-quoted-form
[3]: https://lists.w3.org/Archives/Public/public-rdf-wg/2011Feb/0060.html
[4]: https://en.wikipedia.org/wiki/Abelian_group
[5]: https://en.wikipedia.org/wiki/Typographical_Number_Theory

Received on Sunday, 15 October 2023 10:52:36 UTC