Selective Transparency of Claims from Niklas Lindström on 2024-01-09 (public-rdf-star-wg@w3.org from January 2024)

From: Niklas Lindström <lindstream@gmail.com>
Date: Tue, 9 Jan 2024 11:04:24 +0100
To: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-ID: <CADjV5jfSTsVqaHHSCSEUPy8CEhsW7VGjRAA6HDz-mzUKb5oA=Q@mail.gmail.com>
In the now ongoing proposal [1], a triple itself, as an abstract
object, needs to be referentially opaque (like in the CG report), in
the sense that it is not known which interpreted meaning it has (it is
not *in* the graph).

With named triples this must also be the case by default. This is
clearly so, since if this:

    <x> | <s> <p> <o> .
    <y> | <s> <p> <o2> .
    <o> owl:sameAs <o2> .

would also imply:

    <x> owl:sameAs <y> .

then, obviously, anything stated about `<x>` is also stated about
`<y>`, and vice versa. Thus, claims cannot be transparent and remain
distinct under such entailment.

Pierre-Antoine also noted this before [2], here simply illustrated by:

    <x> | <s> <p> <o> .
    <y> | <s> <p> <o> .

along with the tautological:

    <o> owl:sameAs <o> .

which with a transparent default would entail:

    <x> owl:sameAs <y> .

and thus every name of a triple would be the *same* claim.
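
To make the collapse concrete, here is a minimal Python sketch of what a transparent default would do. Everything in it is an assumption made for illustration and not part of the proposal: the dict-based model of named triples, the plain-string node names, and the hypothetical helpers `canon` and `collapse_transparent`.

```python
# Hypothetical model: a named triple maps a name to an (s, p, o) tuple.
named = {"x": ("s", "p", "o"), "y": ("s", "p", "o2")}
same_as = [("o", "o2")]  # <o> owl:sameAs <o2>

def canon(node, pairs):
    """Return a canonical representative of node's owl:sameAs class."""
    group = {node}
    changed = True
    while changed:
        changed = False
        for a, b in pairs:
            # Exactly one end is in the group: pull the other end in.
            if (a in group) != (b in group):
                group |= {a, b}
                changed = True
    return min(group)

def collapse_transparent(named, pairs):
    """Group names whose triples are equal up to owl:sameAs."""
    groups = {}
    for name, triple in named.items():
        key = tuple(canon(n, pairs) for n in triple)
        groups.setdefault(key, set()).add(name)
    return list(groups.values())

# Under a transparent default, <x> and <y> end up in the same group.
merged = collapse_transparent(named, same_as)
```

With identical triples and no `owl:sameAs` statement at all, the same grouping occurs, which is the second case above.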

But given the use cases, we also need *some* claims to be transparent.
What gives?

Conceptually, a named triple is an occurrence whose name denotes a
claim, and again this triple is not necessarily asserted in the graph.
Two claims of the same triple, denoted by two different names, need to
be distinct. Thus they need to be opaque, unless otherwise stated. But
the meaning of the claim should still be interpretable, so that we can
reason about it (e.g. determine if we believe it).

Let's step back from OWL for a moment.

Without that, `owl:sameAs` is just a relationship. We just need claims
to be queryable to be able to find that they talk about the same
resources (i.e. resources linked via `owl:sameAs`). And this is
already the case if SPARQL is extended to support querying for named
occurrences, along the lines of:

    SELECT ?claim ?p ?o {
        << ?claim | <s> <p> ?x >> ?p ?o .
        ?x owl:sameAs* <o2> .
    }
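
Since the `<< ?claim | ... >>` query syntax is only a suggestion, the matching it describes can be approximated in plain Python. The sketch below rests on assumptions: the data structures and the helpers `reaches` and `claims_about` are hypothetical, the annotations `<p2>`/`<p3>` are sample data, and `owl:sameAs*` is read as a forward path of zero or more steps, as in SPARQL 1.1 property paths.

```python
# Named (unasserted) triples, kept apart from the asserted graph.
named = {"x": ("s", "p", "o"), "y": ("s", "p", "o2")}
asserted = {("o", "owl:sameAs", "o2"), ("x", "p2", "a"), ("y", "p3", "b")}

def reaches(start, target, graph):
    """True if target is reachable from start via zero or more owl:sameAs steps."""
    seen, todo = set(), [start]
    while todo:
        node = todo.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        todo.extend(o for s, p, o in graph if s == node and p == "owl:sameAs")
    return False

def claims_about(target, named, graph):
    """Rows (?claim, ?p, ?o) for claims of <s> <p> ?x where ?x owl:sameAs* target."""
    rows = []
    for claim, (s, p, x) in named.items():
        if (s, p) == ("s", "p") and reaches(x, target, graph):
            rows.extend((claim, prop, val) for c, prop, val in graph if c == claim)
    return sorted(rows)

rows = claims_about("o2", named, asserted)
```

Both names are found, yet nothing here identifies `<x>` with `<y>`.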

But we can also pave the path for conditional, full transparency under
OWL entailment, through a simpler entailment.

If an entailment is defined that makes these:

    <x> | <s> <p> <o> .
    <y> | <s> <p> <o> .

entail:

    <x> rdf:subject <s> ;
        rdf:predicate <p> ;
        rdf:object <o> .

    <y> rdf:subject <s> ;
        rdf:predicate <p> ;
        rdf:object <o> .

That would both preserve the distinction between `<x>` and `<y>` and
still make their meaning transparent in the simplest sense (through
good old reification). (Intuitively, the *parts* of a claim described
but not asserted in a graph are interpreted. It still preserves a
distinct identity, and the triple it names is just what it is.)

(See the 2005 Named Graphs paper by Carroll et al. [3] for how such an
entailment can be defined (in section 3.3 "RDF Reification", about
*named triples*). We could call this TS-entailment, for
triple-statement-entailment.)
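
As a rough illustration of what such an entailment would produce, here is a naive Python sketch. It is not the definition from the Named Graphs paper, only a direct reading of the example above; the dict model and the `ts_entail` helper are hypothetical.

```python
def ts_entail(named):
    """Expand each named triple into its three reification triples."""
    out = set()
    for name, (s, p, o) in named.items():
        out |= {
            (name, "rdf:subject", s),
            (name, "rdf:predicate", p),
            (name, "rdf:object", o),
        }
    return out

# Two names for the same triple stay two distinct subjects.
named = {"x": ("s", "p", "o"), "y": ("s", "p", "o")}
entailed = ts_entail(named)
```

Nothing in the output relates `<x>` to `<y>`, so the two names remain distinct while their parts become visible.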

And under OWL entailment, that is also enough to, *if* so desired,
define `<x>` and `<y>` as denoting the *same* resource, simply by
typing them with a class defined as:

    rdfs:Fact a owl:Class ;
        owl:hasKey (rdf:subject rdf:predicate rdf:object) .

(ISSUE: I'm not sure if we can define a new RDFS term using OWL
semantics? Given that RDFS itself is defined as an `owl:Ontology`, I
presume it would be OK.)

Conceptually, a resource of this type is then not just the claim of a
relationship, but the actual relationship it expresses, as a fact, in
an interpretation of the graph it is asserted in.

(Furthermore, with this, the notion of transparency-enabling
properties could be achieved simply by defining their `rdfs:domain` to
be an `rdfs:Fact`.)
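
The effect of the key can be sketched in Python as follows. This is a simplified illustration, not an OWL reasoner: it only checks that two resources typed `rdfs:Fact` share a value for each key property, and the helpers `reify`, `values` and `same_by_key` are hypothetical names.

```python
FACT_KEY = ("rdf:subject", "rdf:predicate", "rdf:object")

def reify(name, s, p, o):
    """Reification triples for one named triple."""
    return {(name, "rdf:subject", s), (name, "rdf:predicate", p),
            (name, "rdf:object", o)}

def values(graph, name, prop):
    return {o for s, p, o in graph if s == name and p == prop}

def same_by_key(graph):
    """Pairs of rdfs:Fact resources that share a value for every key property."""
    facts = sorted(s for s, p, o in graph
                   if p == "rdf:type" and o == "rdfs:Fact")
    pairs = []
    for i, a in enumerate(facts):
        for b in facts[i + 1:]:
            if all(values(graph, a, k) & values(graph, b, k) for k in FACT_KEY):
                pairs.append((a, b))
    return pairs

# <x> and <y> are typed as facts; <z> names the same triple but stays untyped.
graph = reify("x", "s", "p", "o") | reify("y", "s", "p", "o") | reify("z", "s", "p", "o")
graph |= {("x", "rdf:type", "rdfs:Fact"), ("y", "rdf:type", "rdfs:Fact")}
pairs = same_by_key(graph)
```

Only the typed names are identified; `<z>` keeps its own identity even though it names the same triple.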

The notion of opacity/transparency as occurrence equality is only
relevant under an entailment that interprets `owl:sameAs`. Granted,
the above also requires support for `owl:hasKey` [4], so it does
require more of OWL to achieve "full transparency" as in "these names
denote the same fact". This isn't necessarily much of a problem
though, as querying on the entailed reification might be direct enough
in practice in limited entailment environments. Specifically, given:

    <x> | <s> <p> <o> .
    <y> | <s> <p> <o2> .

    <x> <p2> <a> .
    <y> <p3> <b> .

    <o> owl:sameAs <o2> .

With just TS- and `owl:sameAs`-entailment we would also get:

    <x> rdf:subject <s> ;
        rdf:predicate <p> ;
        rdf:object <o> , <o2> .

    <y> rdf:subject <s> ;
        rdf:predicate <p> ;
        rdf:object <o2> , <o> .

And thus a query for "What is said about claims concerning `<o2>`?":

    SELECT ?claim ?p ?o {
        ?claim rdf:subject | rdf:object <o2> .
        ?claim ?p ?o .
    }

would yield (leaving out the entailed reification triples themselves,
which `?claim ?p ?o` also matches):

    | ?claim | ?p | ?o |
    | <x> | <p2> | <a> |
    | <y> | <p3> | <b> |

Even though without full OWL support we lack means for entailing that
these are the same claim, the above appears workable. (Note that here,
these are *not* stated as the same! They have to be of type
`rdfs:Fact` for that to be true, per the above definition of that
class using `owl:hasKey`.)
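
The whole example can be traced with a small Python sketch. It is a simplification under stated assumptions: `owl:sameAs` is treated as symmetric and transitive and is substituted in subject and object position only, the reification step follows the example rather than a formal definition, and all helper names are hypothetical.

```python
REIF = ("rdf:subject", "rdf:predicate", "rdf:object")
named = {"x": ("s", "p", "o"), "y": ("s", "p", "o2")}
asserted = {("x", "p2", "a"), ("y", "p3", "b"), ("o", "owl:sameAs", "o2")}

def same_as_classes(graph):
    """Map each node to the set of nodes it is owl:sameAs."""
    classes = {}
    for s, p, o in graph:
        if p == "owl:sameAs":
            merged = classes.get(s, {s}) | classes.get(o, {o})
            for node in merged:
                classes[node] = merged
    return classes

def entail(named, asserted):
    """Add reification triples, then substitute owl:sameAs equivalents."""
    graph = set(asserted)
    for name, triple in named.items():
        graph |= {(name, prop, node) for prop, node in zip(REIF, triple)}
    classes = same_as_classes(graph)
    return {(s2, p, o2) for s, p, o in graph
            for s2 in classes.get(s, {s}) for o2 in classes.get(o, {o})}

def said_about_claims_on(node, graph):
    """What is said about claims whose subject or object is node.

    The reification triples themselves are filtered out of the rows.
    """
    claims = {s for s, p, o in graph
              if p in ("rdf:subject", "rdf:object") and o == node}
    return sorted((s, p, o) for s, p, o in graph
                  if s in claims and p not in REIF)

graph = entail(named, asserted)
rows = said_about_claims_on("o2", graph)
```

The rows cover both `<x>` and `<y>`, while the graph contains no `owl:sameAs` statement between them.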

(ISSUE: Note the use of the AlternativePath operator [5] in the query. I
think it is uncomfortably close to the currently suggested "naming
operator", so I really think it should change. E.g. to `:=` or `|-`
(but not to `:-`, the old iso operator from N3 [6]). But that's for a
separate thread on syntax.)

Best regards,
Niklas

[1]: <https://github.com/w3c/rdf-star-wg/wiki/Triple%E2%80%90Edge-subgroup-proposals>
[2]: <https://github.com/w3c/rdf-star/issues/200#issuecomment-1091408888>
[3]: <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3199260>
[4]: <https://www.w3.org/TR/2012/REC-owl2-rdf-based-semantics-20121211/#Semantic_Conditions_for_Keys>
[5]: <https://www.w3.org/TR/sparql11-query/#propertypaths>
[6]: <http://infomesh.net/2002/notation3/#iso>
Received on Tuesday, 9 January 2024 10:04:57 UTC