Re: Representing Named Triples as Just Triples from Niklas Lindström on 2024-01-11 (public-rdf-star-wg@w3.org from January 2024)

From: Niklas Lindström <lindstream@gmail.com>
Date: Thu, 11 Jan 2024 12:17:58 +0100
To: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-ID: <CADjV5jd1MAgoZRsxq4oyosZwbWCw-rt+un=i_uJ=eeSrvhw1eA@mail.gmail.com>
Further update; TL;DR: Named triples do *not* (AFAICS) have to name
one specific triple. It only "makes sense" to name multiple triples
with the same claim identifier if they hold for the same reason. See
below for how that is so.

(Aside: The below could also answer some, perhaps most, of Andy's
important concerns in [1]. I went from too much in one thread to
perhaps too many threads. I want the subject title to be specific to
the content discussed; but this is hard.)

Full version:

With named triples, a graph in the data model is a set of asserted
and/or named triples. The named triples are not asserted in
and of themselves (conceptually the names denote a *claim*), but
another occurrence of the same abstract triple can also be asserted
(conceptually that is a *fact*). There can only be one such asserted
occurrence per graph (as in RDF 1.1), but many occurrences of the same
triple paired with different names.

Conceptually, the *meaning* of a graph is the union of its *facts* and *claims*.

Addition: Thus there may also be *different triples per name*. So the
functional restriction is not necessary.
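
(Illustration only: a minimal Python sketch of the data model described
above, assuming a graph is just a set of asserted triples plus a set of
(name, triple) pairs. The class and method names, and the `urn:ex:`
IRIs, are made up for this sketch and are not part of any proposal.)

```python
from typing import NamedTuple


class Triple(NamedTuple):
    s: str
    p: str
    o: str


class Graph:
    """A set of asserted triples (facts) plus a set of named triples (claims)."""

    def __init__(self):
        self.asserted = set()  # at most one asserted occurrence per triple
        self.named = set()     # (name, Triple) pairs; no functional restriction

    def assert_triple(self, triple):
        self.asserted.add(triple)

    def name_triple(self, name, triple):
        self.named.add((name, triple))

    def triples_named(self, name):
        return {t for n, t in self.named if n == name}

    def names_of(self, triple):
        return {n for n, t in self.named if t == triple}


g = Graph()
first = Triple("<urn:ex:a>", "<urn:ex:b>", "<urn:ex:c>")
second = Triple("<urn:ex:a>", "<urn:ex:b>", "<urn:ex:d>")
g.name_triple("<t>", first)
g.name_triple("<t>", second)  # same name, different triple
g.name_triple("<u>", first)   # same triple, different name
g.assert_triple(first)
g.assert_triple(first)        # asserting again adds nothing
```

Here `g.triples_named("<t>")` holds both triples, `g.names_of(first)`
holds both names, and `g.asserted` still has a single member.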

So these identically named but structurally different triples:

    <t> | <urn:x:s> <urn:x:p> <urn:x:o> .
    <t> | <urn:x:s> <urn:x:p> <urn:x:o2> .

Can be encoded as (now using rdf:value and a new datatype for the
lexical string representation of a triple):

    <t> rdf:value "<urn:x:s> <urn:x:p> <urn:x:o>"^^rdfx:lexicalTriple .
    <t> rdf:value "<urn:x:s> <urn:x:p> <urn:x:o2>"^^rdfx:lexicalTriple .
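
(A rough Python sketch of that encoding step, assuming the terms are
already given in their serialised form. The function name is
hypothetical, and the output keeps the abbreviated `rdf:`/`rdfx:`
notation used in this message rather than full IRIs.)

```python
def encode_named_triple(name, s, p, o):
    """Encode `name | s p o` as one rdf:value statement with a typed literal."""
    lexical = f"{s} {p} {o}"
    # Escape backslashes and double quotes so a literal object survives
    # being embedded inside the outer quoted string.
    escaped = lexical.replace("\\", "\\\\").replace('"', '\\"')
    return f'{name} rdf:value "{escaped}"^^rdfx:lexicalTriple .'


plain = encode_named_triple("<t>", "<urn:x:s>", "<urn:x:p>", "<urn:x:o>")
with_literal = encode_named_triple("<t>", "<urn:x:s>", "<urn:x:p>", '"l"^^<urn:x:d>')
```

The second call shows why escaping matters: the inner quotes of a
literal object have to be written as `\"` inside the lexical form.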

To be decided: these can either entail, or must be accompanied by,
the following reification to fulfil the condition of representing a
named triple:

    <t> rdf:subject <urn:x:s> .
    <t> rdf:predicate <urn:x:p> .
    <t> rdf:object <urn:x:o> .

    <t> rdf:subject <urn:x:s> .
    <t> rdf:predicate <urn:x:p> .
    <t> rdf:object <urn:x:o2> .

Conceptually, this means that the claim is about all of these triples
at once. The triples are distinct, but the meaning is claimed to be
the same (i.e. the same claim, or the same fact if that is also
asserted about `<t>`).

(Note that the reification is not "well-formed" (it is
"oversaturated", if you will), and is not in itself enough to pick
apart the original triples themselves named by `<t>`. This sets named
triples, describing claims or facts, apart from *just* reification. It
also makes them somewhat more like named graphs, but with different
characteristics, e.g. not necessarily asserted, inextricably coupled to
one graph (itself named or not), *and* reasonably only naming triples
coupled for *the right reasons*; see below.)

(This is like a literal as a resource, which can have two lexical
values meaning the same thing.)
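
(To illustrate the "oversaturated" point: a small Python sketch,
assuming two hypothetical triples that differ in every position and
share one name. Flattening them into rdf:subject/rdf:predicate/rdf:object
statements leaves more candidate triples than were originally named, so
the reification alone cannot pick the originals apart. The function
names and `urn:ex:` IRIs are made up.)

```python
from itertools import product


def reify(name, triples):
    """Flatten every triple named `name` into reification statements."""
    out = set()
    for s, p, o in triples:
        out |= {
            (name, "rdf:subject", s),
            (name, "rdf:predicate", p),
            (name, "rdf:object", o),
        }
    return out


def candidates(name, reification):
    """All triples that the reification of `name` is compatible with."""
    def values(prop):
        return sorted(v for n, q, v in reification if n == name and q == prop)

    return set(product(values("rdf:subject"),
                       values("rdf:predicate"),
                       values("rdf:object")))


originals = {
    ("<urn:ex:s1>", "<urn:ex:p1>", "<urn:ex:o1>"),
    ("<urn:ex:s2>", "<urn:ex:p2>", "<urn:ex:o2>"),
}
statements = reify("<t>", originals)
possible = candidates("<t>", statements)
```

With two subjects, two predicates and two objects, `possible` contains
every combination, of which only two were actually named.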

It is also possible to check if a claim is for The Right Reasons. If
all of its triples are entailed by at least one of them, then the
others are necessary truths whenever the claim is true, contingent
only upon that one triple.

(Aside: I think this pretty much reflects what the notion of a *truth
maker* (fact) with truth bearers (triples) is.)

For instance, given:

    <a> | <elisabeth> ex:marriedTo <richard> .
    <b> | <richard> ex:marriedTo <elisabeth> .
    <c> | <elisabeth> a ex:Spouse .
    <d> | <richard> a ex:Spouse .

All of `<a>`, `<b>`, `<c>` and `<d>` *may* denote one claim (or actual
fact), for the right reasons, if:

    ex:marriedTo a owl:SymmetricProperty ;
      rdfs:domain ex:Spouse .

They don't *have to* be the same claim, of course. But it makes sense
to declare them as owl:sameAs, directly or simply by stating them to be
of type rdfsx:Fact, which, as defined using owl:hasKey, would be
inferred to be the same under OWL entailment.
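
(A minimal Python sketch of the Right Reasons check for this example,
assuming only the two rules stated above: ex:marriedTo is symmetric and
has domain ex:Spouse. This is a toy forward-chainer, not an OWL
reasoner; the function names are hypothetical, and `a` is written out
as rdf:type.)

```python
SYMMETRIC = {"ex:marriedTo"}
DOMAIN = {"ex:marriedTo": "ex:Spouse"}


def closure(triples):
    """Forward-chain the symmetry and domain rules until nothing new appears."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            if p in SYMMETRIC:
                new.add((o, p, s))
            if p in DOMAIN:
                new.add((s, "rdf:type", DOMAIN[p]))
        if new <= known:
            return known
        known |= new


def right_reasons(triples):
    """True if some single triple in the set entails all of the others."""
    triples = set(triples)
    return any(triples <= closure({t}) for t in triples)


a = ("<elisabeth>", "ex:marriedTo", "<richard>")
b = ("<richard>", "ex:marriedTo", "<elisabeth>")
c = ("<elisabeth>", "rdf:type", "ex:Spouse")
d = ("<richard>", "rdf:type", "ex:Spouse")

all_four = right_reasons({a, b, c, d})
types_only = right_reasons({c, d})
```

Under these rules `all_four` is true (everything follows from `a`, or
from `b`), while `types_only` is false, since neither type statement
entails the other.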

Likewise, a claim can be *nonsense* in the same way. E.g. with:

    ex:Batchelor owl:disjointWith ex:Spouse .

This (as in what `<d>` denotes, *not* the named triples themselves),
would be nonsense:

    <d> | <richard> a ex:Spouse .
    <d> | <richard> a ex:Batchelor .

As would a graph that asserted them.

But this would not necessarily be so:

    <e> | <richard> a ex:Batchelor .

It would be decidedly conflicting with `<d>` though.
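
(A small Python sketch of the nonsense and conflict checks, assuming
the only axiom is the disjointness stated above. For the conflict test
it assumes `<d>` names just the single ex:Spouse triple, as in the
marriage example. The function names are made up, and `a` is written
out as rdf:type.)

```python
DISJOINT = {frozenset({"ex:Batchelor", "ex:Spouse"})}


def is_nonsense(triples):
    """True if any subject is typed with two classes declared disjoint."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
    return any(pair <= classes
               for classes in types.values()
               for pair in DISJOINT)


def conflicts(claim_a, claim_b):
    """True if each claim is sensible alone but nonsense when combined."""
    return (not is_nonsense(claim_a)
            and not is_nonsense(claim_b)
            and is_nonsense(set(claim_a) | set(claim_b)))


spouse = ("<richard>", "rdf:type", "ex:Spouse")
batchelor = ("<richard>", "rdf:type", "ex:Batchelor")

both_under_one_name = is_nonsense({spouse, batchelor})
e_alone = is_nonsense({batchelor})
d_versus_e = conflicts({spouse}, {batchelor})
```

So naming both triples with one identifier is nonsense, `<e>` on its
own is not, and the two single-triple claims conflict with each other.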

Questions:
* Would an rdfx:Claim only be a *proper* rdfsx:Fact if it is for the
right reasons?
* Could a proper rdfsx:Fact be automatically asserted under some
entailment extension? Probably not, since a graph can contain sense or
nonsense, it is not up to the facts to be decidedly asserted or not...

(NOTE it might be tempting to infer named triples for all asserted
triples in a graph. But obviously that would result in an infinite,
recursive explosion of named triples, since each inferred triple would
then also entail another named triple.)
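
(A rough Python sketch of why that never terminates, under the
assumption that each inferred name is itself materialised as an
ordinary triple in the rdf:value encoding. The function name and the
`_:n` identifiers are hypothetical.)

```python
from itertools import count


def naming_round(graph, named, fresh_ids):
    """Name every triple that has no name yet; return the triples added."""
    added = set()
    for triple in sorted(t for t in graph if t not in named):
        name = f"_:n{next(fresh_ids)}"
        named[triple] = name
        # The naming statement is itself a new, still unnamed, triple.
        added.add((name, "rdf:value", " ".join(triple)))
    graph |= added
    return added


graph = {("<urn:ex:a>", "<urn:ex:b>", "<urn:ex:c>")}
named = {}
ids = count()
added_per_round = [len(naming_round(graph, named, ids)) for _ in range(5)]
```

Every round adds at least one new triple, so a fixpoint is never
reached; the loop above only stops because it is capped at five rounds.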

Best regards,
Niklas

[1]: <https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0053.html>


On Thu, Jan 11, 2024 at 3:18 AM Niklas Lindström <lindstream@gmail.com> wrote:
>
> This is an update to the Named Triples as Claims variant. (Using
> terminology added to [1], and following [2].)
>
> (I've yet to fully compare it to the Occurrence Set variant Andy just
> posted [3]. They seem to have a lot of the same goals, vary in what is
> added to the abstract syntax and data model, possibly vary on
> many-to-one or many-to-many for occurrence-to-triple, and on opacity.)
>
> Named triples can be represented as just triples, through a predicate
> for the lexical triple representation.
>
> This named triple:
>
>     <t> | <urn:x:s> <urn:x:p> "l"^^<urn:x:d> .
>
> Can be encoded as:
>
>     <t> rdfx:lexicalTriple "<urn:x:s> <urn:x:p> \"l\"^^<urn:x:d>" .
>
> This would *only* be for use in N-Triples, to keep named triples as
> just triples "over the wire". Other syntaxes supporting RDF-star would
> parse directly into named triples.
>
> (For RDF 1.2 Basic though, this does "expose" raw N-Triples also in
> other syntaxes.)
>
> These lexical forms could be dropped upon parsing as named triples.
> They might still be entailed; just as the transparent meaning of named
> triples in the graph needs to be entailed, as a regular reification,
> for this design to work in `owl:sameAs`-dependent use cases [2].
>
> Per the functional requirement, a different value of
> `rdfx:lexicalTriple` for the same subject is not allowed to produce the
> same name for a different triple. That would be the same restriction
> as for the direct syntax of named triples. But it might not be
> necessary to halt on it. The raw `rdfx:lexicalTriple` triple could be
> kept as is, along with an issued warning. (A parser option could
> control this behaviour, for use in linters, etc.)
>
> Blank nodes have been seen as problematic in unstarring, since while a
> parser can use the blank node remapping function, the mapping would
> not survive roundtripping through RDF 1.1 (nor RDF 1.2 Basic) parsers.
> But skolemization can be used to manifest the mapping, by also
> associating the bnode with the genid lexical form:
>
> So, given:
>
>     <t> | <urn:x:s> <urn:x:p> _:b .
>
> that would become:
>
>     <t> rdfx:lexicalTriple "<urn:x:s> <urn:x:p> <.../.well-known/genid/d26a2d0e98334696f4ad70a677abc1f6>"^^rdfx:lexicalTriple .
>
>     _:b rdfx:genid ".../.well-known/genid/d26a2d0e98334696f4ad70a677abc1f6" .
>
> (That need not be functional, but it should be inverse-functional.)
>
> The RDF data model with named triples would be (a variant of RDFn [4]):
>
>     graph             ::= (triple|namedTriple)*
>     triple            ::= subject predicate object
>     subject           ::= IRI | BlankNode
>     predicate         ::= IRI
>     object            ::= term
>     term              ::= IRI | BlankNode | Literal
>     namedTriple       ::= identifier triple
>     identifier        ::= IRI | BlankNode
>
> (The difference from RDF 1.1 is the addition of namedTriple, and
> allowing that in graph.)
>
> Conceptually, graphs can contain both asserted triples and claims
> thereof (as named triples). In the example above, `<t>` denotes the
> *claim* of that named triple.
>
> To query for these, SPARQL-star syntax may be required. But a regular
> reification can be entailed from a named triple, which is also needed
> to support transparency in this design.
>
> Alternatively, an actual reification could be required alongside the
> lexicalTriple form above in N-Triples, and RDF-star could be sugar for
> reification with this additional data model and conceptual
> interpretation. That would *extend* reification to also support
> quotation. It would result in more triples in the raw data of course,
> but it would not require reifications to be, or stay, "well-formed".
> (They can have multiple subjects, predicates or objects, and distinct
> occurrences are still preserved, through the named triple in the
> abstract syntax, and through the lexical form in N-triples.)
>
> A claim can be stated as a rdfsx:Fact, meaning that it denotes the
> fact, i.e. the actual asserted triple in the graph. Using OWL, it is
> then restricted to that denotation (through owl:hasKey on the
> reification parts, there can only be one).
>
> Best regards,
> Niklas
>
> [1]: <https://github.com/w3c/rdf-star-wg/wiki/Triple%E2%80%90Edge-subgroup-proposals>
> [2]: <https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0047.html>
> [3]: <https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0050.html>
> [4]: <https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/att-0008/RDFn_Abstract_Syntax_and_Concepts.pdf>
Received on Thursday, 11 January 2024 11:18:31 UTC