Re: Consolidating triple/edges from Olaf Hartig on 2023-12-15 (public-rdf-star-wg@w3.org from December 2023)

From: Olaf Hartig <olaf.hartig@liu.se>
Date: Fri, 15 Dec 2023 14:33:52 +0000
To: "tl@rat.io" <tl@rat.io>
CC: "andy@apache.org" <andy@apache.org>, "public-rdf-star-wg@w3.org" <public-rdf-star-wg@w3.org>
Message-ID: <f404ea9e6370e6a3103aa55c6432bf96300ce5b4.camel@liu.se>

On Fri, 2023-12-15 at 13:57 +0100, Thomas Lörtsch wrote:
> > On 15. Dec 2023, at 00:01, Olaf Hartig <olaf.hartig@liu.se> wrote:
> >
> > Thomas,
> >
> > Dec 14, 2023 17:50:48 Thomas Lörtsch <tl@rat.io>:
> > > [...]
> > >
> > > ## Tokens vs Types
> > >
> > > I’d like to completely turn the table on tokens vs types: a
> > > reference to the type has to explicitly address the type. A
> > > > relation reciprocal to rdfx:occurrenceOf can achieve that e.g.
> > >
> > >    :T rdfx:typeOf << :s :p :o >>
> >
> > If I understand correctly, you would like the subexpression  << :s
> > :p :o >>  in the previous line to be considered as a token of the
> > triple type (:s, :p, :o)?
>
> I’m not sure where you see the difference between
> '<< :s :p :o >>'  and '(:s, :p, :o)'.

The latter was meant to be the mathematical concept of a triple
consisting of the elements :s, :p, and :o. This is how RDF triples are
represented in the abstract syntax.

The former is a string that you can write in a Turtle-star file or in
an N-Triples-star file (and perhaps later also in a Turtle v1.2 file and
an N-Triples v1.2 file, if the WG adopts the extensions of the Turtle/
N-Triples grammar as introduced in the RDF-star CG report).

Which concept(s) of the abstract syntax such a string is meant to be a
serialization of needs to be defined. The CG report defines it to be a
serialization of a quoted triple, and the CG report understands quoted
triples as types. Based on your email I was under the impression that
you wanted to propose to change these definitions, and my question was
asking for an explicit confirmation that this was indeed your
intention. I understand now that it was not.
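
To make the distinction above concrete, here is a minimal sketch in
Python. It is only an illustration: the names Triple and
serialize_quoted are hypothetical and do not come from the CG report or
from any RDF library. The tuple stands for the abstract triple
(:s, :p, :o); the function produces one possible string for it in the
chevron style.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Abstract-syntax triple: three terms, no surface syntax (hypothetical model)."""
    s: str
    p: str
    o: str


def serialize_quoted(t: Triple) -> str:
    """Render the triple as a chevron-delimited string, as in Turtle-star."""
    return f"<< {t.s} {t.p} {t.o} >>"


abstract = Triple(":s", ":p", ":o")   # the mathematical object
text = serialize_quoted(abstract)     # a string one could write in a file

# Under the type reading, two triples with equal components are the same triple.
same_type = Triple(":s", ":p", ":o") == abstract
```

What the string is a serialization of is a matter of definition; the
sketch simply follows the CG report reading, in which it stands for the
triple as a type.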

> The triple term/quoted triple/embedded triple, syntactically
> represented as a triple within a pair of chevrons in RDF*/star, has
> always been defined as referring to the type.

Right, and that's what I was assuming you are proposing to change.

> '<< :s :p :o >>' can - absent any other information - only refer to
> the type. The property 'rdfx:typeOf' lets the reference ':T' refer to
> that abstract type, not encumbered by any earthly connotations. Not
> that I see _many_ applications of it, but there surely are _some_, so
> it seems prudent to have that covered.
>
> > And you would like the meaning of the previous line to be that the
> > IRI :T is meant to denote this triple type?
>
> Yes
>
> > If so, what would you expect to happen if someone writes the
> > following?
> >
> >    :T rdfx:typeOf << :s1 :p1 :o1 >> .
> >    :T rdfx:typeOf << :s2 :p2 :o2 >> .
> >
> > Which triple type would :T denote in this case (if any)?
>
> In the strictly monotonic world of RDF :T would then probably refer
> to a little graph term.

I don't think this works, but I also think there is no point going
further into this discussion for now.

> Note how Pierre-Antoine in the Lotico talk used a list of quoted
> triples to emulate a quoted graph (illustrating the application of
> RDF-star to the Superman problem). Maybe it could be interpreted
> along those lines.
>
> > > OTOH, any reference to << :s :p :o >> is defined to implicitly
> > > reference a token
> >
> > Always the same token or always a different one?
>
> Always a different one!
>
> > In other words, should the two occurrences of the subexpression  <<
> > :s :p :o >>  in the following two lines be understood to
> > "implicitly reference" the same token or two different tokens?
> >
> >    << :s :p :o >> :p2 :o2 .
> >    << :s :p :o >> :p3 :o3 .
>
> Always a different one, and that’s indeed crucial (I pointed that out
> in the nested graph proposal too).

In this case, I cannot see how it would be possible to make more than
one annotation statement for each token? (If you attempt to answer this
question based on an example, please write the example either in terms
of the abstract syntax or the N-Triples-star format, but not in
Turtle-star.)
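
The concern can be sketched at the level of the abstract syntax. The
following Python fragment is an assumption-laden illustration, not a
proposal made in this thread: it takes "always a different one"
literally and mints a fresh blank node for every occurrence of a quoted
triple in subject position. The helper names fresh_token and expand are
hypothetical; rdfx:occurrenceOf is the property name used earlier in the
thread.

```python
import itertools

_counter = itertools.count(1)


def fresh_token() -> str:
    """Mint a new blank node label for one occurrence (hypothetical helper)."""
    return f"_:t{next(_counter)}"


def expand(statements):
    """Expand (quoted_triple, predicate, object) statements so that every
    occurrence of a quoted triple gets its own token."""
    out = []
    for quoted, p, o in statements:
        token = fresh_token()
        out.append((token, "rdfx:occurrenceOf", quoted))
        out.append((token, p, o))
    return out


# The two lines from the example above, written as abstract tuples.
statements = [
    ((":s", ":p", ":o"), ":p2", ":o2"),
    ((":s", ":p", ":o"), ":p3", ":o3"),
]
triples = expand(statements)

tokens = sorted({s for s, p, _ in triples if p == "rdfx:occurrenceOf"})
annotations_per_token = {
    t: [(p, o) for s, p, o in triples if s == t and p != "rdfx:occurrenceOf"]
    for t in tokens
}
```

Under this reading there are two tokens and each carries exactly one
annotation; attaching a second annotation to the same token would
require some way to refer to that token again, for instance an explicit
name.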

Thanks,
Olaf

> The preference for types in the semantics of RDF might be
> characterized as early optimization: understandable for an
> integration focused technology, and well understood in logic.
> However, the unification of tokens into types risks losing context
> (and annotations). It can just as well be postponed to querying
> (DISTINCT) or to a conscious data management operation (spring
> cleaning in the dataset). The one thing that one doesn’t want to lose
> when working with data is … data. So late unification of tokens into
> types has some merit.
>
> Best,
> Thomas
>
>
>
> > Thanks,
> > Olaf
> >
> >
> > > and may either provide a custom name or will be provided with a
> > > new blank node to name the reference.
> > >
> > >
> > > ## Syntax
> > >
> > > We should try to make the naming syntactically as uniform and
> > > predictable as possible. The nested graph proposal uses a pair of
> > > square brackets [] prepending constructs to indicate the name. If
> > > a custom name is given it is entered into that pair. That
> > > violates the rules for [] in Turtle/TriG but seems to parse
> > > unambiguously.  Not providing any name syntactically and still
> > > assuming the presence of a blank node name is a bit more tricky.
> > >
> > >     :liz :spouse :dick [id:1]{| :start 1964; :end 1974 |} .
> > >     :liz :spouse :dick {| :start 1975; :end 1976 |} .       # _:id2
> > >
> > >     [] << :s :p :o >> :start 1964 ; :end 1974 .
> > >
> > > In any case: if it doesn’t parse without a prepended name, then
> > > prepend a [].
> > >
> > >
> > > ## Unasserted vs Asserted
> > >
> > > Why not define a property that not only references a token, but
> > > also creates the triple, e.g.:
> > >
> > >    :liz :spouse :dick [id:1]{| :start 1964; :end 1974 |} .
> > >
> > > mapping to
> > >
> > >     id:1 rdfx:assertionOf << :liz :spouse :dick >>
> > >         :start 1964; :end 1974 .
> > >
> > > instead of
> > >
> > >     id:1 rdfx:occurrenceOf << :liz :spouse :dick >>
> > >         :start 1964; :end 1974 .
> > >     :liz :spouse :dick .
> > >
> > > That way we get identifiers for each triple occurrence together
> > > with the triple being asserted - direct identification, not early
> > > optimization. See above why that is important.
> > >
> > > All this unasserted business may seem a bit eccentric, but it’s
> > > the key to any sort of configurable semantics like quotation etc.
> > > It therefore has huge potential - if done right.
> > >
> > >
> > > ## SPARQL sugar
> > >
> > > You compare the occurrence-based shortcut relation to syntactic
> > > sugar for RDF lists, which is fine, except that querying those
> > > lists is a hardship. Same for RDF/XML’s syntactic support for RDF
> > > standard reification. Any kind of RDF syntactic sugar also needs
> > > proper support in SPARQL to be effective in practice.
> > >
> > >
> > > ## Triple terms vs Graph terms
> > >
> > > Just for completeness: all of this can easily be expanded to
> > > graph terms. The syntax
> > >
> > >     []{ :s :p :o. :u :v :w }
> > >
> > > is explored in the nested graph proposal.
> > >
> > >
> > > ## Graph Terms vs Named Graphs
> > >
> > > I like Adrian’s example [0] of a complicated named graph based
> > > application and I’m taking that seriously. However it should also
> > > be clear that triple/graph terms in the end are always stored in
> > > a way very similar to named graphs. There is just no other way in
> > > a quad based system. Triple/graph terms can be represented as
> > > named graphs, named graphs can be represented as graph terms.
> > > It’s a practical question of how to encode belonging/membership:
> > > syntactically as nested graphs, via a new term type as in RDF-
> > > star that transforms a triple into a term at the surface (but NOT
> > > in the underlying storage layer, for obvious performance
> > > reasons), via explicit binding relations as Niklas proposes [1]
> > > (and as Dydra implements nested graphs), etc. The main question
> > > is how to ensure that those binding relations don’t get lost in
> > > the process, but that IMHO is true for any solution. Nested
> > > graphs can be serialized to graph terms, which are just an
> > > extension of triple terms. That requires an additional en/de-
> > > coding step to fit them into an environment that reserves named
> > > graphs to its own purposes. That extra step is the price that
> > > those applications have to pay for being so particular about
> > > their use of named graphs. That’s only fair, and probably still
> > > economical for them.
> > >
> > >
> > > ## Term types vs Datatypes
> > >
> > > The most fundamental grievance with RDF-star is the introduction
> > > of a new term type when a new datatype of type RDF/TTL would
> > > suffice. All I proposed above is readily implementable in the
> > > nested graph proposal, which does map to TriG and regular N-quads
> > > and such a datatype (and even Turtle and N-triples, but that’s
> > > another discussion).
> > >
> > >
> > > Best,
> > > Thomas
> > >
> > >
> > >
> > > [0]
> > > https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0019.html
> > > [1]
> > > https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Nov/0032.html
> > >
> > >
> > > >    Andy
> > > >
> > > > [1]
> > > > https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0024.html
> > > >
> > > > [2]
> > > > https://w3c.github.io/rdf-concepts/spec/#section-triples
> > > >    (as of 2023-12-10)
> > > >
Received on Friday, 15 December 2023 14:34:00 UTC