Re: Consolidating triple/edges from Olaf Hartig on 2023-12-18 (public-rdf-star-wg@w3.org from December 2023)

From: Olaf Hartig <olaf.hartig@liu.se>
Date: Mon, 18 Dec 2023 14:33:54 +0000
To: "tl@rat.io" <tl@rat.io>
CC: "andy@apache.org" <andy@apache.org>, "public-rdf-star-wg@w3.org" <public-rdf-star-wg@w3.org>
Message-ID: <01b22ee89d8eafa7a432d832966107939ad0b085.camel@liu.se>
On Mon, 2023-12-18 at 00:09 +0100, Thomas Lörtsch wrote:
> On 15. Dec 2023, at 15:33, Olaf Hartig <olaf.hartig@liu.se> wrote:
> > On Fri, 2023-12-15 at 13:57 +0100, Thomas Lörtsch wrote:
> > > > On 15. Dec 2023, at 00:01, Olaf Hartig <olaf.hartig@liu.se>
> > > > wrote:
> > > > [...]
> > > > If so, what would you expect to happen if someone writes the
> > > > following?
> > > >
> > > >   :T rdfx:typeOf << :s1 :p1 :o1 >> .
> > > >   :T rdfx:typeOf << :s2 :p2 :o2 >> .
> > > >
> > > > Which triple type would :T denote in this case (if any)?
> > >
> > > In the strictly monotonic world of RDF :T would then probably
> > > refer
> > > to a little graph term.
> >
> > I don't think this works, but I also think there is no point going
> > further into this discussion for now.
>
> Not attempting to force a discussion on you, but
>
>     :X a :House, :Bird .
>
> is perfectly legal RDF. RDF is not especially safe.

Okay, I backtrack. It certainly depends on how you define the semantics
of rdfx:typeOf.

Based on how you introduced it, I was assuming that the meaning you
have in mind for it is to state that the subject of an rdfx:typeOf
triple (i.e., :T in the examples) is "a reference to the [triple] type"
captured by the triple term in the object position of the triple. Based
on this meaning, I see the two lines above as an inconsistency.

With your statement above (":T would then probably refer to a little
graph term") you are implying a different meaning of rdfx:typeOf; being
a reference to a triple term/type and being a reference to graph term
are different things (at least, I see it that way). Moreover, if an
rdfx:typeOf triple is meant to state that the subject of this triple is
a reference to a graph term, then I would expect the object of this
triple to be that graph term (as a whole), rather than just one of the
triples that is part of the graph term.

> [...]
> > > > In other words, should the two occurrences of the
> > > > subexpression  << :s :p :o >>  in the following two lines be
> > > > understood to "implicitly reference" the same token or two
> > > > different tokens?
> > > >
> > > >   << :s :p :o >> :p2 :o2 .
> > > >   << :s :p :o >> :p3 :o3 .
> > >
> > > Always a different one, and that’s indeed crucial (I pointed that
> > > out in the nested graph proposal too).
> >
> > In this case, I cannot see how it would be possible to make more
> > than one annotation statement for each token? (If you attempt to
> > answer this question based on an example, please write the example
> > either in terms of the abstract syntax or the N-Triple-star format,
> > but not in Turtle-star.)
>
> I was sloppy in this example, but I seem to remember that in the
> context of the whole mail it might have been clearer. The idea was
> (and is) that providing no identifier is interpreted as "I
> don’t care about the name of this" and a blank node identifier is
> created. That motivates the need to define a way to refer to the type
> (above). I just realize the type reference, given in the syntax
> above, could then be interpreted as denoting a type of occurrence -
> that would have to be explained away…
>
>     << :s :p :o >> :p2 :o2 .
>     << :s :p :o >> :p3 :o3 ;
>                    :p4 :o4 .       # multiple annotations only as trees
>                                    # if no explicit ID is provided
>
> is then the same as
>
>     << _:b1 | :s :p :o >> :p2 :o2 .
>     << _:b2 | :s :p :o >> :p3 :o3 ;
>                           :p4 :o4 .

No, I would not say that these are the same. In contrast, the first of
these two snippets of Turtle is the same as the following.

   << :s :p :o >> :p2 :o2 .
   << :s :p :o >> :p3 :o3 .
   << :s :p :o >> :p4 :o4 .
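
To make the difference concrete, here is a minimal Python sketch of the two
readings. It is only an illustration under stated assumptions, not anything
from a spec: the tuple encoding of triple terms and both function names are
made up for this example, and rdfx:occurrenceOf is the hypothetical property
from your proposal. Under the type reading, every occurrence of
<< :s :p :o >> is the same term, so the two-statement snippet and the
three-statement snippet yield the same set of triples. Under the token
reading, every syntactic occurrence gets its own blank node.

```python
# Illustrative model only: a graph is a set of (subject, predicate, object)
# tuples, and a triple term is a nested tuple. All names are hypothetical.

def triple_term(s, p, o):
    """Encode the triple term << s p o >> as a hashable value."""
    return ("<<", s, p, o, ">>")


def expand_type_reading(statements):
    """Type reading: the triple term itself is the subject of each annotation."""
    graph = set()
    for (s, p, o), annotations in statements:
        for ann_p, ann_o in annotations:
            graph.add((triple_term(s, p, o), ann_p, ann_o))
    return graph


def expand_token_reading(statements):
    """Token reading: each syntactic occurrence gets a fresh blank node."""
    graph = set()
    for i, ((s, p, o), annotations) in enumerate(statements, start=1):
        bnode = f"_:b{i}"
        graph.add((bnode, "rdfx:occurrenceOf", triple_term(s, p, o)))
        for ann_p, ann_o in annotations:
            graph.add((bnode, ann_p, ann_o))
    return graph


# The snippet with a predicate list (two statements, three annotations).
snippet = [
    ((":s", ":p", ":o"), [(":p2", ":o2")]),
    ((":s", ":p", ":o"), [(":p3", ":o3"), (":p4", ":o4")]),
]

# The fully expanded form (three statements, one annotation each).
flat = [
    ((":s", ":p", ":o"), [(":p2", ":o2")]),
    ((":s", ":p", ":o"), [(":p3", ":o3")]),
    ((":s", ":p", ":o"), [(":p4", ":o4")]),
]

type_graph = expand_type_reading(snippet)
token_graph = expand_token_reading(snippet)
```

In this sketch, expand_type_reading(snippet) and expand_type_reading(flat)
produce the same set, whereas the token reading depends on how the
statements happen to be grouped in the concrete syntax.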

Best,
Olaf


> N-Triples-star might look like this:
>
>     _:b1 rdfx:occurrenceOf << <http://example/s> <http://example/p> <http://example/o> >> .
>     _:b2 rdfx:occurrenceOf << <http://example/s> <http://example/p> <http://example/o> >> .
>     _:b1 <http://example/p2> <http://example/o2> .
>     _:b2 <http://example/p3> <http://example/o3> .
>     _:b2 <http://example/p4> <http://example/o4> .
>
>
> Best,
> Thomas
>
>
> > Thanks,
> > Olaf
> >
> > > The preference for types in the semantics of RDF might be
> > > characterized as early optimization: understandable for an
> > > integration focused technology, and well understood in logic.
> > > However, the unification of tokens into types risks losing
> > > context
> > > (and annotations). It can just as well be postponed to querying
> > > (DISTINCT) or to a conscious data management operation (spring
> > > cleaning in the dataset). The one thing that one doesn’t want to
> > > lose
> > > when working with data is … data. So late unification of tokens
> > > into
> > > types has some merit.
> > >
> > > Best,
> > > Thomas
> > >
> > >
> > >
> > > > Thanks,
> > > > Olaf
> > > >
> > > >
> > > > > and may either provide a custom name or will be provided with
> > > > > a
> > > > > new blank node to name the reference.
> > > > >
> > > > >
> > > > > ## Syntax
> > > > >
> > > > > We should try to make the naming syntactically as uniform and
> > > > > > predictable as possible. The nested graph proposal uses a
> > > > > pair of
> > > > > square brackets [] prepending constructs to indicate the
> > > > > name. If
> > > > > a custom name is given it is entered into that pair. That
> > > > > violates the rules for [] in Turtle/TriG but seems to parse
> > > > > unambiguously.  Not providing any name syntactically and
> > > > > still
> > > > > assuming the presence of a blank node name is a bit more
> > > > > tricky.
> > > > >
> > > > >    :liz :spouse :dick [id:1]{| :start 1964; :end 1974 |} .
> > > > > >    :liz :spouse :dick {| :start 1975; :end 1976 |} .       # _:id2
> > > > >
> > > > >    [] << :s :p :o >> :start 1964 ; :end 1974 .
> > > > >
> > > > > In any case: if it doesn’t parse without a prepended name,
> > > > > then
> > > > > prepend a [].
> > > > >
> > > > >
> > > > > ## Unasserted vs Asserted
> > > > >
> > > > > Why not define a property that not only references a token,
> > > > > but
> > > > > also creates the triple, e.g.:
> > > > >
> > > > >   :liz :spouse :dick [id:1]{| :start 1964; :end 1974 |} .
> > > > >
> > > > > mapping to
> > > > >
> > > > >    id:1 rdfx:assertionOf << :liz :spouse :dick >>
> > > > >        :start 1964; :end 1974 .
> > > > >
> > > > > instead of
> > > > >
> > > > >    id:1 rdfx:occurrenceOf << :liz :spouse :dick >>
> > > > >        :start 1964; :end 1974 .
> > > > >    :liz :spouse :dick .
> > > > >
> > > > > That way we get identifiers for each triple occurrence
> > > > > together
> > > > > with the triple being asserted - direct identification, not
> > > > > > early
> > > > > optimization. See above why that is important.
> > > > >
> > > > > All this unasserted business may seem a bit eccentric, but
> > > > > it’s
> > > > > the key to any sort of configurable semantics like quotation
> > > > > etc.
> > > > > It therefore has huge potential - if done right.
> > > > >
> > > > >
> > > > > ## SPARQL sugar
> > > > >
> > > > > > You compare the occurrence-based shortcut relation to
> > > > > syntactic
> > > > > sugar for RDF lists, which is fine, except that querying
> > > > > those
> > > > > lists is a hardship. Same for RDF/XML’s syntactic support for
> > > > > RDF
> > > > > standard reification. Any kind of RDF syntactic sugar also
> > > > > needs
> > > > > proper support in SPARQL to be effective in practice.
> > > > >
> > > > >
> > > > > ## Triple terms vs Graph terms
> > > > >
> > > > > > Just for completeness: all of this can easily be expanded to
> > > > > graph terms. The syntax
> > > > >
> > > > >    []{ :s :p :o. :u :v :w }
> > > > >
> > > > > is explored in the nested graph proposal.
> > > > >
> > > > >
> > > > > ## Graph Terms vs Named Graphs
> > > > >
> > > > > > I like Adrian’s example [0] of a complicated named graph based
> > > > > > application and I’m taking that seriously. However it should
> > > > > also
> > > > > be clear that triple/graph terms in the end are always stored
> > > > > in
> > > > > a way very similar to named graphs. There is just no other
> > > > > way in
> > > > > a quad based system. Triple/graph terms can be represented as
> > > > > named graphs, named graphs can be represented as graph terms.
> > > > > It’s a practical question of how to encode
> > > > > belonging/membership:
> > > > > syntactically as nested graphs, via a new term type as in
> > > > > RDF-
> > > > > star that transforms a triple into a term at the surface (but
> > > > > NOT
> > > > > in the underlying storage layer, for obvious performance
> > > > > reasons), via explicit binding relations as Niklas proposes
> > > > > [1]
> > > > > (and as Dydra implements nested graphs), etc. The main
> > > > > question
> > > > > is how to ensure that those binding relations don’t get lost
> > > > > in
> > > > > the process, but that IMHO is true for any solution. Nested
> > > > > graphs can be serialized to graph terms, which are just an
> > > > > extension of triple terms. That requires an additional en/de-
> > > > > coding step to fit them into an environment that reserves
> > > > > named
> > > > > graphs to its own purposes. That extra step is the price that
> > > > > those applications have to pay for being so particular about
> > > > > their use of named graphs. That’s only fair, and probably
> > > > > still
> > > > > economical for them.
> > > > >
> > > > >
> > > > > ## Term types vs Datatypes
> > > > >
> > > > > The most fundamental grievance with RDF-star is the
> > > > > introduction
> > > > > of a new term type when a new datatype of type RDF/TTL would
> > > > > > suffice. All I proposed above is readily implementable in the
> > > > > nested graph proposal, which does map to TriG and regular N-
> > > > > quads
> > > > > and such a datatype (and even Turtle and N-triples, but
> > > > > that’s
> > > > > another discussion).
> > > > >
> > > > >
> > > > > Best,
> > > > > Thomas
> > > > >
> > > > >
> > > > >
> > > > > [0]
> > > > > https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0019.html

> > > > > [1]
> > > > > https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Nov/0032.html

> > > > >
> > > > >
> > > > > >   Andy
> > > > > >
> > > > > > [1]
> > > > > > https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0024.html

> > > > > >
> > > > > > [2]
> > > > > > https://w3c.github.io/rdf-concepts/spec/#section-triples

> > > > > >   (as of 2023-12-10)
> > > > > >
Received on Monday, 18 December 2023 14:34:06 UTC