Re: Consolidating triple/edges from Niklas Lindström on 2023-12-21 (public-rdf-star-wg@w3.org from December 2023)

From: Niklas Lindström <lindstream@gmail.com>
Date: Thu, 21 Dec 2023 19:28:05 +0100
To: Andy Seaborne <andy@apache.org>
Cc: public-rdf-star-wg@w3.org
Message-ID: <CADjV5jccfiLyhCwiBmqVS-fSg31Hf=S2wqveO9698Y5ZbrZn3g@mail.gmail.com>

On Thu, Dec 21, 2023 at 5:00 PM Andy Seaborne <andy@apache.org> wrote:
>
>
>
> On 21/12/2023 15:50, Niklas Lindström wrote:
> > On Thu, Dec 21, 2023 at 12:42 AM Gregg Kellogg <gregg@greggkellogg.net> wrote:
> >>
> >> I created a Draft PR [1] with some Turtle Grammar changes based on my interpretation of Andy’s concept. You can see the rendered version of the EBNF via GitHack [2].
> >>
> >> As I noted, the change to the “annotation” production makes it context-sensitive: an LL(1) parser would get confused when seeing the IRI/BlankNode, which could either identify the triple occurrence or be a predicate annotating the annotation, so the parser needs to be able to backtrack. Not really a problem for more modern parsers, but a notable divergence. Other alternatives in the grammar could eliminate this at the cost of being less intuitive.
> >
> > For naming annotation occurrences, I think it's best to allow either
> > predicateObjectList:
> >
> >      <s1> <p1> <o1> {| dct:source <x> |} .
> >
> > or iri or BlankNode. Not sure how to do that nicely; I've previously
> > suggested [1] (some use case examples at [2]):
> >
> >      <s1> <p1> <o1> {_:a1} .
> >      _:a1 dct:source <x> .
>
> Having an explicit name in annotation syntax is covered by
>
>     :liz :spouse :dick {| id:1 | :start 1964; :end 1974 |} .
>
> (it's ambiguous for a lookahead of one in LL, but it seems to me to be
> more consistent in style, cf. << N | :s :p :o >>)

Yes; but it's this that I think is inconsistent with how Turtle
doesn't allow both BlankNode and blankNodePropertyList, but either/or.
The explicit name for triple terms is another thing, and I see the
need for it there (it's more like naming a graph, but really not, I
know).

(There is the N3 way of allowing e.g. = to assign a name to blank
nodes, but that's using owl:sameAs. I think you proposed := more
recently though, in relation to graph terms (just in a github thread
as I recall it)? I'm not necessarily in favour of it, but it's more
generally applicable than just being able to both name and describe
occurrences embedded in the annotation syntax. Just a thought.)

> Generally, keeping away from single-character { } because of their use in
> SPARQL, and possibly for a graph solution, is probably a good idea.

Yes, those are good and important points. (Just noting that I have
implemented my previous suggestions, using an EBNF-based PEG parser. So
the annotation shorthand *can* use just curlies around IRIs and any
form of blank nodes. But I agree it's too easy to mix up with graph
blocks.)
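
To illustrate the lookahead point quoted above, here is a minimal sketch. It is not taken from any implementation mentioned in this thread; the token regex and the names `tokenize` and `annotation_has_name` are made up for illustration and cover only the tokens in the two example annotations. It shows that after `{|` the first token is an IRI-like name in both forms, so one token of lookahead cannot tell them apart, while a second token can.

```python
import re

# Hypothetical, heavily simplified token pattern (NOT the Turtle grammar):
# annotation delimiters, <iri>, blank node labels, prefixed names, integers,
# and the punctuation used in the examples from this thread.
TOKEN = re.compile(r"\{\||\|\}|<[^>\s]*>|_:\w+|[\w-]*:[\w-]*|\d+|[|;,.]")


def tokenize(text):
    """Split an annotation block into a flat list of token strings."""
    return TOKEN.findall(text)


def annotation_has_name(tokens):
    """Return True if the block has the form `{| name | ... |}`.

    tokens[0] is the opening `{|`. tokens[1] is an IRI or blank node in
    both forms (a name, or the first predicate), so the decision is made
    on tokens[2]: a `|` there means tokens[1] was an explicit name.
    """
    if not tokens or tokens[0] != "{|":
        raise ValueError("expected an annotation block starting with '{|'")
    return len(tokens) > 2 and tokens[2] == "|"


named = tokenize("{| id:1 | :start 1964; :end 1974 |}")
plain = tokenize("{| dct:source <x> |}")
print(named[1], annotation_has_name(named))
print(plain[1], annotation_has_name(plain))
```

A parser with two tokens of lookahead, or one that backtracks (as a PEG parser does), handles this without trouble; a strict LL(1) parser would need the grammar to be refactored.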

I wonder if something like what I wrote below (inspired by [3]) might
be more promising, meaning a different form for the annotation
syntax. I personally like the current one, but I know it has been
criticized before (again, see [3]).

> > Whatever exact syntax we end up with, that design follows the regular
> > "flat" Turtle design of allowing nested descriptions only for
> > unlabelled blank nodes (as in blankNodePropertyList), and otherwise an
> > identifier (iri or BlankNode) with a regular description of that
> > occurrence.
> >
> > We might "bikeshed" some more going forward. Noting the old thread on
> > this, particularly the actual *star* alternative [3]; not sure if this
> > would work, but it could be nice:
> >
> >      <s1> <p1> <o1> *_:a1 .
> >      _:a1 dct:source <x> .

(Of course, we should land the abstract and semantics first. I just
want to ensure it's not too late to look at syntax alterations once
that's done.)

Best regards,
Niklas

[3]: https://lists.w3.org/Archives/Public/public-rdf-star/2021Jan/0027.html


> > Best regards,
> > Niklas
> >
> > [1]: https://gist.github.com/niklasl/4f52c32ef2d888c172c8584e36c24610#proposal-rdf-star-annotation-occurrences
> > [2]: https://gist.github.com/niklasl/2d02902b81e215b1795981df31927e9b
> > [3]: https://lists.w3.org/Archives/Public/public-rdf-star/2021Jan/0027.html
> >
> >
> >> Gregg Kellogg
> >> gregg@greggkellogg.net
> >>
> >> [1] https://github.com/w3c/rdf-turtle/pull/51
> >> [2] https://raw.githack.com/w3c/rdf-turtle/triple-term-occurance/spec/turtle-bnf.html
> >>
> >> On Dec 20, 2023, at 2:51 AM, Pierre-Antoine Champin <pierre-antoine@w3.org> wrote:
> >>
> >> Just to concur 100% with Olaf's interpretation of Andy's email.
> >>
> >> On 20/12/2023 10:26, Olaf Hartig wrote:
> >>
> >> On Tue, 2023-12-19 at 16:39 -0800, Gregg Kellogg wrote:
> >>
> >> On Dec 18, 2023, at 12:47 PM, Andy Seaborne <andy@apache.org>
> >> wrote:
> >>
> >> [...]
> >>
> >> So we have:
> >>
> >> Occurrence:
> >>    << :s :p :o >>
> >>    <<| N | :s :p :o >>
> >>
> >> Triple term:
> >>   <<( :s :p :o )>>
> >>
> >> To be clear, a Triple term would be a type, while an occurrence is a
> >> token?
> >>
> >> That's my reading as well. However, maybe someone with a more intimate
> >> understanding of the subtleties* of the notions of a token and an
> >> occurrence should look at this question.
> >>
> >> *https://plato.stanford.edu/entries/types-tokens/#Occ
> >>
> >> Are these fundamental in the abstract syntax? Or is the token
> >> considered syntactic sugar for something like [] rdfx:occurrenceOf
> >> <<( :s :p :o )>>?
> >>
> >> When I read Andy's email, I was assuming the latter, and that's also
> >> what my immediate reaction would be, now that you ask this question
> >> explicitly.
> >>
> >> The options that I can currently think of to make tokens/occurrences an
> >> explicit concept in the abstract syntax, would mean that we have to add
> >> another new type of term or introduce some additional mathematical
> >> structure that the notion of an RDF graph would have to be accompanied
> >> with. I don't think these are very attractive options. Yet, if it
> >> appears that there is a use for treating tokens/occurrences in a
> >> special way in SPARQL (e.g., dedicated operators or built-in
> >> functions), then we may have to capture them explicitly in some way
> >> (but I don't see a need for that at the moment).
> >>
> >> Can a term contain an occurrence, or vice versa? E.g. <<( << :s :p :o >> :o1 :o2 )>>
> >> or << <<( :s :p :o )>> :o1 :o2 >>?
> >>
> >> The latter is probably not particularly controversial, in particular if
> >> we understand expressions of the form
> >>
> >>    << :s :p :o >>
> >>
> >> as syntactic sugar as suggested in Andy's email. Then, the shorthand
> >>
> >>    << <<( :s :p :o )>> :o1 :o2 >>
> >>
> >> expands to
> >>
> >>    [] rdfx:occurrenceOf <<( <<( :s :p :o )>> :o1 :o2 )>> .
> >>
> >> (plus, the blank node in the subject of this triple would then also be
> >> in the subject / the object of the triple in which the shorthand is
> >> used).
> >>
> >> Regarding the former, i.e.,
> >>
> >>    <<( << :s :p :o >> :o1 :o2 )>>
> >>
> >> perhaps this can also be considered (and, thus, defined) as a shorthand
> >> notation for
> >>
> >>    <<( _:b :o1 :o2 )>>
> >>
> >> together with the addition of
> >>
> >>    _:b rdfx:occurrenceOf <<( :s :p :o )>> .
> >>
> >> into the same graph in which the shorthand is used as subject or object
> >> of a triple. (Note that _:b is meant to be a fresh blank node
> >> identifier that is not yet used in the document in which these things
> >> are written).
> >>
> >> Would N-Triples contain both variations, or just the triple term?
> >>
> >> I can see how supporting both variations in N-Triples may be appreciated
> >> for some use cases, but it may also be confusing because it would
> >> diverge from the current principle that every line in an N-Triples file
> >> is a serialization of a single triple only.
> >>
> >> (Note that my assumption here is, again, that an expression of the form
> >>
> >>    << :s :p :o >>
> >>
> >> is really just syntactic sugar.)
> >>
> >> And, to James’s point, can you say << :s :p :o >> a <<( :s1 :p1 :o1
> >> )>>; if so, would this be the same as rdfx:occurrenceOf?
> >>
> >> Well, by resolving the syntactic sugar as suggested in Andy's email,
> >> this would expand to
> >>
> >>    _:b rdfx:occurrenceOf <<( :s :p :o )>> .
> >>    _:b rdf:type <<( :s1 :p1 :o1 )>> .
> >>
> >> where, again, _:b is a fresh blank node identifier. So, the predicate
> >> "a" (or, rdf:type) in James' triple is not necessarily the same as
> >> rdfx:occurrenceOf.
> >>
> >> Annotation:
> >>   :s :p :o {| :p :z |}
> >>   :s :p :o {| N | :p :z |}
> >> (the last one is fiddly in the grammar because simply writing in
> >> ABNF is ambiguous for some parsers)
> >>
> >> Presumably, an annotation is on an occurrence and not on a triple
> >> term/type?
> >>
> >> I assume that's what Andy is suggesting here.
> >>
> >> Best,
> >> Olaf
> >>
> >>
> >> Gregg
> >>
> >>      Andy
> >>
> >>
> >>
> >
>
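
The expansion discussed in the quoted thread, where `<< :s :p :o >>` is treated as syntactic sugar for a fresh blank node with an `rdfx:occurrenceOf` triple, can be sketched roughly as follows. This is only an illustration under assumptions: the classes `TripleTerm` and `Occurrence`, the function `desugar`, and the `_:bN` labelling scheme are hypothetical, `rdfx:occurrenceOf` is the placeholder predicate used in the thread, and named occurrences are not covered. A real implementation would also have to make sure the generated labels do not clash with blank nodes already used in the document.

```python
from itertools import count
from typing import NamedTuple


class TripleTerm(NamedTuple):
    """Stands for <<( s p o )>>."""
    s: object
    p: object
    o: object


class Occurrence(NamedTuple):
    """Stands for the shorthand << s p o >>."""
    s: object
    p: object
    o: object


OCCURRENCE_OF = "rdfx:occurrenceOf"  # placeholder predicate name from the thread


def desugar(triples):
    """Replace each Occurrence by a fresh blank node and emit the
    corresponding occurrenceOf triple before the triple that used it."""
    fresh = (f"_:b{i}" for i in count())  # assumed not to clash with existing labels
    out = []

    def expand(term):
        if isinstance(term, Occurrence):
            node = next(fresh)
            out.append((node, OCCURRENCE_OF, TripleTerm(*map(expand, term))))
            return node
        if isinstance(term, TripleTerm):
            # An occurrence nested inside a triple term is expanded as well.
            return TripleTerm(*map(expand, term))
        return term

    for s, p, o in triples:
        out.append((expand(s), p, expand(o)))
    return out


# << :s :p :o >> a <<( :s1 :p1 :o1 )>> .
for triple in desugar([
    (Occurrence(":s", ":p", ":o"), "rdf:type", TripleTerm(":s1", ":p1", ":o1")),
]):
    print(triple)
```

The same function handles the nested case `<<( << :s :p :o >> :o1 :o2 )>>`, producing a triple term whose subject is the fresh blank node plus a separate `rdfx:occurrenceOf` triple in the same graph.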
Received on Thursday, 21 December 2023 18:28:39 UTC