Re: Nested descriptions in Turtle (Re: Consolidating triple/edges) from Niklas Lindström on 2023-12-23 (public-rdf-star-wg@w3.org from December 2023)

From: Niklas Lindström <lindstream@gmail.com>
Date: Sat, 23 Dec 2023 18:30:38 +0100
To: Pierre-Antoine Champin <pierre-antoine@w3.org>
Cc: Andy Seaborne <andy@apache.org>, public-rdf-star-wg@w3.org
Message-ID: <CADjV5jcxbe61ZhXYVv=VKxApXHn4e6d9BmzuBnAMhv6d07TA1w@mail.gmail.com>
Hi Pierre-Antoine,

On Fri, Dec 22, 2023 at 5:43 PM Pierre-Antoine Champin
<pierre-antoine@w3.org> wrote:
>
>
> On 21/12/2023 19:28, Niklas Lindström wrote:
>
> On Thu, Dec 21, 2023 at 5:00 PM Andy Seaborne <andy@apache.org> wrote:
>
> Having an explicit name in annotation syntax is covered by
>
>     :liz :spouse :dick {| id:1 | :start 1964; :end 1974 |} .
>
> (it's ambiguous for a lookahead of one in LL, but it seems to me to be
> more consistent in style, cf. << N | :s :p :o >>)
>
> Yes; but it's this that I think is inconsistent with how Turtle
> doesn't allow both BlankNode and blankNodePropertyList, but either/or.
> The explicit name for triple terms is another thing, and I see the
> need for it there (it's more like naming a graph, but really not, I
> know).
>
> I don't consider Turtle's inability to nest the descriptions of named resources as an intended feature of the language, but more as an oversight. JSON-LD allows it. RDF/XML allows it, for crying out loud!

That's interesting, I always thought of this as a deliberate design
for consistency. (Akin to "There should be one obvious way to do it"
and "Flat is better than nested" from the Zen of Python.)

Of course I've also been frustrated by it at times, e.g. when working
on very hierarchical OWL ontologies or SKOS concept schemes; so I do
feel the allure. (Not to mention how in those cases I've really wanted
the "is ex:predicate of" form of N3, or at least the "^ex:predicate"
variant of SPARQL.)

But in the past I've had problems with RDF/XML serializers utilizing
that way too much, since purportedly compact, prettified RDF/XML documents have
become tangled trees of descriptions, sometimes with an important
resource being described deeply nested within other description
elements. And JSON-LD caters for a lot of other use cases, within a
world of JSON where, as in XML, nesting is standard practice
("complexity be damned"). So I would not equate design considerations
there with those of Turtle.

> While we are at changing the syntax of Turtle, I would be happy to add this possibility, if only for catching up with other syntaxes... but I don't want to distract us from the more pressing issues here!

Neither do I; but added syntax needs to be scrutinized to address
potential issues with it. Nor do I want it to diverge unnecessarily
from any other opportunities if the majority wants it (in spite of my
own opinions about such features).

> But anyway, I would not shy away from allowing named annotations just because

I just want to ensure a cohesive, consistent design. And if some other
parts of Turtle are up for possible future changes it would be good
not to introduce unnecessary differences.

> But I would still allow both named and anonymous annotations.

Of course; so would I.

In any case, the proposed, combined "naming and describing" annotation
syntax doesn't appear to work in SPARQL, since the "|" is used there
in property paths for choice of predicates. So it would collide with:

    SELECT * { ?s ?p ?o {| dct:issued | dct:modified "2023" |} . }

(That means "select any triples with occurrences having been issued or
modified in 2023", and not "the occurrence denoted by dct:issued,
which must have been modified in 2023".)
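
To make the collision concrete, here is a minimal Python sketch. It is not
taken from any spec or parser; the function names and the dictionary shape
are made up purely for illustration. It takes the tokens found between
"{|" and "|}" in the query above and produces both readings:

```python
def read_as_sparql_path(tokens):
    """Read the tokens as a property path alternative plus an object.

    Hypothetical helper: everything before the last token is treated as
    a path whose alternatives are separated by "|".
    """
    *path_tokens, obj = tokens
    predicates = [t for t in path_tokens if t != "|"]
    return {"name": None, "predicates": predicates, "object": obj}


def read_as_named_annotation(tokens):
    """Read the tokens as the proposed `name | predicate object` form.

    Hypothetical helper: the first "|" separates the occurrence name
    from a single predicate-object pair.
    """
    bar = tokens.index("|")
    name = tokens[:bar]
    rest = tokens[bar + 1:]
    if len(name) != 1 or len(rest) != 2:
        raise ValueError("expected: name | predicate object")
    predicate, obj = rest
    return {"name": name[0], "predicates": [predicate], "object": obj}


tokens = ["dct:issued", "|", "dct:modified", '"2023"']
as_path = read_as_sparql_path(tokens)        # issued OR modified in 2023
as_named = read_as_named_annotation(tokens)  # named dct:issued, modified in 2023
```

Both functions accept the same token sequence and return different
results, which is the problem: a SPARQL parser would have nothing to
choose between them on.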

Best regards,
Niklas


>   pa
>
> (There is the N3 way of allowing e.g. = to assign a name to blank
> nodes, but that's using owl:sameAs. I think you proposed := more
> recently though, in relation to graph terms (just in a github thread
> as I recall it)? I'm not necessarily in favour of it, but it's more
> generally applicable than just being able to both name and describe
> occurrences embedded in the annotation syntax. Just a thought.)
>
> Generally, keeping away from single-character { } because of the use in
> SPARQL and possibly for a graph solution is probably a good hope.
>
> Yes, those are good and important points. (Just noting that I have
> implemented my previous suggestions, using an EBNF-based PEG parser. So
> the annotation shorthand *can* use just curlies around IRIs and any
> form of blank nodes. But I agree it's too easy to mix up with graph
> blocks.)
>
> I wonder if something like what I wrote below (inspired by [3]) might
> be more promising (meaning having another syntax for annotations; I do
> personally like the current annotation syntax, but I know it has been
> criticized before; again, see [3]).
>
> Whatever exact syntax we end up with, that design follows the regular
> "flat" Turtle design of allowing nested descriptions only for
> unlabelled blank nodes (as in blankNodePropertyList), and otherwise an
> identifier (iri or BlankNode) with a regular description of that
> occurrence.
>
> We might "bikeshed" some more going forward. Noting the old thread on
> this, particularly the actual *star* alternative [3]; not sure if this
> would work, but it could be nice:
>
>      <s1> <p1> <o1> *_:a1 .
>      _:a1 dct:source <x> .
>
> (Of course, we should land the abstract syntax and semantics first. I just
> want to ensure it's not too late to look at syntax alterations once
> that's done.)
>
> Best regards,
> Niklas
>
> [3]: https://lists.w3.org/Archives/Public/public-rdf-star/2021Jan/0027.html
>
>
> Best regards,
> Niklas
>
> [1]: https://gist.github.com/niklasl/4f52c32ef2d888c172c8584e36c24610#proposal-rdf-star-annotation-occurrences
> [2]: https://gist.github.com/niklasl/2d02902b81e215b1795981df31927e9b
> [3]: https://lists.w3.org/Archives/Public/public-rdf-star/2021Jan/0027.html
>
>
> Gregg Kellogg
> gregg@greggkellogg.net
>
> [1] https://github.com/w3c/rdf-turtle/pull/51
> [2] https://raw.githack.com/w3c/rdf-turtle/triple-term-occurance/spec/turtle-bnf.html
>
> On Dec 20, 2023, at 2:51 AM, Pierre-Antoine Champin <pierre-antoine@w3.org> wrote:
>
> Just to concur 100% with Olaf's interpretation of Andy's email.
>
> On 20/12/2023 10:26, Olaf Hartig wrote:
>
> On Tue, 2023-12-19 at 16:39 -0800, Gregg Kellogg wrote:
>
> On Dec 18, 2023, at 12:47 PM, Andy Seaborne <andy@apache.org>
> wrote:
>
> [...]
>
> So we have:
>
> Occurrence:
>    << :s :p :o >>
>    <<| N | :s :p :o >>
>
> Triple term:
>   <<( :s :p :o )>>
>
> To be clear, a Triple term would be a type, while an occurrence is a
> token?
>
> That's my reading as well. However, maybe someone with a more intimate
> understanding of the subtleties* of the notions of a token and an
> occurrence should look at this question.
>
> *https://plato.stanford.edu/entries/types-tokens/#Occ
>
> Are these fundamental in the abstract syntax? Or is the token
> considered syntactic sugar for something like [] rdfx:occurrenceOf
> <<( :s :p :o )>>?
>
> When I read Andy's email, I was assuming the latter, and that's also
> what my immediate reaction would be, now that you ask this question
> explicitly.
>
> The options that I can currently think of to make tokens/occurrences an
> explicit concept in the abstract syntax, would mean that we have to add
> another new type of term or introduce some additional mathematical
> structure that the notion of an RDF graph would have to be accompanied
> with. I don't think these are very attractive options. Yet, if it
> appears that there is a use for treating tokens/occurrences in a
> special way in SPARQL (e.g., dedicated operators or built-in
> functions), then we may have to capture them explicitly in some way
> (but I don't see a need for that at the moment).
>
> Can a term contain an occurrence, or vice versa? E.g. <<( << :s :p :o >>
> :o1 :o2 )>> or << <<( :s :p :o )>> :o1 :o2 >>?
>
> The latter is probably not particularly controversial, in particular if
> we understand expressions of the form
>
>    << :s :p :o >>
>
> as syntactic sugar as suggested in Andy's email. Then, the shorthand
>
>    << <<( :s :p :o )>> :o1 :o2 >>
>
> expands to
>
>    [] rdfx:occurrenceOf <<( <<( :s :p :o )>> :o1 :o2 )>> .
>
> (plus, the blank node in the subject of this triple would then also be
> in the subject / the object of the triple in which the shorthand is
> used).
>
> Regarding the former, i.e.,
>
>    <<( << :s :p :o >> :o1 :o2 )>>
>
> perhaps this can also be considered (and, thus, defined) as a shorthand
> notation for
>
>    <<( _:b :o1 :o2 )>>
>
> together with the addition of
>
>    _:b rdfx:occurrenceOf <<( :s :p :o )>> .
>
> into the same graph in which the shorthand is used as subject or object
> of a triple. (Note that _:b is meant to be a fresh blank node
> identifier that is not yet used in the document in which these things
> are written).
>
> Would N-Triples contain both variations, or just the triple term?
>
> I can see how supporting both variations in N-Triples may be appreciated
> for some use cases, but it may also be confusing because it would
> diverge from the current principle that every line in an N-Triples file
> is a serialization of a single triple only.
>
> (Note that my assumption here is, again, that an expression of the form
>
>    << :s :p :o >>
>
> is really just syntactic sugar.)
>
> And, to James’s point, can you say << :s :p :o >> a <<( :s1 :p1 :o1
> )>>; if so, would this be the same as rdfx:occurrenceOf?
>
> Well, by resolving the syntactic sugar as suggested in Andy's email,
> this would expand to
>
>    _:b rdfx:occurrenceOf <<( :s :p :o )>> .
>    _:b rdf:type <<( :s1 :p1 :o1 )>> .
>
> where, again, _:b is a fresh blank node identifier. So, the predicate
> "a" (or rdf:type) in James' triple is not necessarily the same as
> rdfx:occurrenceOf.
>
> Annotation:
>   :s :p :o {| :p :z |}
>   :s :p :o {| N | :p :z |}
> (the last one is fiddly in the grammar because simply writing in
> ABNF is ambiguous for some parsers)
>
> Presumably, an annotation is on an occurrence and not on a triple
> term/type?
>
> I assume that's what Andy is suggesting here.
>
> Best,
> Olaf
>
>
> Gregg
>
>      Andy
>
Received on Saturday, 23 December 2023 17:31:13 UTC