- From: William Van Woensel <william.vanwoensel@gmail.com>
- Date: Fri, 4 Oct 2024 06:39:47 -0400
- To: Niklas Lindström <lindstream@gmail.com>
- Cc: RDF-star Working Group <public-rdf-star-wg@w3.org>
- Message-Id: <7B005471-D9A5-4D1A-98F6-C935F2AA0C0F@gmail.com>
Hi Niklas, others,

FYI, what I should have mentioned in my last message: when taking the meanings (as I see it, of course) of "?" and "!" to their extreme, then the suggestion therein could flow from that. It was not my intention to wholly overhaul the current syntax, and I realize my message may have implied that - apologies.

As your examples show, and Thomas mentioned today, I find the annotation syntax allows describing assertions quite elegantly (what I meant with "less disjointed").

My point was that, aside from looking good (which is a factor, of course), the "baggage" of symbols - be it in or outside of CS - should be considered.

Just thinking out loud - and adding a bit more nuance - "?" is used for uncertainty (but also for variables and conditional expressions); "!" to tell the compiler that we know what we're doing, for destructive operations (but also negation), or, generally, for exclaiming something.

Re annotations: in general writing, asterisks are used to provide extra details in footnotes (put the star in rdf-star!), and brackets [ ] for editor's notes; but they are used for wholly different purposes in CS. AFAIK "~" is often used for describing (approximate) equivalence, or correlation between variables (but it could also mean negation...). From that viewpoint, I find it quite suitable to indicate the reifier ID with a triple term, but perhaps not for other purposes.

In a similar vein, using different tokens / delimiters for different purposes allows people to more easily spot a construct, without having to check its context (are there 2 or 3 terms before; was there a tilde?).

I'm a bit unsure what you mean here:

> And as seen in the bigger examples, annotations bleed into the subject description, even with a trailing marker.

Regarding "flat" descriptions - from Andy's original message:

> I don't consider Turtle's inability to nest the descriptions of named resources as an intended feature of the language, but more as an oversight. JSON-LD allows it.
> RDF/XML allows it, for crying out loud!

IIUC, IRI property lists <https://w3c.github.io/N3/spec/#iriprplist> from N3 may fill this need for better or for worse!

William

> On Oct 3, 2024, at 9:30 AM, Niklas Lindström <lindstream@gmail.com> wrote:
>
> First off, I am very grateful that there is now a syntax for named
> annotations that works and does not clash with the alternatePath
> operator in SPARQL. There were indeed long discussions; also some
> originating in the issue thread [1]. Everyone involved acted to fix a
> pressing problem and move things forward, with limited time and
> resources.
>
> Of course, we do have a bunch of new syntax (with Turtle 1.2); and I
> too wonder whether the ergonomics will work out. As seen, there were
> real problems with using pipe due to its other uses in related
> syntaxes ("baggage").
>
> And curly braces can bring graphs to mind. I've also been thinking
> about using the blankNodePropertyList form for annotations before
> ([2], [3]). But I've used annotation syntax as much as possible to get
> a feel for it in practice (in Turtle and TriG, less so in SPARQL
> (mainly to check clashes) and never really in N3). And I've tried to
> analyze uses in examples and explorations elsewhere online. I am
> unaware of (but may have overlooked) any strong sentiment of the
> current syntax being confused with graphs (other than an initial
> association). To be clear, my frame of mind includes taking into
> account uses of curly braces for other things in SPARQL (e.g. VALUES
> clauses) and the possibility of [4] becoming part of the standard.
>
> How much further can we change things at this point? What would cause
> confusion and hamper adoption? This sets the future of Turtle/TriG and
> SPARQL, but the reality of that hinges on it being adopted; and this
> has been in the wild in various stages of experimentation and
> (risk-taking) early adopters. (Arguments can go both ways.)
>
> I'm curious if there were any opinions "in the corridors" at TPAC
> (particularly regarding syntax)?
>
> ## Bikeshed
>
> *If* something could be changed, just using tilde and reuse blank node
> brackets is an option. Some variants for illustration (of different
> perspective--these are not active proposals):
>
> A) Just replacing the `{| ... |}` delimiters is technically a small
> change. Using `[~ ... ~]`:
>
>     <Alice> :bought <LennyTheLion> ~ <r1> [~ :date "2024-06" ~]
>                                    ~ <r2> [~ :date "2024-12" ~] .
>
> B) A *simple* solution is to skip the special annotation delimiters
> and leverage the new tilde by allowing blankNodePropertyList as well.
> That would require a flat form for named annotations. Example:
>
>     <Alice> :bought <LennyTheLion> ~ [ :date "2024-06" ]
>                                    ~ <r2> .
>
>     <r2> :date "2024-12" .
>
> Grammar:
>
>     annotation ::= ('~' (iri | BlankNode | blankNodePropertyList))*
>
> C) For naming and simultaneously embedding descriptions, the name and
> block pairing should work here too:
>
>     <Alice> :bought <LennyTheLion> ~ <r1> [ :date "2024-06" ]
>                                    ~ <r2> [ :date "2024-12" ] .
>
> Grammar:
>
>     annotation ::= (
>         ('~' (iri | BLANK_NODE_LABEL) blankNodePropertyList?) |
>         ('~' (BlankNode | blankNodePropertyList))
>     )*
>
> ## Thoughts
>
> Note that in B and C the end of blank annotations would be
> indistinguishable from the end of regular embedded blank nodes. Of
> course, nesting lots of blank nodes is always harder to read. And as
> seen in the bigger examples, annotations bleed into the subject
> description, even with a trailing marker. But short blank annotations
> also stand out better with distinct markers, so size of the annotation
> is not the only concern.
>
> (A flat form is what Turtle otherwise requires for named nodes. I am
> personally of the opinion [5] that this is a good thing rather than an
> oversight.
> For annotations, I'm a bit unconvinced but recognize that
> they are special (a name for the embedded "margin note" is different
> from naming or using blank nodes in the regular description).)
>
> Ergonomics is hard to get right without extensive user testing
> (ideally noting which users are already habituated to particular
> forms). Any change may trip up things that have settled (it is
> necessary to take all of SPARQL into account, including its
> readability; ideally Notation 3 too [6]). We must also bear in mind
> that every syntax revision will cost a lot of collective cognitive
> effort, over an extended period of time. The lack of feedback on [1]
> (and in the mailing list) is an indication.
>
> *If* there is a strong sentiment that the current syntax falls short
> in practice, it would be wise, but possibly hard and time-consuming,
> to gather the syntax requirements (analyse the reasoning behind
> various choices), and concisely show how alternatives would do better
> without losing out on the other requirements. Is that feasible?
>
> Best regards,
> Niklas
>
> PS: I recently noticed that TinySPARQL [7] uses tilde as a prefix for
> prepared statement parameters (I don't know why it doesn't just use
> the `$var` form). I don't think there's a direct clash though. How
> many such considerations can be taken (including templating systems,
> etc.)?
>
> PPS: Syntax that initially looks very awkward (from certain
> perspectives) is a barrier, but can grow on people if it shows
> consistency. It may be irrelevant, but I'm reminded that one of the
> reasons the creator of Python (Guido van Rossum) left his role as BDFL
> was because of the aftermath of an added syntactic feature [8].
>
> [1]: <https://github.com/w3c/rdf-star-wg/issues/116>
> [2]: <https://gist.github.com/niklasl/c22994e664663b6730613ecc1321c418#quotation-occurrences-as-blank-graphs>
> [3]: <https://gist.github.com/niklasl/94df648c0767e206456cc4857baecac0#compact-form-variants>
> [4]: <https://github.com/w3c/sparql-query/issues/147>
> [5]: <https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0130.html>
> [6]: <https://w3c.github.io/N3/spec/#paths>
> [7]: <https://gnome.pages.gitlab.gnome.org/tinysparql/sparql-and-tracker.html#parameters-and-prepared-statements>
> [8]: <https://hub.packtpub.com/why-guido-van-rossum-quit/>
>
>
> On Thu, Oct 3, 2024 at 1:28 PM Thomas Lörtsch <tl@rat.io> wrote:
>>
>>
>>
>>> On 28. Sep 2024, at 14:58, William Van Woensel <william.vanwoensel@gmail.com> wrote:
>>>
>>> FWIW, my personal view - the unified pipe syntax indeed looks a bit confusing in large examples. Which part is the identifier, and which is the annotation? IMO it is better to have a dedicated symbol for particular purposes, such as the ~ for reifier terms.
>>>
>>> When used individually, my issue with the pipe operator is its "baggage" - it is known and used for a different purpose. But, I think Thomas makes a good point with [ ] being used for adding details, such as an editor's note; and that { } should be reserved for graphs.
>>>
>>> Instead of the "|" symbol, perhaps adding a "?" (i.e., [? xyz ?]) could be more suitable. To me, the symbol conveys something like "what more can be said about this statement? well...". E.g.,
>>>
>>> ex:Ioannes_68 a crm:E21_Person ,
>>>         ex:Gender_Eunuch ~ ex:Gender_Assignment_Eunuch [? a crm:E17_Type_Assignment ;
>>>             crm:P14_carried_out_by ex:Paphlagonian_family ;
>>>             rdfs:label "Castration gender assignment" ?] ;
>>>     rdfs:label "John the Orphanotrophos" .
>>>
>>> I don't think it would clash with the Turtle grammar, but, it could clash with the N3 variable syntax (well, not if we require a whitespace after the "?").
>>> On a related note, the potential "baggage" of this symbol is its association with variables.
>>
>> I agree with the concerns w.r.t. the pipe symbol, but also w.r.t. the question mark which IMO rather rule it out. I did some more experiments, and kinda liked the ¡! combination, see the UCR example [0], with <¡ … !>, [¡ … !] and ¡! as reifier prefix (and, not shown {¡ … !}, etc. [1]).
>>
>>>> There have been long discussions about the current syntax in github issues. No one will be happy about everything in syntax discussions.
>>>
>>> Sorry to be adding to it. This option may have already come up; if so, feel free to disregard.
>>
>> No worries! My comment was not meant to discourage discussion, but expectations ;-)
>>
>> Another question: is Github issue #51 [2] the right place to continue this discussion, although the pull request has been merged? Or is there another more current Github page discussing syntax?
>>
>> Best,
>> Thomas
>>
>>
>> [0] https://gist.github.com/rat10/cdf3ba60978fcdac7763d88f2ee068a2
>> [1] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Sep/0073.html
>> [2] https://github.com/w3c/rdf-turtle/pull/51
>>
>>
>>
>>>
>>> W
>>>
>>>> On Sep 27, 2024, at 3:43 PM, Thomas Lörtsch <tl@rat.io> wrote:
>>>>
>>>> Hi Niklas,
>>>>
>>>> thank you for the links! I agree that these are indeed large examples, and thank you for the effort. Still, and I know that I do sound like a measly know-all when I say this, they are still very few ;-) But it’s unreasonable to expect us to get much further with example data (and if we did, it would still not be sure that we could evaluate them properly). Syntax is in a lot of ways a matter of taste and intuition. IMO it’s important to try to stick to some principles and seemingly objective criteria, however without getting hung up on those too much ;-)
>>>>
>>>>> On 23.
>>>>> Sep 2024, at 17:00, Niklas Lindström <lindstream@gmail.com> wrote:
>>>>>
>>>>> Yes, I tried out syntax variants on the UCR examples, plus a larger
>>>>> example based on the full Wikidata description about Elizabeth Taylor
>>>>> (complete with nominations, awards, spouses and nationalities (e.g.
>>>>> twice a US citizen)). For illustration, I just added a new gist with
>>>>> those updated to the new syntax:
>>>>>
>>>>> https://gist.github.com/niklasl/c0ba767efe4816a515ad04a4db48b3e6
>>>>
>>>> Very nice! I just converted them to my proposal from [7]:
>>>> Liz: https://gist.github.com/rat10/ddfd60afb42a8062fd7f1680ebedd022
>>>> UCR: https://gist.github.com/rat10/6c66e360c36b7d81bb3b9bc21fc16b96
>>>>
>>>> The good news is: this is relatively easy to do :)
>>>>
>>>> The bad news is: this reads not particularly well. In the current version (i.e. in your gist linked above) annotation syntax seems visually better discernible from standard triples. The cost however, especially that it uses curly braces which should be reserved to graphs, is IMO too high. Seems to me like more thinking and tinkering is needed…
>>>>
>>>> However, the difference is more pronounced in the Elizabeth Taylor example which is also in the current syntax too involved to be really readable, especially because of those excruciatingly long identifiers. Some line breaks would certainly help but I couldn’t figure out how to introduce them automatically in a sufficiently nice way (i.e. with proper indentation).
>>>>
>>>>> (One caveat is the last UCR example using a full list in a triple
>>>>> occurrence; also mentioned in [1].)
>>>>
>>>> Uff. I’ll comment on that in the issue itself.
>>>>
>>>>> (The now obsolete examples I linked to from the comments on either the
>>>>> original github issue [2] or the addressing PR [3] are at [4] and [5].
>>>>> Of note in [2] is that pipe collided with SPARQL alternativePath in
>>>>> annotations; which this change fixed.)
>>>>
>>>> I just read through [3] again and noticed a comment by Andy saying that "If we go postfix, then '~' vs '|' is pure choice" [6]. If that is indeed correct (I guess it hasn’t been tested thoroughly as the discussion from that point on favored the tilde) then it’s good to know. Aesthetically I find the tilde quite okay. However, I also have that urge to unify the syntactic variations, as outlined in [7], and in that respect the pipe seems better.
>>>>
>>>> Best,
>>>> Thomas
>>>>
>>>>>
>>>>> Best regards,
>>>>> Niklas
>>>>>
>>>>> [1]: <https://github.com/w3c/rdf-turtle/issues/71#issuecomment-2363703036>
>>>>> [2]: <https://github.com/w3c/rdf-star-wg/issues/116>
>>>>> [3]: <https://github.com/w3c/rdf-turtle/pull/51>
>>>>> [4]: <https://gist.github.com/niklasl/c23925f831950506fde4eb73885319cd>
>>>>> [5]: <https://gist.github.com/niklasl/1845c6bc8b37402cc9698720c2e22f88>
>>>> [6] https://github.com/w3c/rdf-turtle/pull/51#issuecomment-2256850306
>>>> [7] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Sep/0073.html
>>>>>
>>>>>
>>>>> On Fri, Sep 20, 2024 at 12:37 PM Andy Seaborne <andy@apache.org> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 20/09/2024 09:46, Thomas Lörtsch wrote:
>>>>>>> this is one of your typical "arguments": seems to look so wise, but is so vacuous all the same. if you think you know something that can only be seen in large examples, then show it or at least describe it in some detail. don't expect everybody to just believe in your wisdom
>>>>>>
>>>>>> There have been examples done by Niklas on the visual impact of syntax
>>>>>> designs.
>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>>
>>
Received on Friday, 4 October 2024 10:40:07 UTC