Re: Reified triple syntax

> On 3. Oct 2024, at 15:30, Niklas Lindström <lindstream@gmail.com> wrote:
> 
> First off, I am very grateful that there is now a syntax for named
> annotations that works and does not clash with the alternatePath
> operator in SPARQL. There were indeed long discussions; also some
> originating in the issue thread [1]. Everyone involved acted to fix a
> pressing problem and move things forward, with limited time and
> resources.

Just to say that again: I applaud the effort. I lurked in the discussion because GitHub notified me (and I don’t even know why; I follow GitHub much less closely than I probably should), but it felt to me like two bursts of activity, and both times I was too occupied with something else to chime in. I also thought it was good that the pipe issue that had bothered you for quite some time got resolved.
But I also took Ora’s advice to heart that we’re not yet at the point of discussing syntax. I held back on my ideas so as not to confuse other discussions or open too many cans (of worms) at the same time. So non-participation in the past shouldn’t be interpreted as lack of interest.
I would of course feel kind of robbed of my opportunity to have and participate in this discussion if its status suddenly changed from "future" to "past". I don’t intend to break things without need, but I do see a need, and we don’t have to make compromises as if the current state of the syntax were already an established state of the art.


Okay, top-posting from now on:

- Your variant (A) below seems reasonable to me. (B) and (C) seem a bit too frugal. Using the tilde as the one character that differentiates RDF-star syntax from other parts seems okay, although not really pretty when combined with brackets. I’m happy that you took up the suggestion to replace {} with [] in the annotation part. That is really the issue I care about most.

- I expect more experimentation. I’m also still hoping for a response to my little proposal [9], as it also tackles graphs in various forms (graph terms, graph occurrences, annotated graphs). We have to keep those in mind, even if we don’t specify them in this round. Maybe I’ll update it to your variant (A), as the pipe has some problems w.r.t. interoperability and readability.

- I see us in a position neither to gather an exhaustive list of requirements nor to validate proposals with user studies, etc. But using that as a counterargument to any change of syntax from this point on is not valid. E.g. the concern about the use of {} for non-graph things is easy to understand, and evident, and I see no reason why it shouldn’t be addressed, even if we can’t guarantee to address (or even comprehend) all eventual issues.

- Change: the merge of pull request #51 was discussed among a small group of people, culminating in a two-day "sprint" and then the merge. As far as I can remember, it wasn’t even mentioned, let alone discussed, in a WG meeting. As I said above: I applaud the initiative, since it resolved an issue with the pipe (although I took it from Andy’s comment that the issue would also have been resolved by moving the reifier to the postfix position). IMO that is all well and good; I’m not a fanatic about following process.

- But it should be clear, and hopefully has been clear to everybody involved, that this may not be the last word on the topic, and is not in any way a "normative" step, but just a documentation of the current state of the art/discussion at a reasonably complete intermediate step, and as such indeed useful to merge as a base for further deliberations. That may very well be how it was understood, but even the possible impression that it was not would be reason for concern, and we should try hard to avoid it.
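
As a side note on how (B) and (C) differ: below is a rough Python sketch built from the two grammars quoted further down. The terminal patterns are drastic simplifications of my own (assumptions, not the Turtle 1.2 productions): an IRI is just <...>, a blank node label is _:name, and a blankNodePropertyList is a non-empty [ ... ] without nesting. The helper name `accepts` is likewise made up for illustration. It only shows which annotation tails each grammar would accept.

```python
import re

# Simplified terminals (assumptions, much looser than the real Turtle grammar).
IRI = r"<[^<>\s]*>"                        # e.g. <r1>
LABEL = r"_:[A-Za-z0-9]+"                  # e.g. _:b1
ANON = r"\[\s*\]"                          # []
BNPL = r"\[[^\[\]]*[^\[\]\s][^\[\]]*\]"    # [ :date "2024-06" ], no nesting

GRAMMARS = {
    # B: annotation ::= ('~' (iri | BlankNode | blankNodePropertyList))*
    "B": rf"(?:\s*~\s*(?:{IRI}|{LABEL}|{ANON}|{BNPL}))*\s*",
    # C: annotation ::= ( ('~' (iri | BLANK_NODE_LABEL) blankNodePropertyList?)
    #                   | ('~' (BlankNode | blankNodePropertyList)) )*
    "C": rf"(?:\s*~\s*(?:(?:{IRI}|{LABEL})(?:\s*{BNPL})?|{ANON}|{BNPL}))*\s*",
}


def accepts(variant: str, annotation_tail: str) -> bool:
    """Return True if the text after the triple's object fits the variant."""
    return re.fullmatch(GRAMMARS[variant], annotation_tail) is not None


# Flat form, as in the example for (B).
flat = '~ [ :date "2024-06" ] ~ <r2>'
# Name-plus-block pairing, as in the example for (C).
paired = '~ <r1> [ :date "2024-06" ] ~ <r2> [ :date "2024-12" ]'

for variant in ("B", "C"):
    print(variant, accepts(variant, flat), accepts(variant, paired))
```

Under these assumptions the flat form is accepted by both grammars, while the name-plus-block pairing is accepted only by (C).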


Best,
Thomas

[9] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Sep/0073.html



> Of course, we do have a bunch of new syntax (with Turtle 1.2); and I
> too wonder whether the ergonomics will work out. As seen, there were
> real problems with using pipe due to its other uses in related
> syntaxes ("baggage").
> 
> And curly braces can bring graphs to mind. I've also been thinking
> about using the blankNodePropertyList form for annotations before
> ([2], [3]). But I've used annotation syntax as much as possible to get
> a feel for it in practice (in Turtle and TriG, less so in SPARQL
> (mainly to check clashes) and never really in N3). And I've tried to
> analyze uses in examples and explorations elsewhere online. I am
> unaware of (but may have overlooked) any strong sentiment of the
> current syntax being confused with graphs (other than an initial
> association). To be clear, my frame of mind includes taking into
> account uses of curly braces for other things in SPARQL (e.g. VALUES
> clauses) and the possibility of [4] becoming part of the standard.
> 
> How much further can we change things at this point? What would cause
> confusion and hamper adoption? This sets the future of Turtle/TriG and
> SPARQL, but the reality of that hinges on it being adopted; and this
> has been in the wild in various stages of experimentation and
> (risk-taking) early adopters. (Arguments can go both ways.)
> 
> I'm curious if there were any opinions "in the corridors" at TPAC
> (particularly regarding syntax)?
> 
> ## Bikeshed
> 
> *If* something could be changed, just using tilde and reuse blank node
> brackets is an option. Some variants for illustration (of different
> perspective--these are not active proposals):
> 
> A) Just replacing the `{| ... |}` delimiters is technically a small
> change. Using `[~ ... ~]`:
> 
>    <Alice> :bought <LennyTheLion> ~ <r1> [~ :date "2024-06" ~]
>            ~ <r2> [~ :date "2024-12" ~] .
> 
> B) A *simple* solution is to skip the special annotation delimiters
> and leverage the new tilde by allowing blankNodePropertyList as well.
> That would require a flat form for named annotations. Example:
> 
>    <Alice> :bought <LennyTheLion> ~ [ :date "2024-06" ]
>            ~ <r2> .
> 
>    <r2> :date "2024-12" .
> 
> Grammar:
> 
>    annotation  ::=  ('~' (iri | BlankNode | blankNodePropertyList))*
> 
> C) For naming and simultaneously embedding descriptions, the name and
> block pairing should work here too:
> 
>    <Alice> :bought <LennyTheLion> ~ <r1> [ :date "2024-06" ]
>            ~ <r2> [ :date "2024-12" ] .
> 
> Grammar:
> 
>    annotation  ::=  (
>          ('~' (iri | BLANK_NODE_LABEL) blankNodePropertyList?) |
>          ('~' (BlankNode | blankNodePropertyList))
>       )*
> 
> ## Thoughts
> 
> Note that in B and C the end of blank annotations would be
> indistinguishable from the end of regular embedded blank nodes. Of
> course, nesting lots of blank nodes is always harder to read. And as
> seen in the bigger examples, annotations bleed into the subject
> description, even with a trailing marker. But short blank annotations
> also stand out better with distinct markers, so size of the annotation
> is not the only concern.
> 
> (A flat form is what Turtle otherwise requires for named nodes. I am
> personally of the opinion [5] that this is a good thing rather than an
> oversight. For annotations, I'm a bit unconvinced but recognize that
> they are special (a name for the embedded "margin note" is different
> from naming or using blank nodes in the regular description).)
> 
> Ergonomics is hard to get right without extensive user testing
> (ideally noting which users are already habituated to particular
> forms). Any change may trip up things that have settled (it is
> necessary to take all of SPARQL into account, including its
> readability; ideally Notation 3 too [6]). We must also bear in mind
> that every syntax revision will cost a lot of collective cognitive
> effort, over an extended period of time. The lack of feedback on [1]
> (and in the mailing list) is an indication.
> 
> *If* there is a strong sentiment that the current syntax falls short
> in practice, it would be wise, but possibly hard and time-consuming,
> to gather the syntax requirements (analyse the reasoning behind
> various choices), and concisely show how alternatives would do better
> without losing out on the other requirements. Is that feasible?
> 
> Best regards,
> Niklas
> 
> PS: I recently noticed that TinySPARQL [7] uses tilde as a prefix for
> prepared statement parameters (I don't know why it doesn't just use
> the `$var` form). I don't think there's a direct clash though. How
> many such considerations can be taken (including templating systems,
> etc.)?
> 
> PPS: Syntax that initially looks very awkward (from certain
> perspectives) is a barrier, but can grow on people if it shows
> consistency. It may be irrelevant, but I'm reminded that one of the
> reasons the creator of Python (Guido van Rossum) left his role as BDFL
> was because of the aftermath of an added syntactic feature [8].
> 
> [1]: <https://github.com/w3c/rdf-star-wg/issues/116>
> [2]: <https://gist.github.com/niklasl/c22994e664663b6730613ecc1321c418#quotation-occurrences-as-blank-graphs>
> [3]: <https://gist.github.com/niklasl/94df648c0767e206456cc4857baecac0#compact-form-variants>
> [4]: <https://github.com/w3c/sparql-query/issues/147>
> [5]: <https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0130.html>
> [6]: <https://w3c.github.io/N3/spec/#paths>
> [7]: <https://gnome.pages.gitlab.gnome.org/tinysparql/sparql-and-tracker.html#parameters-and-prepared-statements>
> [8]: <https://hub.packtpub.com/why-guido-van-rossum-quit/>
> 
> 
> On Thu, Oct 3, 2024 at 1:28 PM Thomas Lörtsch <tl@rat.io> wrote:
>> 
>> 
>> 
>>> On 28. Sep 2024, at 14:58, William Van Woensel <william.vanwoensel@gmail.com> wrote:
>>> 
>>> FWIW, my personal view - the unified pipe syntax indeed looks a bit confusing in large examples. Which part is the identifier, and which is the annotation?  IMO it is better to have a dedicated symbol for particular purposes, such as the ~ for reifier terms.
>>> 
>>> When used individually, my issue with the pipe operator is its "baggage" -  it is known and used for a different purpose. But, I think Thomas makes a good point with [ ] being used for adding details, such as an editor's note; and that { } should be reserved for graphs.
>>> 
>>> Instead of the "|" symbol, perhaps adding a "?" (i.e., [? xyz ?]) could be more suitable. To me, the symbol conveys something like "what more can be said about this statement? well...". E.g.,
>>> 
>>> ex:Ioannes_68 a crm:E21_Person ,
>>>       ex:Gender_Eunuch ~ ex:Gender_Assignment_Eunuch [? a crm:E17_Type_Assignment ;
>>>               crm:P14_carried_out_by ex:Paphlagonian_family ;
>>>               rdfs:label "Castration gender assignment" ?] ;
>>>   rdfs:label "John the Orphanotrophos" .
>>> 
>>> I don't think it would clash with the Turtle grammar, but, it could clash with the N3 variable syntax (well, not if we require a whitespace after the "?"). On a related note, the potential "baggage" of this symbol is its association with variables.
>> 
>> I agree with the concerns w.r.t. the pipe symbol, but also w.r.t. the question mark which IMO rather rule it out. I did some more experiments, and kinda liked the ¡! combination, see the UCR example [0], with <¡ … !>, [¡ … !] and ¡! as reifier prefix (and, not shown {¡ … !}, etc. [1]).
>> 
>>>> There have been long discussions about the current syntax in GitHub issues. No one will be happy about everything in syntax discussions.
>>> 
>>> Sorry to be adding to it. This option may have already come up; if so, feel free to disregard.
>> 
>> No worries! My comment was not meant to discourage discussion, but expectations ;-)
>> 
>> Another question: is GitHub issue #51 [2] the right place to continue this discussion, although the pull request has been merged? Or is there another, more current GitHub page discussing syntax?
>> 
>> Best,
>> Thomas
>> 
>> 
>> [0] https://gist.github.com/rat10/cdf3ba60978fcdac7763d88f2ee068a2
>> [1] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Sep/0073.html
>> [2] https://github.com/w3c/rdf-turtle/pull/51
>> 
>> 
>> 
>>> 
>>> W
>>> 
>>>> On Sep 27, 2024, at 3:43 PM, Thomas Lörtsch <tl@rat.io> wrote:
>>>> 
>>>> Hi Niklas,
>>>> 
>>>> thank you for the links! I agree that these are indeed large examples, and thank you for the effort. Still, and I know that I do sound like a measly know-all when I say this, they are still very few ;-) But it’s unreasonable to expect us to get much further with example data (and if we did, it would still not be sure that we could evaluate them properly). Syntax is in a lot of ways a matter of taste and intuition. IMO it’s important to try to stick to some principles and seemingly objective criteria, however without getting hung up on those too much ;-)
>>>> 
>>>>> On 23. Sep 2024, at 17:00, Niklas Lindström <lindstream@gmail.com> wrote:
>>>>> 
>>>>> Yes, I tried out syntax variants on the UCR examples, plus a larger
>>>>> example based on the full Wikidata description about Elizabeth Taylor
>>>>> (complete with nominations, awards, spouses and nationalities (e.g.
>>>>> twice a US citizen)). For illustration, I just added a new gist with
>>>>> those updated to the new syntax:
>>>>> 
>>>>> https://gist.github.com/niklasl/c0ba767efe4816a515ad04a4db48b3e6
>>>> 
>>>> Very nice! I just converted them to my proposal from [7]:
>>>> Liz: https://gist.github.com/rat10/ddfd60afb42a8062fd7f1680ebedd022
>>>> UCR: https://gist.github.com/rat10/6c66e360c36b7d81bb3b9bc21fc16b96
>>>> 
>>>> The good news is: this is relatively easy to do :)
>>>> 
>>>> The bad news is: this does not read particularly well. In the current version (i.e. in your gist linked above) annotation syntax seems visually easier to tell apart from standard triples. The cost however, especially that it uses curly braces which should be reserved for graphs, is IMO too high. Seems to me like more thinking and tinkering is needed…
>>>> 
>>>> However, the difference is more pronounced in the Elizabeth Taylor example, which even in the current syntax is too involved to be really readable, especially because of those excruciatingly long identifiers. Some line breaks would certainly help, but I couldn’t figure out how to introduce them automatically in a sufficiently nice way (i.e. with proper indentation).
>>>> 
>>>>> (One caveat is the last UCR example using a full list in a triple
>>>>> occurrence; also mentioned in [1].)
>>>> 
>>>> Uff. I’ll comment on that in the issue itself.
>>>> 
>>>>> (The now obsolete examples I linked to from the comments on either the
>>>>> original github issue [2] or the addressing PR [3] are at [4] and [5].
>>>>> Of note in [2] is that pipe collided with SPARQL alternativePath in
>>>>> annotations; which this change fixed.)
>>>> 
>>>> I just read through [3] again and noticed a comment by Andy saying that "If we go postfix, then '~' vs '|' is pure choice" [6]. If that is indeed correct (I guess it hasn’t been tested thoroughly as the discussion from that point on favored the tilde) then it’s good to know. Aesthetically I find the tilde quite okay. However, I also have that urge to unify the syntactic variations, as outlined in [7], and in that respect the pipe seems better.
>>>> 
>>>> Best,
>>>> Thomas
>>>> 
>>>>> 
>>>>> Best regards,
>>>>> Niklas
>>>>> 
>>>>> [1]: <https://github.com/w3c/rdf-turtle/issues/71#issuecomment-2363703036>
>>>>> [2]: <https://github.com/w3c/rdf-star-wg/issues/116>
>>>>> [3]: <https://github.com/w3c/rdf-turtle/pull/51>
>>>>> [4]: <https://gist.github.com/niklasl/c23925f831950506fde4eb73885319cd>
>>>>> [5]: <https://gist.github.com/niklasl/1845c6bc8b37402cc9698720c2e22f88>
>>>> [6] https://github.com/w3c/rdf-turtle/pull/51#issuecomment-2256850306
>>>> [7] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Sep/0073.html
>>>>> 
>>>>> 
>>>>> On Fri, Sep 20, 2024 at 12:37 PM Andy Seaborne <andy@apache.org> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 20/09/2024 09:46, Thomas Lörtsch wrote:
>>>>>>> this is one of your typical "arguments": seems to look so wise, but is so vacuous all the same. if you think you know something that can only be seen in large examples, then show it or at least describe it in some detail. don't expect everybody to just believe in your wisdom
>>>>>> 
>>>>>> There have been examples done by Niklas on the visual impact of syntax
>>>>>> designs.
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 

Received on Thursday, 3 October 2024 18:50:33 UTC