- From: William Van Woensel <william.vanwoensel@gmail.com>
- Date: Thu, 19 Sep 2024 17:39:34 -0400
- To: James Anderson <anderson.james.1955@gmail.com>
- Cc: RDF-star Working Group <public-rdf-star-wg@w3.org>
Yes - in ANTLR4 (with separate parser & lexer) this task of identifying tokens would be delegated to the lexer.

W

> On Sep 19, 2024, at 3:40 PM, James Anderson <anderson.james.1955@gmail.com> wrote:
>
> if you have a parser which distinguishes lexical analysis then the parsing which generates the tokens is not the same as that which consumes the tokens.
> in such a system, if you require that the token allow |> have no intervening whitespace, the next immediate character suffices.
>
>
> [this independent of any judgement as to the benefits which that syntax may or may not have.]
>
>> On 19. Sep 2024, at 21:29, William Van Woensel <william.vanwoensel@gmail.com> wrote:
>>
>> I'm no expert either :-) But when the character "|" is encountered (lookahead = 1), we don't know whether it is the end of the triple or a reifier is following; we need to look one non-space character further (lookahead = 2).
>>
>> I'm unsure though - I work with ANTLR4 which can no longer be used to determine the required lookahead (I believe it was possible in ANTLR3).
>>
>>
>> W
>>
>>> On Sep 19, 2024, at 1:58 PM, Thomas Lörtsch <tl@rat.io> wrote:
>>>
>>> Hi William,
>>>
>>>> On 19. Sep 2024, at 19:08, William Van Woensel <william.vanwoensel@gmail.com> wrote:
>>>>
>>>> Hi Thomas
>>>>
>>>> My two cents - regarding the triple syntax, I think you make a good point re consistency and use of [ ].
>>>>
>>>> However:
>>>>
>>>>> <| :s :p :o | :r |> :a :b .
>>>>
>>>>
>>>> Similar to what Andy said on Enrico's proposal - if I'm not mistaken, you'd need a parser lookahead > 1 to determine whether the "|" indicates the end of the triple or the reifier. This is not the case with the tilde, as it is a different symbol.
>>>
>>>
>>> It requires a look ahead of one character: either the pipe is followed by a '>', or the next thing must be a reifier. I’m not the parser guy but my hunch is that that’s not problematic. Andy’s comment refers to a reifier at the beginning of a term - IIUC that’s a more complicated situation to deal with.
>>>
>>>> (This has been on my mind ever since Gregg Kellogg pointed this issue out for our N3 work!)
>>>
>>>
>>> Do you happen to have a pointer?
>>>
>>> Best,
>>> Thomas
>>>
>>>>
>>>> W
>>>>
>>>>> On Sep 19, 2024, at 11:57 AM, Thomas Lörtsch <tl@rat.io> wrote:
>>>>>
>>>>> Hi all,
>>>>> I know it’s not the right time to discuss syntax, but I’d like to throw this in just so that it has been mentioned, for consideration at a later and more appropriate time.
>>>>> Best, Thomas
>>>>>
>>>>>
>>>>>
>>>>> ISSUES with the current syntax, and proposals:
>>>>>
>>>>> 1)
>>>>> Not everybody is happy with the new tilde '~' character used to demarcate an explicitly provided reification identifier. Why not stick to the pipe '|' character which is used in a few places already? Uniformity helps recognition.
>>>>>
>>>>> 2)
>>>>> We recently discussed a syntactic mechanism to constrain acceptable reifications to instances of many-to-one relations. Why not go the full way and provide syntaxes for both many-to-one and many-to-many reifications?
>>>>>
>>>>> 3)
>>>>> The shorthand annotation syntax uses curly brackets, although those are commonly reserved for graphs (an informal agreement for sure, but sensible). Why not demarcate annotations with square brackets, as is customary for attribution?
>>>>>
>>>>> 4)
>>>>> The different syntactic devices - (abstract) triple terms, (unasserted) reified terms, (shorthand) annotation syntax - diverge a lot from each other syntactically, using all sorts of brackets and other special characters. Why not try to always use the pipe '|' character, uniformly hinting at an RDF-star related construct?
>>>>>
>>>>>
>>>>> TRIPLE PROPOSAL, covering the many-to-one related issues:
>>>>>
>>>>> Triple term:
>>>>>
>>>>>     :r rdf:reifies <<| :s :p :o |>> ;
>>>>>        :a :b .
>>>>>
>>>>> Reified triple term:
>>>>>
>>>>>     <| :s :p :o |> :a :b .
>>>>>     <| :s :p :o | :r |> :a :b .
>>>>>
>>>>>
>>>>> Annotation syntax for triples:
>>>>>
>>>>>     :s :p :o [| :a :b |] .
>>>>>     :s :p :o | :r1 [| :a :b |] .
>>>>>
>>>>>
>>>>>
>>>>> GRAPH PROPOSAL to cover many-to-many applications:
>>>>>
>>>>> Graph term:
>>>>>
>>>>>     :r rdf:reifies {{| :s1 :p :o .
>>>>>                        :s2 :p :o |}} ;
>>>>>        :a :b .
>>>>>
>>>>> Reified graph term:
>>>>>
>>>>>     {| :s1 :p :o .
>>>>>        :s2 :p :o |} [| :a :b |] .
>>>>>
>>>>>     {| :s1 :p :o .
>>>>>        :s2 :p :o | :r2 |} [| :a :b |] .
>>>>>
>>>>> Annotation syntax for graphs:
>>>>>
>>>>>     { :s1 :p :o .
>>>>>       :s2 :p :o } [| :a :b |] .
>>>>>
>>>>>     { :s1 :p :o .
>>>>>       :s2 :p :o | :r2 } [| :a :b |] .
>>>>>
>>>>>
>>>>> - Note that there is logic to this. One doesn’t have to remember every detail, but can deduce some variants from others.
>>>>> - Replacing the ~ by the | is not overly convincing visually, but at least it helps uniformity.
>>>>> - The [| … |] combination might not always be as readable as {| … |}, depending on font, but it aligns with other uses of square brackets in Turtle, and follows standard practice of encoding graphs and property lists. Imagine an annotation syntax for graphs that uses { … } for the graph and {| … |} for the annotation part - that might be quite irritating.
>>>>> - So it’s not all rosy, but IMHO it’s still an improvement.
>>>>>
>>>>>
>>>>> Remark: the following two ways to define :r would be completely equivalent:
>>>>>     :r rdf:reifies {{| :s1 :p :o .
>>>>>                        :s2 :p :o |}} .
>>>>>     :r rdf:reifies <<| :s1 :p :o |>>,
>>>>>                    <<| :s2 :p :o |>> .
>>
>>
>
> ---
> james anderson | james@dydra.com | https://dydra.com
>
>
>
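
[Illustration of the lexer point discussed in this thread. This is a minimal sketch, not the Turtle or ANTLR4 grammar: the function name `tokenize` is hypothetical, IRIs in angle brackets are ignored, and it assumes that `<|` and `|>` must be written without intervening whitespace. Under that assumption the lexer can decide between `|` and `|>` by looking at the single next character, and the parser then sees two distinct token types.]

```python
def tokenize(text):
    """Split a simplified reified-triple string into tokens.

    Hypothetical helper: uses exactly one character of lookahead
    to tell the closing token "|>" apart from the separator "|".
    """
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        # One character of lookahead ("" at end of input).
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch.isspace():
            i += 1
        elif ch == "<" and nxt == "|":
            tokens.append("<|")
            i += 2
        elif ch == "|" and nxt == ">":
            tokens.append("|>")
            i += 2
        elif ch == "|":
            tokens.append("|")
            i += 1
        else:
            # Any other run of characters up to whitespace or a pipe.
            j = i
            while j < len(text) and not text[j].isspace() and text[j] != "|":
                j += 1
            tokens.append(text[i:j])
            i = j
    return tokens


with_reifier = tokenize("<| :s :p :o | :r |> :a :b .")
without_reifier = tokenize("<| :s :p :o |> :a :b .")
```

[With tokens produced this way, a parser that has just read the object `:o` only needs to inspect the next token: `|` means a reifier follows, `|>` means the reified triple ends. If whitespace were allowed between `|` and `>`, this sketch would no longer be sufficient.]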
Received on Thursday, 19 September 2024 21:39:51 UTC