- From: David Robillard <d@drobilla.net>
- Date: Thu, 15 Dec 2011 14:14:28 -0500
- To: Gregg Kellogg <gregg@kellogg-assoc.com>
- Cc: "public-rdf-comments@w3.org" <public-rdf-comments@w3.org>, Gavin Carothers <gavin@carothers.name>
On Thu, 2011-12-15 at 13:20 -0500, Gregg Kellogg wrote:
> On Dec 15, 2011, at 9:51 AM, "David Robillard" <d@drobilla.net> wrote:
>
> > On Thu, 2011-12-15 at 10:08 -0500, Gregg Kellogg wrote:
> >> I believe that grammar rule [7] predicateObjectList [1] is not LL(1)
> >> and requires look ahead to know what branch to go into. For example:
> >
> > Turtle has never been LL(1).
> >
> > You need readahead for BooleanLiteral, since "true" or "false" could
> > also be the start of a PrefixedName.
>
> Using white space to separate tokens where necessary has always been
> part of Turtle. Assuming this, Turtle (and SPARQL) is LL(1).

I suppose you mean the parser must read a token at a time, and after
reading an entire token can decide what rule applies.

Fair enough: my implementation needing readahead in this case does not
imply that Turtle is not theoretically LL(1). My mistake.

(Forgive my ignorance of the common assumptions and conventions when
using parser generators. I am assuming that my feedback, from having
hand-written a parser that very explicitly and directly maps to the
grammar, may be valuable.)

My issues admittedly stem from having originally implemented an earlier
version of the spec that, among other things, did not separate terminals
from non-terminal rules, and did not define what a "token" is at all.

I guess only terminal rules define tokens and do *not* implicitly have
inserted whitespace (whereas non-terminal rules are combinations of
tokens, which are inherently separated by whitespace). I do not see this
defined in any document cited by the spec.

Should it be precisely defined what constitutes whitespace between
tokens? There are many more Unicode whitespace characters than the ws
rule in the spec covers.

> My parser [1] is LL(1).

How do you deal with quotes in long string literals without readahead?

-dr
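
To make the question about long string literals concrete, here is a minimal
sketch in Python of a hand-written scanner for a """-delimited literal. It is
not taken from either parser discussed in this thread: the function name
scan_long_string is hypothetical, and escape handling is deliberately
simplified (the character after a backslash is kept verbatim). It only
illustrates that, at the character level, seeing one '"' is not enough to
decide whether the literal has ended; the scanner has to look at the next two
characters as well.

```python
def scan_long_string(text, pos=0):
    """Scan a long string literal that starts at text[pos] with three
    double quotes. Returns (value, next_pos).

    Hypothetical helper for illustration only; escapes are simplified.
    """
    if not text.startswith('"""', pos):
        raise ValueError("expected opening triple quote")
    i = pos + 3
    out = []
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            # Simplified escape: keep the escaped character as-is.
            if i + 1 >= len(text):
                raise ValueError("dangling backslash")
            out.append(text[i + 1])
            i += 2
        elif ch == '"':
            # Character-level readahead: one quote alone does not tell us
            # whether this is content or the start of the terminator.
            if text.startswith('"""', i):
                return "".join(out), i + 3
            out.append(ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    raise ValueError("unterminated long string")


value, end = scan_long_string('"""say "hi" now""" .')
print(value, end)
```

Whether this counts as lookahead in the LL(1) sense depends on where the line
between lexer and parser is drawn: if the whole literal is a single token, the
extra characters are consumed inside the tokenizer and the grammar itself still
only needs one token of lookahead.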
Received on Sunday, 18 December 2011 12:22:35 UTC