- From: Gregg Kellogg <gregg@kellogg-assoc.com>
- Date: Thu, 15 Dec 2011 13:20:05 -0500
- To: David Robillard <d@drobilla.net>
- CC: "public-rdf-comments@w3.org" <public-rdf-comments@w3.org>, Gavin Carothers <gavin@carothers.name>
On Dec 15, 2011, at 9:51 AM, "David Robillard" <d@drobilla.net> wrote:

> On Thu, 2011-12-15 at 10:08 -0500, Gregg Kellogg wrote:
>> I believe that grammar rule [7] predicateObjectList [1] is not LL(1) and requires look ahead to know what branch to go into. For example:
>
> Turtle has never been LL(1).
>
> You need readahead for BooleanLiteral, since "true" or "false" could
> also be the start of a PrefixedName.

Using white space to separate tokens where necessary has always been part of Turtle. Assuming this, Turtle (and SPARQL) is LL(1). My parser [1] is LL(1).

Gregg

1: http://github.com/rdf-turtle

> This is the worst case, 6 character readahead.
>
> Similarly,
>
> [9] verb ::= predicate | 'a'
>
> Requires a 2 character readahead (to check if the 'a' is followed by
> whitespace, since 'a' can start a predicate).
>
> In general, qualified names and keywords are ambiguous while parsing.
> IMO either qualified names should have had quoting ("[foo:bar]",
> perhaps), or the special keywords ("a", "true", "false") should have had
> a unique prefix character, which would solve this problem and make the
> grammar extensible, perhaps even 'dynamically' via a @keyword directive.
> It's too late for that now, however.
>
> I also had to use it in my parser to correctly handle quote characters
> in long string literals, since you can read up to 2 of them and have it
> not terminate the string, i.e. every time you encounter a quote you must
> read ahead 3 characters to determine if this is the end of the string
> literal. I don't see how this could have been avoided, other than
> simply making single quote strings be long literals, but this would have
> meant quotes would always need escaping in a string literal.
>
> -dr
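
For illustration, the two character-level readahead cases quoted above (the 'a' keyword and the closing quotes of a long string literal) might be sketched roughly as below. This is not code from either parser mentioned in this thread: the function names `is_a_keyword` and `scan_long_string` are hypothetical, backslash escapes are ignored, and only whitespace or end of input is treated as ending the 'a' keyword.

```python
# Illustrative sketch only; names and simplifications are assumptions,
# not taken from either parser discussed in this thread.

def is_a_keyword(text, pos):
    """Return True if the 'a' at `pos` is the keyword rather than the
    start of a prefixed name. Assumption: the keyword is followed by
    whitespace or end of input (one extra character of readahead)."""
    if text[pos:pos + 1] != "a":
        return False
    nxt = text[pos + 1:pos + 2]
    return nxt == "" or nxt.isspace()


def scan_long_string(text, start):
    """Scan a long string literal opening with three double quotes at
    `start`. Returns (value, index just past the closing quotes).
    Simplification: backslash escapes are not handled."""
    if not text.startswith('"""', start):
        raise ValueError('expected opening """')
    i = start + 3
    while i < len(text):
        # On each quote, read ahead 3 characters: one or two quotes alone
        # do not terminate the literal, only three in a row do.
        if text[i] == '"' and text.startswith('"""', i):
            return text[start + 3:i], i + 3
        i += 1
    raise ValueError("unterminated long string literal")
```

With whitespace-delimited tokens, as in the reply above, a tokenizer can make the first decision before the parser sees the token, which is one way to read the claim that the grammar itself stays LL(1).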
Received on Thursday, 15 December 2011 18:21:02 UTC