- From: David Robillard <d@drobilla.net>
- Date: Thu, 15 Dec 2011 12:48:52 -0500
- To: Gregg Kellogg <gregg@kellogg-assoc.com>
- Cc: "public-rdf-comments@w3.org" <public-rdf-comments@w3.org>, Gavin Carothers <gavin@carothers.name>
On Thu, 2011-12-15 at 10:08 -0500, Gregg Kellogg wrote:
> I believe that grammar rule [7] predicateObjectList [1] is not LL(1) and requires look ahead to know what branch to go into. For example:

Turtle has never been LL(1). You need readahead for BooleanLiteral, since "true" or "false" could also be the start of a PrefixedName. This is the worst case: 6 characters of readahead.

Similarly,

[9] verb ::= predicate | 'a'

requires a 2-character readahead (to check whether the 'a' is followed by whitespace, since 'a' can also start a predicate).

In general, qualified names and keywords are ambiguous while parsing. IMO either qualified names should have had quoting ("[foo:bar]", perhaps), or the special keywords ("a", "true", "false") should have had a unique prefix character. Either would solve this problem and make the grammar extensible, perhaps even 'dynamically' via a @keyword directive. It's too late for that now, however.

I also had to use readahead in my parser to correctly handle quote characters in long string literals, since you can read up to 2 quotes in a row without terminating the string; i.e., every time you encounter a quote you must read ahead 3 characters to determine whether this is the end of the string literal. I don't see how this could have been avoided, other than simply making single quote strings be long literals, but that would have meant quotes would always need escaping in a string literal.

-dr
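As a rough illustration of the readahead described above, here is a minimal Python sketch. It is not the parser discussed in this message: the function names `read_long_string` and `is_a_keyword` are hypothetical, escape sequences are kept verbatim rather than decoded, and the 'a' check follows the simplification stated in the message (keyword only when followed by whitespace).

```python
LONG_DELIM = '"""'  # delimiter of a Turtle long string literal


def read_long_string(text, start):
    # Scan a long string literal whose opening delimiter begins at `start`.
    # Returns (contents, index just past the closing delimiter).
    if not text.startswith(LONG_DELIM, start):
        raise ValueError("expected opening long-string delimiter")
    i = start + len(LONG_DELIM)
    out = []
    while i < len(text):
        if text[i] == "\\":
            # Assumption: keep the escape pair verbatim; decoding is out of scope.
            if i + 1 >= len(text):
                raise ValueError("dangling backslash")
            out.append(text[i:i + 2])
            i += 2
        elif text.startswith(LONG_DELIM, i):
            # A quote only ends the literal if the 3-character window is all quotes.
            return "".join(out), i + len(LONG_DELIM)
        else:
            # Includes lone quotes and pairs of quotes, which are ordinary content.
            out.append(text[i])
            i += 1
    raise ValueError("unterminated long string literal")


def is_a_keyword(text, i):
    # 2-character readahead: 'a' is the keyword verb only if whitespace follows;
    # otherwise it may be the start of a prefixed name such as a:b or abc:def.
    return text.startswith("a", i) and i + 1 < len(text) and text[i + 1].isspace()
```

For example, scanning `"""say ""hi"" now""" .` with `read_long_string(text, 0)` keeps the doubled quotes around `hi` as content and stops only at the final run of three quotes.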
Received on Sunday, 18 December 2011 12:22:34 UTC