- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 18 Dec 2012 21:55:25 -0500
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: RDF-WG <public-rdf-wg@w3.org>
I haven't tried the productions below because I think the SPARQL grammar (also Andy's) provides a simpler solution: -[7] predicateObjectList ::= verb objectList (';' predicateObjectList?)* +[7] predicateObjectList ::= verb objectList (';' (verb objectList)? )* http://www.w3.org/2005/01/yacker/uploads/turtleAwesome?lang=perl&markup=html#prod-turtleAwesome-predicateObjectList [[ <s> <p> <o> . <s> <p> <o> ; <p> <o> . <s> <p> <o> ; <p> <o> ; . <s> <p> <o> ; ; <p> <o> ; ; . ]] works per <http://w3.org/brief/MzA1> This should also work in LL land. Greg, could you confirm? * Andy Seaborne <andy.seaborne@epimorphics.com> [2012-12-18 17:27+0000] > tl:dr: > The LC publication grammar can be implemented conflict-free in LALR. > No change to the grammar is needed. > > ------------------------------ > > I have made an attempt to mock up a Turtle parser using bison and > flex in order to investigate: > > [[ http://www.w3.org/2011/rdf-wg/meeting/2012-12-12#resolution_2 > 1 change from prior publication > 2 that the WG wants "<><><>;;;." to be legal > 3 that we will look for a grammar representation which produces no > S/R conflicts in LL(1) or LALR(1) > ]] > > The LC text is the rule: > > [7] predicateObjectList > ::= verb objectList (';' predicateObjectList?)* > > and I split that into two pieces so that the (...)* is a separate > rule and can be left (or right) recursive. So I think the LC > grammar is sufficient. Efficient LALR grammars are known to be hard > to read for most people because of the counter-intuitive reduction > from right [1]. > > --------------------- > predicateObjectList > : predicate objectList predicateObjectList2 > ; > > predicateObjectList2 > : > | predicateObjectList2 SEMICOLON > | predicateObjectList2 SEMICOLON predicate objectList > ; > --------------------- > > It's not necessary to split the rule because directly as per LC [7] > because this seems to work: > > --------------------- > predicateObjectList > : predicate objectList > | predicate objectList semi > | predicate objectList semi predicateObjectList > ; > > semi : SEMICOLON | semi SEMICOLON > --------------------- > but is right-recursive which Eric does not like. Different writing > of the grammar into LALR have different effects for the same > language. > > > The full mock up is available at > > http://people.apache.org/~andy/c-ttl-3.zip > > and it has no conflicts. There are three versions (ttl0.y & ttl1.y > & ttl2.y) - the extracts above are ttl1.y and ttl2.y. ttl1 is > harder to understand but possibly more efficient at scale, depending > on the details of the LALR generator. > > It is not a complete Turtle parser - it skimps on the tokens but > does cover all the language features. It even generates triples > (except in collections because getting the rdf:first/rdf:rest right > is distracting from the main point), to test the language actually > parsed. > > It even has the other key LALR feature - only one error message > "syntax error" without details. Any real LALR parser would > incorporate additional rules to capture error conditions. > > Andy > > [1] > http://en.wikipedia.org/wiki/LALR_parser#Implementation_issues > > [2] > See also: parser generators that it is claimed accept fairly EBNF > directly: I have not tried any: > > http://en.wikipedia.org/wiki/Comparison_of_parser_generators#Deterministic_context-free_languages > -- -ericP
Received on Wednesday, 19 December 2012 02:55:56 UTC