- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 18 Dec 2012 21:55:25 -0500
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: RDF-WG <public-rdf-wg@w3.org>
I haven't tried the productions below because I think the SPARQL grammar (also Andy's) provides a simpler solution:
-[7] predicateObjectList ::= verb objectList (';' predicateObjectList?)*
+[7] predicateObjectList ::= verb objectList (';' (verb objectList)? )*
http://www.w3.org/2005/01/yacker/uploads/turtleAwesome?lang=perl&markup=html#prod-turtleAwesome-predicateObjectList
[[
<s> <p> <o> .
<s> <p> <o> ; <p> <o> .
<s> <p> <o> ; <p> <o> ; .
<s> <p> <o> ; ; <p> <o> ; ; .
]]
works per <http://w3.org/brief/MzA1>
This should also work in LL land. Greg, could you confirm?
* Andy Seaborne <andy.seaborne@epimorphics.com> [2012-12-18 17:27+0000]
> tl:dr:
> The LC publication grammar can be implemented conflict-free in LALR.
> No change to the grammar is needed.
>
> ------------------------------
>
> I have made an attempt to mock up a Turtle parser using bison and
> flex in order to investigate:
>
> [[ http://www.w3.org/2011/rdf-wg/meeting/2012-12-12#resolution_2
> 1 change from prior publication
> 2 that the WG wants "<><><>;;;." to be legal
> 3 that we will look for a grammar representation which produces no
> S/R conflicts in LL(1) or LALR(1)
> ]]
>
> The LC text is the rule:
>
> [7] predicateObjectList
> ::= verb objectList (';' predicateObjectList?)*
>
> and I split that into two pieces so that the (...)* is a separate
> rule and can be left (or right) recursive. So I think the LC
> grammar is sufficient. Efficient LALR grammars are known to be hard
> to read for most people because of the counter-intuitive reduction
> from right [1].
>
> ---------------------
> predicateObjectList
> : predicate objectList predicateObjectList2
> ;
>
> predicateObjectList2
> :
> | predicateObjectList2 SEMICOLON
> | predicateObjectList2 SEMICOLON predicate objectList
> ;
> ---------------------
>
> It's not necessary to split the rule because directly as per LC [7]
> because this seems to work:
>
> ---------------------
> predicateObjectList
> : predicate objectList
> | predicate objectList semi
> | predicate objectList semi predicateObjectList
> ;
>
> semi : SEMICOLON | semi SEMICOLON
> ---------------------
> but is right-recursive which Eric does not like. Different writing
> of the grammar into LALR have different effects for the same
> language.
>
>
> The full mock up is available at
>
> http://people.apache.org/~andy/c-ttl-3.zip
>
> and it has no conflicts. There are three versions (ttl0.y & ttl1.y
> & ttl2.y) - the extracts above are ttl1.y and ttl2.y. ttl1 is
> harder to understand but possibly more efficient at scale, depending
> on the details of the LALR generator.
>
> It is not a complete Turtle parser - it skimps on the tokens but
> does cover all the language features. It even generates triples
> (except in collections because getting the rdf:first/rdf:rest right
> is distracting from the main point), to test the language actually
> parsed.
>
> It even has the other key LALR feature - only one error message
> "syntax error" without details. Any real LALR parser would
> incorporate additional rules to capture error conditions.
>
> Andy
>
> [1]
> http://en.wikipedia.org/wiki/LALR_parser#Implementation_issues
>
> [2]
> See also: parser generators that it is claimed accept fairly EBNF
> directly: I have not tried any:
>
> http://en.wikipedia.org/wiki/Comparison_of_parser_generators#Deterministic_context-free_languages
>
--
-ericP
Received on Wednesday, 19 December 2012 02:55:56 UTC