- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Wed, 19 Dec 2012 08:28:16 -0500
- To: Gregg Kellogg <gregg@greggkellogg.net>
- Cc: Andy Seaborne <andy.seaborne@epimorphics.com>, RDF-WG <public-rdf-wg@w3.org>
* Gregg Kellogg <gregg@greggkellogg.net> [2012-12-19 02:07-0500] > On Dec 18, 2012, at 6:55 PM, Eric Prud'hommeaux <eric@w3.org> wrote: > > > I haven't tried the productions below because I think the SPARQL grammar (also Andy's) provides a simpler solution: > > > > -[7] predicateObjectList ::= verb objectList (';' predicateObjectList?)* > > +[7] predicateObjectList ::= verb objectList (';' (verb objectList)? )* > > > > http://www.w3.org/2005/01/yacker/uploads/turtleAwesome?lang=perl&markup=html#prod-turtleAwesome-predicateObjectList > > > > [[ > > <s> <p> <o> . > > <s> <p> <o> ; <p> <o> . > > <s> <p> <o> ; <p> <o> ; . > > <s> <p> <o> ; ; <p> <o> ; ; . > > ]] > > works per <http://w3.org/brief/MzA1> > > This should also work in LL land. Greg, could you confirm? > > That's the original version of this rule, and the one I use in my purely LL(1) parser. It is indeed simpler, the only reason to do the other is if you wanted a rule to disallow multiple semi-colons, which we decided not to do. Yeah, I poked around at the lineage last night. Turtle had it's own flavor from way back: http://www.dajobe.org/2004/01/turtle/2006-12-04/#predicateObjectList [5] predicateObjectList ::= verb ws+ objectList ( ws* ';' ws* verb ws+ objectList )* (ws* ';')? http://www.w3.org/TeamSubmission/turtle/#predicateObjectList [7] predicateObjectList ::= verb objectList ( ';' verb objectList )* ( ';')? http://www.w3.org/TR/2011/WD-turtle-20110809/#prod-turtle2-predicateObjectList [7] predicateObjectList ::= verb objectList ( ";" verb objectList )* (";")? http://www.w3.org/TR/2012/WD-turtle-20120710/#grammar-production-predicateObjectList [7] predicateObjectList ::= verb objectList (';' predicateObjectList?)* http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/index.html#grammar-production-predicateObjectList [7] predicateObjectList ::= verb objectList (';' predicateObjectList?)* SPARQL picked up the repeated ';'s around March 2007 (4 years before the Turtle Submission) http://www.w3.org/TR/2007/WD-rdf-sparql-query-20070326/#rPropertyListNotEmpty [33] PropertyListNotEmpty ::= Verb ObjectList ( ';' ( Verb ObjectList )? )* Anyways, the ;;; change permits us to align the two specs and simplify not just our tasks. > Gregg > > > * Andy Seaborne <andy.seaborne@epimorphics.com> [2012-12-18 17:27+0000] > >> tl:dr: > >> The LC publication grammar can be implemented conflict-free in LALR. > >> No change to the grammar is needed. > >> > >> ------------------------------ > >> > >> I have made an attempt to mock up a Turtle parser using bison and > >> flex in order to investigate: > >> > >> [[ http://www.w3.org/2011/rdf-wg/meeting/2012-12-12#resolution_2 > >> 1 change from prior publication > >> 2 that the WG wants "<><><>;;;." to be legal > >> 3 that we will look for a grammar representation which produces no > >> S/R conflicts in LL(1) or LALR(1) > >> ]] > >> > >> The LC text is the rule: > >> > >> [7] predicateObjectList > >> ::= verb objectList (';' predicateObjectList?)* > >> > >> and I split that into two pieces so that the (...)* is a separate > >> rule and can be left (or right) recursive. So I think the LC > >> grammar is sufficient. Efficient LALR grammars are known to be hard > >> to read for most people because of the counter-intuitive reduction > >> from right [1]. > >> > >> --------------------- > >> predicateObjectList > >> : predicate objectList predicateObjectList2 > >> ; > >> > >> predicateObjectList2 > >> : > >> | predicateObjectList2 SEMICOLON > >> | predicateObjectList2 SEMICOLON predicate objectList > >> ; > >> --------------------- > >> > >> It's not necessary to split the rule because directly as per LC [7] > >> because this seems to work: > >> > >> --------------------- > >> predicateObjectList > >> : predicate objectList > >> | predicate objectList semi > >> | predicate objectList semi predicateObjectList > >> ; > >> > >> semi : SEMICOLON | semi SEMICOLON > >> --------------------- > >> but is right-recursive which Eric does not like. Different writing > >> of the grammar into LALR have different effects for the same > >> language. > >> > >> > >> The full mock up is available at > >> > >> http://people.apache.org/~andy/c-ttl-3.zip > >> > >> and it has no conflicts. There are three versions (ttl0.y & ttl1.y > >> & ttl2.y) - the extracts above are ttl1.y and ttl2.y. ttl1 is > >> harder to understand but possibly more efficient at scale, depending > >> on the details of the LALR generator. > >> > >> It is not a complete Turtle parser - it skimps on the tokens but > >> does cover all the language features. It even generates triples > >> (except in collections because getting the rdf:first/rdf:rest right > >> is distracting from the main point), to test the language actually > >> parsed. > >> > >> It even has the other key LALR feature - only one error message > >> "syntax error" without details. Any real LALR parser would > >> incorporate additional rules to capture error conditions. > >> > >> Andy > >> > >> [1] > >> http://en.wikipedia.org/wiki/LALR_parser#Implementation_issues > >> > >> [2] > >> See also: parser generators that it is claimed accept fairly EBNF > >> directly: I have not tried any: > >> > >> http://en.wikipedia.org/wiki/Comparison_of_parser_generators#Deterministic_context-free_languages > >> > > > > -- > > -ericP > > > -- -ericP
Received on Wednesday, 19 December 2012 13:28:47 UTC