- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Sun, 21 Aug 2005 17:38:58 +0100
- To: Richard Newman <holygoat@gmail.com>
- Cc: Tim Berners-Lee <timbl@w3.org>, public-rdf-dawg-comments@w3.org, Yosi Scharf <syosi@mit.edu>
Richard Newman wrote:
>
> On 19 Aug 2005, at 04:12, Tim Berners-Lee wrote:
>
>> Richard,
>>
>> I didn't realize the grammar in the spec is machine-generated.
>> Maybe it should be hand-edited and everything else
>> generated from it.
>
> I think that would be a good idea from one point of view (mine and
> yours, certainly!), but we'd have to see what the current maintainers
> of the SPARQL grammar think.
>
>> Yosi (on vacation right now) has generated (with a small hand tweak)
>> the CFG grammar in RDF from the spec. (See sparql* in
>> http://www.w3.org/2000/10/swap/grammar/
>> ) This is in plain BNF ( cfg:mustBeOneSequence properties
>> with nested RDF collections )
>>
>> See the bnf.n3 ontology in that directory as well as
>> the bnf-rules.n3 which go from some forms of ebnf to bnf,
>> also in that directory.
>
> Very handy (and pretty cool!). As it seems the tools are in place, it
> would be nice to have a machine-readable 'spec' grammar that could be
> re-purposed into presentation EBNF, JavaCC, plain BNF, etc. -- this
> would certainly save me a lot of work whenever the grammar changes!
>
> It is also nice, in an "eating one's own dog food" way, to have the
> grammar itself in RDF.
>
> -R
>

This is not a response to the comment - just a description of some details in case it helps.

The grammar is written using JavaCC, which, while an LL parser generator, also provides tools to do lookahead checking. JavaCC also provides a text output format. A script converts the JavaCC text output to the HTML for the document, although the tokens have to be described manually. The process converts JavaCC syntax to the EBNF syntax described in http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-notation.

The grammar in JavaCC is not quite LL(1): a lookahead of 2 is needed at the Triples production, which is related to the optional dots Richard commented on. The document grammar is also fed into yacker (a W3C tool), which checks that it can be converted to bison/flex (LALR(1)).
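To illustrate the lookahead point, here is a minimal Python sketch. It is not the SPARQL grammar or JavaCC output: the toy rule `Triples ::= Triple ( '.' Triple )* '.'?`, the function name `parse_triples` and the closing `}` token are all assumptions made for the example. On seeing a dot, one token of lookahead cannot tell a separator from the optional trailing dot, so the parser peeks at the token after it.

```python
def parse_triples(tokens):
    """Parse the toy rule  Triples ::= Triple ( '.' Triple )* '.'?

    A Triple is assumed to be three plain terms; '}' is assumed to close
    the enclosing block. Returns (list_of_triples, next_position).
    """
    def parse_triple(pos):
        terms = tokens[pos:pos + 3]
        if len(terms) != 3 or "." in terms or "}" in terms:
            raise SyntaxError(f"expected three terms at token {pos}")
        return tuple(terms), pos + 3

    triple, pos = parse_triple(0)
    triples = [triple]
    while pos < len(tokens) and tokens[pos] == ".":
        # The current token ('.') alone is ambiguous: it may separate two
        # triples or be the optional trailing dot. Peek one token further.
        following = tokens[pos + 1] if pos + 1 < len(tokens) else None
        if following is None or following == "}":
            pos += 1  # trailing dot: consume it and stop
            break
        triple, pos = parse_triple(pos + 1)  # separator: another triple follows
        triples.append(triple)
    return triples, pos
```

For example, `parse_triples(["s", "p", "o", ".", "s2", "p2", "o2", ".", "}"])` treats the first dot as a separator and the second as the trailing dot, stopping at the `}`.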
There are trade-offs between readability by humans and processability by machines in the current grammar. Some people find that the weighting towards a machine-processable grammar makes the grammar unclear (e.g. the use of recursive rules rather than repetition).

Andy
Received on Sunday, 21 August 2005 16:39:06 UTC