
Re: Please make sure the grammar is directly machine consumable.

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Sun, 21 Aug 2005 17:38:58 +0100
Message-ID: <4308AE22.3080204@hp.com>
To: Richard Newman <holygoat@gmail.com>
Cc: Tim Berners-Lee <timbl@w3.org>, public-rdf-dawg-comments@w3.org, Yosi Scharf <syosi@mit.edu>

Richard Newman wrote:
> On 19 Aug 2005, at 04:12, Tim Berners-Lee wrote:
>> Richard,
>> I didn't realize the grammar in the spec is machine-generated.
>> Maybe it should be hand-edited and everything else
>> generated from it.
> I think that would be a good idea from one point of view (mine and  
> yours, certainly!), but we'd have to see what the current maintainers  
> of the SPARQL grammar think.
>> Yosi (on vacation right now) has generated (with a small hand tweak)
>> the CFG grammar in RDF from the spec.   (See sparql* in
>> http://www.w3.org/2000/10/swap/grammar/
>> )  This is in plain BNF (  cfg:mustBeOneSequence properties
>> with nested RDF collections )
>> See the bnf.n3 ontology in that directory as well as
>> the bnf-rules.n3 which go from some forms of ebnf to bnf,
>> also in that directory.
> Very handy (and pretty cool!). As it seems the tools are in place, it  
> would be nice to have a machine-readable 'spec' grammar that could be  
> re-purposed into presentation EBNF, JavaCC, plain BNF, etc. -- this  
> would certainly save me a lot of work whenever the grammar changes!
> It is also nice, in an "eating one's own dog food" way, to have the  
> grammar itself in RDF.
> -R

This is not a response to the comment - just a description of some details 
in case it helps.

The grammar is written using JavaCC, which, while an LL parser generator, 
also provides tools to do lookahead (LA) checking.  JavaCC also provides a text output 

The JavaCC text output is converted to the HTML for the document by a script, 
although the tokens have to be described manually.  The process 
converts JavaCC syntax to the EBNF syntax as described in

The grammar in JavaCC is not quite LL(1): there is a lookahead of 2 at 
the Triples production, related to the optional dots Richard commented on. 
The document grammar is also fed into yacker (a W3C tool), which checks for 
conversion to bison/flex (LALR(1)).
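
As an illustration of why an optional trailing dot can push a grammar past
LL(1), here is a minimal sketch of a hand-written parser. It assumes a
simplified, hypothetical rule, Triples ::= Triple ( '.' Triple )* '.'?, which
is not the actual SPARQL production; the function name parse_triples and the
one-token-per-triple representation are likewise assumptions made for brevity.

```python
def parse_triples(tokens):
    """Parse the hypothetical rule: Triples ::= Triple ( '.' Triple )* '.'?

    Each triple is represented by a single token for brevity.
    Returns (list_of_triples, position_after_the_production).
    """
    pos = 0
    triples = []

    def is_triple(i):
        # Anything that is not a dot or a closing brace counts as a triple here.
        return i < len(tokens) and tokens[i] not in (".", "}")

    if not is_triple(pos):
        raise SyntaxError("expected a triple")
    triples.append(tokens[pos])
    pos += 1

    while pos < len(tokens) and tokens[pos] == ".":
        # One token of lookahead only says "there is a dot".
        # The token after the dot decides: another triple, or a trailing dot?
        if is_triple(pos + 1):
            triples.append(tokens[pos + 1])
            pos += 2
        else:
            pos += 1  # consume the optional trailing dot and stop
            break
    return triples, pos
```

The check on tokens[pos + 1] is the second token of lookahead: with only the
dot in view, the parser cannot tell the separator from the optional terminator.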

There is a trade-off between readability by humans and processability by 
machines in the current grammar.  Some people find that the weighting towards a 
machine-processable grammar makes the grammar unclear (e.g. the use of 
recursive rules rather than repetition).
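
To make the recursion-versus-repetition point concrete, here is a small sketch
with two parsers for the same comma-separated list. The rules are hypothetical
and not taken from the SPARQL grammar: the repetition form is
List ::= Item ( ',' Item )*, and the recursive form is List ::= Item ListTail
with ListTail ::= ',' Item ListTail | (empty). All function names are
assumptions for illustration.

```python
def parse_list_repetition(tokens):
    """List ::= Item ( ',' Item )*   (EBNF repetition, maps to a loop)."""
    items = [tokens[0]]
    pos = 1
    while pos < len(tokens) and tokens[pos] == ",":
        items.append(tokens[pos + 1])
        pos += 2
    return items


def parse_list_recursive(tokens, pos=0):
    """List ::= Item ListTail   (plain BNF, maps to a recursive call)."""
    return [tokens[pos]] + _list_tail(tokens, pos + 1)


def _list_tail(tokens, pos):
    """ListTail ::= ',' Item ListTail | (empty)."""
    if pos < len(tokens) and tokens[pos] == ",":
        return [tokens[pos + 1]] + _list_tail(tokens, pos + 2)
    return []  # the empty alternative
```

Both functions accept the same input and return the same items. The recursive
form needs an extra helper rule, which is the kind of indirection some readers
find harder to follow.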

Received on Sunday, 21 August 2005 16:39:06 UTC
