Re: Identification of root rule from Steven Pemberton on 2023-10-17 (public-ixml@w3.org from October 2023)

From: Steven Pemberton <steven.pemberton@cwi.nl>
Date: Tue, 17 Oct 2023 13:58:34 +0000
To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
Cc: public-ixml@w3.org
Message-Id: <1697551008629.1320865410.1377276260@cwi.nl>

With almost all grammars it is easy to identify the root. It's the only 
unused rule.

In a tiny number of extreme cases where the root is used in the body of the 
grammar, you can add:

 -root: program.

(where 'program' is the real root') and all is well.

Steven

On Tuesday 17 October 2023 15:39:22 (+02:00), C. M. Sperberg-McQueen wrote:

 > 
 > Steven Pemberton <steven.pemberton@cwi.nl> writes:
 > 
 > > I noted a user of my implementation having lots of trouble this week,
 > > which they were unable to resolve.
 > >
 > > My original implementation identified the top-level rule by analysing
 > > the grammar, but we later resolved that the top-level rule had to be
 > > the first rule in the grammar.
 > 
 > Is there a reliable way to determine the start symbol by analysing the
 > productions?
 > 
 > I think there is not.  So either there must be a convention like the one
 > we use (parallel, if memory serves, to conventions in some other parsing
 > systems), or there must be additional syntax for identifying the start
 > symbol.
 > 
 > > This user apparently comes from a define-before-use background, and so
 > > consistently had the root rule as the last in the file. As a result,
 > > they didn't manage to get a single successful result.
 > >
 > > I'm not sure what to make of this. On the one hand, the spec clearly 
says:
 > >
 > >  "The root symbol of the grammar is the name of the first rule
 > >  in the grammar."
 > >
 > >
 > > On the other hand, I feel bad for the user; I think notations should
 > > try to serve users, and not the other way round: usability
 > > first.
 > 
 > I feel bad for them, too.  If I remember correctly, the Algol 60 report
 > also works bottom up, with a definition-before-use organizing principle.
 > 
 > > Which is why I did my original implementation that way.
 > 
 > I wonder whether you had any user who defined a grammar like
 > 
 >     A = B; 'a'.
 >     B = A; 'b'.
 > 
 > If our plan is to identify the start symbol by selecting a terminal not
 > referred to, we are doomed to disappointment in this or in any grammar
 > where the start symbol is recursive.  If we try to avoid that
 > disappointment by requiring that the start symbol not be recursive, we
 > interfere with the design of the grammar in a really big way.
 > 
 > > Anyway, it's a potential discussion point.
 > 
 > Yes, agreed.
 >

Received on Tuesday, 17 October 2023 13:58:47 UTC