nested alts and EBNF/BNF conversion (pull request #146)

Two points occur to me which may be worth mentioning in connection with
PR #146:

First: Neither the spec nor the draft paper seem to mention nested sets
of alternatives.  As I understand the term 'BNF', the grammar

    S = 'a', ('b' | 'c'+), 'd'.

is not in BNF.  And it is certainly not handled directly by Earley's
description of his parsing algorithm.

Perhaps it should be mentioned?


Second: I have found it informative to try to make a compact list of the
various rewrite rules proposed.  Perhaps others will also find it
useful.

Ideally, perhaps, ths should all be in a table, so that any given set of
rewrite rules can be read as a row, and all the rewrites proposed for
any given construction can be read in one column.  Here, I will use a
column-major presentation, and label each rewrite with a code:

  A = the spec of January 2021
  B = ixml 1.0
  C = the 'BTW' rules given in the draft
  D = the 'JωL' rule given in the draft
  E = the rules listed in one of my comments on the draft
  F = the rules implicit in the bnf-from-ebnf-tc.xsl stylesheet in
      my Gingersnap library

I omit the '-' marks to reduce visual clutter.  But since it may matter
in some implementations, I treat an empty alt as different from an alt
containing an empty alts (see A and B for f? for an example), and
similarly the order of alternants.

For the most part, all the rule sets replace non-BNF constructs with a
single non-terminal and differ only in how they define that nonterminal;
for the repetition operators, however, F rewrites the repetition
expression with a choice, not with a single nonterminal.  (The attentive
reader may wonder why F is so verbose in its rewrites of repetitions;
the answer is "test-coverage".)

................................................................

{ABCDEF} f? => f-option

  {A} f-option = f | . 
  {B} f-option = f | (). 
  {E} f-option = () | f. 
  {F} f-option =  | f. 

................................................................

{ABCDE} f* => f-star

  {A} f-star = f, f-star | .
  {B} f-star = (f, f-star)?.
  {C} f-star = f+ | ().
  {D} f-star = f, f-star | ().
  {E} f-star = () | f, f-star.
  {E} f-star = () | f-star, f.

{F} f* => ( | f, more-f)

  {F} more-f = | f, more-f.

................................................................

{ABCDE} f+ => f-plus

  {A}  f-plus = f, f-plus-option .
  {B}  f-plus = f, f*.
  {CE} f-plus = f | f, f-plus.
  {E}  f-plus = f | f-plus, f.

  {A} f-plus-option = f-plus | .

{F} f+ => (f | f, f, more-f) { more-f as above }

................................................................

{ABCDE} f**s => f-star-sep

  {A} f-star-sep = f-plus-sep | . { f-plus-sep as below }
  {B} f-star-sep = (f++s)?.
  {C} f-star-sep = f-plus-sep | (). { f-plus-sep as below }
  {E} f-star-sep = f, more-s-f | ().

  {E} more-s-f = () | s, f, more-s-f.

{F} f**s => ( | f, more-s-f)

  {F} more-s-f- = | s, f, more-s-f-.

................................................................

{ABCDE} f++s => f-plus-sep

  {A} f-plus-sep = f, sep-part-option . 
  {B} f-plus-sep = f, (s, f)*.
  {E} f-plus-sep = f, more-s-f. { more-s-f as above }
  
  {A} sep-part-option = sep, f-plus-sep | .

{F} f++s => (f | f, s, f, more-s-f) { more-s-f- as above }

................................................................

{F} (f | g | ...) => f-g-...-alt

    {F} f-g-....-alt = f | g | ...

................................................................

{F} ((f | g | ...)) => (f | g | ...)


-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

Received on Tuesday, 11 October 2022 20:03:52 UTC