Re: Cheap and cheerful railroad diagrams

Steven Pemberton <steven.pemberton@cwi.nl> writes:
> I see some errors, but it's a nice start!
>  I think ONEOF is more readable than INCL, and NONEOF for EXCL

Okay. I’m still trying to decide whether it’s worth trying to get
better support in the RR generator.

>  Something goes wrong with repeat0 and repeat1

Fixed in my follow-up post, or still wrong?

>  Something goes wrong with single quotes (see string as an
>  example).

Yes. They seem to disappear entirely.
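
One way the quotes might be preserved is to choose the delimiter per
literal. This is only a sketch: it assumes the EBNF notation accepts
either quote style and has no escape mechanism, and `ebnf_literal` is a
hypothetical helper, not something in my current conversion code.

```python
import re

def ebnf_literal(text):
    """Quote text as one or more EBNF string literals (hypothetical helper)."""
    if '"' not in text:
        return f'"{text}"'
    if "'" not in text:
        return f"'{text}'"
    # Both quote styles occur: emit a sequence of literals, putting runs of
    # double quotes in single-quoted literals and everything else in
    # double-quoted ones.
    runs = re.findall(r'"+|[^"]+', text)
    return " ".join(f"'{r}'" if r.startswith('"') else f'"{r}"' for r in runs)

print(ebnf_literal("'"))                     # a lone single quote
print(ebnf_literal("EXCL: ' | #xa | #xd"))   # a label containing a single quote
print(ebnf_literal("a'b\"c"))                # both quote styles in one string
```

With something like this, a literal containing only a single quote would
come out wrapped in double quotes rather than vanishing.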

>  You could split individual members of a character set, so that
>  ["0"-"9"; "a"-"f"; "A"-"F"] appears as INCL "0"-"9"; INCL
>  "a"-"f"; INCL "A"-"F". This might give a better looking
>  diagram.

Perhaps. I should probably put quotes into the ranges too.
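
A rough sketch of what that splitting could look like, assuming the
members arrive as already-serialized strings and that no member contains
both kinds of quote (`split_inclusion` is a hypothetical name):

```python
def split_inclusion(members):
    """Render each member of an inclusion set as its own labelled literal."""
    alternatives = []
    for member in members:
        label = f"INCL: {member}"
        # Members such as "0"-"9" contain double quotes, so fall back to
        # single-quote delimiters when needed.
        quote = "'" if '"' in label else '"'
        alternatives.append(f"{quote}{label}{quote}")
    return " | ".join(alternatives)

print(split_inclusion(['"0"-"9"', '"a"-"f"', '"A"-"F"']))
```

This only makes sense for inclusions. Splitting an exclusion into
separate alternatives would change what it matches, so an EXCL set would
have to stay a single box.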

>  The rule for insertion is wrong.

Indeed. I failed to put parens around the alts.

Here’s a better version:

ixml ::= (s prolog? rule (RS rule)* s)
s ::=  ( (whitespace | comment))* 
RS ::=  ( (whitespace | comment))+ 
whitespace ::=  "INCL: Zs" | tab | lf | cr 
tab ::=  '#x9' 
lf ::=  '#xa' 
cr ::=  '#xd' 
comment ::= ("{" ( (cchar | comment))* "}")
cchar ::=  "EXCL: {}" 
prolog ::= (version s)
version ::= ("ixml" RS "version" RS string s ".")
rule ::= (( ((mark s)))? name s "INCL: =:" s alts ".")
mark ::=  "INCL: @^-" 
alts ::=  alt ( (("INCL: ;|" s)) alt)* 
alt ::= (term ( (("," s)) term)*)? 
term ::=  factor | option | repeat0 | repeat1 
factor ::=  terminal | nonterminal | insertion | ("(" s alts ")" s)
repeat0 ::= (factor (("*" s) | ("**" s sep)))
repeat1 ::= (factor (("+" s) | ("++" s sep)))
option ::= (factor "?" s)
sep ::=  factor 
nonterminal ::= (( ((mark s)))? name s)
name ::= (namestart (namefollower)* )
namestart ::=  "INCL: _ | L" 
namefollower ::=  namestart | "INCL: -.·‿⁀ | Nd | Mn" 
terminal ::=  literal | charset 
literal ::=  quoted | encoded 
quoted ::= (( ((tmark s)))? string s)
tmark ::=  "INCL: ^-" 
string ::= ('"' (dchar)+ '"') | ("'" (schar)+ "'")
dchar ::=  'EXCL: " | #xa | #xd' | ('"' '"')
schar ::=  "EXCL: ' | #xa | #xd" | ("'" "'")
encoded ::= (( ((tmark s)))? "#" hex s)
hex ::=  ('INCL: ["0"-"9"] | ["a"-"f"] | ["A"-"F"]')+ 
charset ::=  inclusion | exclusion 
inclusion ::= (( ((tmark s)))? set)
exclusion ::= (( ((tmark s)))? "~" s set)
set ::= ("[" s( ((member s)) ( (("INCL: ;|" s))  ((member s)))*)? "]" s)
member ::=  string | ("#" hex) | range | class 
range ::= (from s "-" s to)
from ::=  character 
to ::=  character 
character ::= ('"' dchar '"') | ("'" schar "'") | ("#" hex)
class ::=  code 
code ::= (capital letter? )
capital ::=  'INCL: ["A"-"Z"]' 
letter ::=  'INCL: ["a"-"z"]' 
insertion ::= ("+" s (string | ("#" hex))s)


                                        Be seeing you,
                                          norm

--
Norm Tovey-Walsh
Saxonica

Received on Thursday, 1 September 2022 13:24:09 UTC