- From: Norm Tovey-Walsh <norm@saxonica.com>
- Date: Thu, 01 Sep 2022 10:14:44 +0100
- To: ixml <public-ixml@w3.org>
- Message-ID: <m24jxrjvei.fsf@saxonica.com>
Hi folks,

Someone asked me (off list) about making railroad diagrams for ixml grammars. They pointed me to Gunther Rademacher’s online tool:

  https://www.bottlecaps.de/rr/ui

That tool creates diagrams by parsing the W3C EBNF format. I did a little hacking by hand and produced a couple of halfway interesting diagrams.

The W3C EBNF doesn’t have anything like inclusions or exclusions or Unicode character classes. I haven’t opened up the hood on the RR diagram library to see how hard it would be to create proper shapes for those constructs. Instead, I just fake them as strings.

  ["-.·‿⁀"; Nd; Mn]

becomes a literal string in the RR diagram:

  INCL: "-.·‿⁀" | Nd | Mn

It’s not ideal, but I don’t think it’s *too* bad. I’m also just ignoring all the marks in the EBNF because I’m not sure how to represent them.

The obvious next step was to write an XSLT transformation to produce the approximated EBNF. Here’s ixml.ebnf that you can paste into the RR tool to see the diagrams:

  ixml ::= (s prolog? rule (RS rule)* s)
  s ::= (whitespace | comment ( whitespace | comment)*)?
  RS ::= whitespace | comment ( whitespace | comment)*
  whitespace ::= "INCL: Zs" | tab | lf | cr
  tab ::= '#x9'
  lf ::= '#xa'
  cr ::= '#xd'
  comment ::= ('{'(cchar | comment ( cchar | comment)*)? '}')
  cchar ::= 'EXCL: "{}"'
  prolog ::= (version s)
  version ::= ('ixml' RS 'version' RS string s '.')
  rule ::= (((mark s))? name s 'INCL: "=:"' s alts '.')
  mark ::= 'INCL: "@^-"'
  alts ::= alt (('INCL: ";|"' s) alt)*
  alt ::= (term ((',' s) term)*)?
  term ::= factor | option | repeat0 | repeat1
  factor ::= terminal | nonterminal | insertion | ('(' s alts ')' s)
  repeat0 ::= (factor('*' s) | ('**' s sep))
  repeat1 ::= (factor('+' s) | ('++' s sep))
  option ::= (factor '?' s)
  sep ::= factor
  nonterminal ::= (((mark s))? name s)
  name ::= (namestart(namefollower ( namefollower)*)? )
  namestart ::= 'INCL: "_" | L'
  namefollower ::= namestart | 'INCL: "-.·‿⁀" | Nd | Mn'
  terminal ::= literal | charset
  literal ::= quoted | encoded
  quoted ::= (((tmark s))? string s)
  tmark ::= 'INCL: "^-"'
  string ::= ('"' dchar ( dchar)* '"') | (''' schar ( schar)* ''')
  dchar ::= "EXCL: '#22' | #xa | #xd" | ('"' '"')
  schar ::= "EXCL: #22'#22 | #xa | #xd" | (''' ''')
  encoded ::= (((tmark s))? '#' hex s)
  hex ::= "INCL: [0-9] | [a-f] | [A-F]" ( "INCL: [0-9] | [a-f] | [A-F]")*
  charset ::= inclusion | exclusion
  inclusion ::= (((tmark s))? set)
  exclusion ::= (((tmark s))? '~' s set)
  set ::= ('[' s((member s) (('INCL: ";|"' s) (member s))*)? ']' s)
  member ::= string | ('#' hex) | range | class
  range ::= (from s '-' s to)
  from ::= character
  to ::= character
  character ::= ('"' dchar '"') | (''' schar ''') | ('#' hex)
  class ::= code
  code ::= (capital letter? )
  capital ::= "INCL: [A-Z]"
  letter ::= "INCL: [a-z]"
  insertion ::= ('+' s string | ('#' hex)s)

Be seeing you,
norm

--
Norm Tovey-Walsh
Saxonica
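The "fake them as strings" step described in the message can be sketched roughly as follows. This is a minimal Python illustration, not the XSLT transformation the message mentions (which is not shown). The function name `fake_charset` is hypothetical, and the quoting rule is an assumption inferred from the examples in the posted grammar: use double quotes unless the text contains one, fall back to single quotes, and spell a double quote as `#22` when both occur.

```python
def fake_charset(members, exclusion=False):
    """Render an ixml character set as one W3C EBNF string literal.

    `members` holds the set members as already-rendered text,
    e.g. ['"-.·‿⁀"', 'Nd', 'Mn'] for the ixml set ["-.·‿⁀"; Nd; Mn].
    """
    label = "EXCL" if exclusion else "INCL"
    text = f"{label}: " + " | ".join(members)
    if '"' not in text:
        # No double quote inside, so a double-quoted literal is safe.
        return f'"{text}"'
    if "'" not in text:
        # Double quotes inside, no single quotes: use single quotes.
        return f"'{text}'"
    # Both quote characters occur (assumed rule): write " as #22.
    return '"' + text.replace('"', "#22") + '"'


print(fake_charset(['"-.·‿⁀"', "Nd", "Mn"]))
print(fake_charset(["Zs"]))
```

Under these assumptions the first call yields the same literal that appears in the `namefollower` rule, and the second yields the one in the `whitespace` rule.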
Received on Thursday, 1 September 2022 09:28:55 UTC