- From: Norm Tovey-Walsh <norm@saxonica.com>
- Date: Thu, 01 Sep 2022 14:11:53 +0100
- To: Steven Pemberton <steven.pemberton@cwi.nl>
- Cc: public-ixml@w3.org
- Message-ID: <m2v8q7i5y1.fsf@saxonica.com>
Steven Pemberton <steven.pemberton@cwi.nl> writes:

> I see some errors, but it's a nice start!
> I think ONEOF is more readable than INCL, and NONEOF for EXCL

Okay. I’m still trying to decide if I think it’s worth trying to get
better support in the RR generator.

> Something goes wrong with repeat0 and repeat1

Fixed in my follow-up post, or still wrong?

> Something goes wrong with single quotes (see string as an
> example).

Yes. They seem to disappear entirely.

> You could split individual members of a character set, so that
> ["0"-"9"; "a"-"f"; "A"-"F"] appears as INCL "0"-"9"; INCL
> "a"-"f"; INCL "A"-"F". This might give a better looking
> diagram.

Perhaps. I should probably put quotes into the ranges too.

> The rule for insertion is wrong.

Indeed. Failed to put parens around the alts.

Here’s a better version:

ixml ::= (s prolog? rule (RS rule)* s)
s ::= ( (whitespace | comment))*
RS ::= ( (whitespace | comment))+
whitespace ::= "INCL: Zs" | tab | lf | cr
tab ::= '#x9'
lf ::= '#xa'
cr ::= '#xd'
comment ::= ("{" ( (cchar | comment))* "}")
cchar ::= "EXCL: {}"
prolog ::= (version s)
version ::= ("ixml" RS "version" RS string s ".")
rule ::= (( ((mark s)))? name s "INCL: =:" s alts ".")
mark ::= "INCL: @^-"
alts ::= alt ( (("INCL: ;|" s)) alt)*
alt ::= (term ( (("," s)) term)*)?
term ::= factor | option | repeat0 | repeat1
factor ::= terminal | nonterminal | insertion | ("(" s alts ")" s)
repeat0 ::= (factor (("*" s) | ("**" s sep)))
repeat1 ::= (factor (("+" s) | ("++" s sep)))
option ::= (factor "?" s)
sep ::= factor
nonterminal ::= (( ((mark s)))? name s)
name ::= (namestart (namefollower)* )
namestart ::= "INCL: _ | L"
namefollower ::= namestart | "INCL: -.·‿⁀ | Nd | Mn"
terminal ::= literal | charset
literal ::= quoted | encoded
quoted ::= (( ((tmark s)))? string s)
tmark ::= "INCL: ^-"
string ::= ('"' (dchar)+ '"') | ("'" (schar)+ "'")
dchar ::= 'EXCL: " | #xa | #xd' | ('"' '"')
schar ::= "EXCL: ' | #xa | #xd" | ("'" "'")
encoded ::= (( ((tmark s)))? "#" hex s)
hex ::= ('INCL: ["0"-"9"] | ["a"-"f"] | ["A"-"F"]')+
charset ::= inclusion | exclusion
inclusion ::= (( ((tmark s)))? set)
exclusion ::= (( ((tmark s)))? "~" s set)
set ::= ("[" s( ((member s)) ( (("INCL: ;|" s)) ((member s)))*)? "]" s)
member ::= string | ("#" hex) | range | class
range ::= (from s "-" s to)
from ::= character
to ::= character
character ::= ('"' dchar '"') | ("'" schar "'") | ("#" hex)
class ::= code
code ::= (capital letter? )
capital ::= 'INCL: ["A"-"Z"]'
letter ::= 'INCL: ["a"-"z"]'
insertion ::= ("+" s (string | ("#" hex))s)

                                        Be seeing you,
                                          norm

--
Norm Tovey-Walsh
Saxonica
Received on Thursday, 1 September 2022 13:24:09 UTC