- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Sat, 24 Jun 2023 13:37:53 +0000
- To: ixml <public-ixml@w3.org>
- Message-Id: <1687613524840.1400920388.3110888934@cwi.nl>
REQUIREMENT

To rename rules on serialisation.

BACKGROUND

By definition, a rule accepts a single syntax on input. Thus different syntaxes have different element serialisations. However, there are use cases where you would like to rename a rule on serialisation.

The simplest example is with different date formats:

    31 December 1999
    1999-12-31

In this case the year and day have the same syntax, but the two forms have different month syntaxes, so on serialisation only one of them can be called 'month'. For instance:

    dates: s?, date**s, s?.
    date: day, -" ", month, -" ", year;
          year, -"-", nmonth, -"-", day.
    day: d, d?.
    year: d, d, d, d.
    month: -"January", +"01";
           -"February", +"02";
           -"March", +"03";
           -"April", +"04";
           -"May", +"05";
           -"June", +"06";
           -"July", +"07";
           -"August", +"08";
           -"September", +"09";
           -"October", +"10";
           -"November", +"11";
           -"December", +"12".
    nmonth: "0"?, d; "1", ["0"-"2"].
    -d: ["0"-"9"].
    -s: -[" "; #9; #a; #d]+.

The input

    31 December 1999
    1999-12-31

gives:

    <dates>
       <date>
          <day>31</day>
          <month>12</month>
          <year>1999</year>
       </date>
       <date>
          <year>1999</year>
          <nmonth>12</nmonth>
          <day>31</day>
       </date>
    </dates>

The requirement is for a notation that says "The rule name is X, but on serialisation it should be called Y". This applies to nonterminals, both elements and attributes, and should be usable on both definition and use.

POSSIBLE SYNTAXES

Note that for 'renaming' of terminals, as with months above, you have the pattern

    -"May", +"05"

One possibility might be:

    -nmonth+month: "0"?, d; "1", ["0"-"2"].

but that leading "-" is misleading, because the rule *will* be serialised, and it doesn't generalise to attributes.

The following visually suggests a renaming:

    nmonth>month: "0"?, d; "1", ["0"-"2"].

Or

    nmonth^month: "0"?, d; "1", ["0"-"2"].

but this latter one doesn't work well for attributes, unless we used a different renaming operator for those (which I think is overkill).
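Whichever notation is chosen, the intended effect on serialisation can be sketched in a few lines of Python. This is only an illustration, not how any ixml processor works: the tuple-based parse tree, the RENAMES table and the serialise function are all hypothetical names made up for the sketch.

```python
from xml.sax.saxutils import escape

# Hypothetical rename table: rule name -> name to use on serialisation.
RENAMES = {"nmonth": "month"}


def serialise(node, renames=RENAMES):
    """Serialise a parse tree to XML, applying any renames.

    A node is either a string (matched text) or a tuple
    (rule_name, children). This tree shape is an assumption of the sketch.
    """
    if isinstance(node, str):
        return escape(node)
    name, children = node
    tag = renames.get(name, name)  # fall back to the rule's own name
    body = "".join(serialise(child, renames) for child in children)
    return f"<{tag}>{body}</{tag}>"


# Parse tree for the second date form, 1999-12-31.
date = ("date", [("year", ["1999"]), ("nmonth", ["12"]), ("day", ["31"])])
print(serialise(date))
# prints <date><year>1999</year><month>12</month><day>31</day></date>
```

The rule is still matched under the name nmonth; only the element name written out changes.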
So my current preference falls to ">" to represent a renaming:

    rule: (mark, s)?, naming, -["=:"], s, -alts, -".".
    nonterminal: (mark, s)?, naming.
    -naming: name, s, (">", s, rename, s)?.
    @name: namestart, namefollower*.
    @rename: name.

Or, simplifying by factoring the mark into the naming rule:

    rule: naming, -["=:"], s, -alts, -".".
    nonterminal: naming.
    -naming: (mark, s)?, name, s, (">", s, rename, s)?.
    @name: namestart, namefollower*.
    @rename: name.

This means that the serialised version of the grammar remains the same, except that <rule> and <nonterminal> can now also carry a @rename:

    <rule name="nmonth" rename="month">

ROUNDTRIPPING

In passing, it is worth noting that although we haven't yet addressed roundtripping, this requirement does potentially introduce ambiguity in returning from the XML serialisation to the original form. In the dates example above the two forms can be distinguished by the order of day, month, and year, but in the general case there can be two parses:

    <month>05</month>

could be roundtripped as

    05

or as

    May

Steven
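The roundtripping ambiguity can be made concrete with a small Python sketch. It assumes the dates grammar with nmonth renamed to month, and it hard-codes the two month rules rather than deriving them from the grammar; candidate_inputs and MONTHS are hypothetical names used only for illustration.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def candidate_inputs(text):
    """List the input strings that could have produced <month>text</month>."""
    candidates = []
    # month: the name is deleted and a two-digit number inserted,
    # e.g. -"May", +"05".
    if re.fullmatch(r"0[1-9]|1[0-2]", text):
        candidates.append(MONTHS[int(text) - 1])
    # nmonth (serialised as month): the matched digits are kept as they are.
    if re.fullmatch(r"0?[0-9]|1[0-2]", text):
        candidates.append(text)
    return candidates


print(candidate_inputs("05"))  # prints ['May', '05']
print(candidate_inputs("5"))   # prints ['5']
```

A two-digit value has two possible origins, so the element content alone cannot decide between them; some surrounding context, such as the order of day, month and year, is needed.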
Received on Saturday, 24 June 2023 13:38:00 UTC