- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Sun, 27 Aug 2023 11:44:57 -0600
- To: John Lumley <john@saxonica.com>
- Cc: Norm Tovey-Walsh <norm@saxonica.com>, "Liam R. E. Quin" <liam@fromoldbooks.org>, graydonish@gmail.com, public-ixml@w3.org
John Lumley <john@saxonica.com> writes:

> Sent from my iPad
>
>> On 27 Aug 2023, at 10:20, Norm Tovey-Walsh <norm@saxonica.com> wrote:
>>
>> Just using () helps:
>>
>> rule: name, "=", value; () .
>>
>> And you can certainly define your own terminal “empty”, but that won’t
>> help with examples like the one that started this thread where the
>> author chose not to do that.
>
> I think this is the simplest and most obvious. We could even modify
> the syntax (non-backwards-compatible) to mandate the use of an empty
> bracket pair in place of a whitespace-only alt…

While I'm still a little worried about bells, whistles, and slippery
slopes, John has touched my puzzle-solving button successfully and my
"How would you do that?" instinct has kicked in.

What would we need to make this possible? I think it's just:

  1 Change

        alt: term**(-",", s).

    to:

        alt: term++(-",", s).

  2 Change

        -term: factor; option; repeat0; repeat1.

    to

        -term: factor; option; repeat0; repeat1; empty-sequence.
        -empty-sequence: '()'.

  3 Add a prose specification that empty-sequence, i.e. '()', matches
    the empty sequence in the input string.

I dislike having to define () as magic in step 3.

................

A magic-free alternative would be

  1 Change

        alt: term**(-",", s).

    to:

        alt: term++(-",", s).

  2 Observe in the prose that the empty string can be matched by
    writing []?

Something in the simplicity of this appeals to me, but I don't think
anyone particularly wants to write []? to denote the empty string.

If it is not obvious why the term []? has the required meaning, I
recommend it as a puzzle for a lazy Sunday afternoon or evening.

................

Oh, wait! Another magic-free alternative has just occurred to me:

  1 Change

        alt: term**(-",", s).

    to:

        alt: term++(-",", s).

  2 Change

        -factor: terminal; nonterminal; insertion; -"(", s, alts, -")", s.

    to

        -factor: terminal; nonterminal; insertion; -"(", s, alts?, -")", s.

    or (if we wanted to forbid whitespace and comments between the
    parentheses) to

        -factor: terminal; nonterminal; insertion; -"(", (s, alts)?, -")", s.

  3 Note in the prose that as a consequence of the grammar rules (and
    in a change from the 1.0 spec), a term matching the empty string
    cannot be written using the empty string but must be written
    explicitly in the grammar, for example using ().

................

We should bear in mind that we can define explicit syntax like 'ε' or
'EMPTY-STRING' or '$empty_string' to mean the language consisting of
the empty string, but we cannot require people to use it, since there
is no way to prevent people from finding other ways to express the same
idea.

But I think we can successfully define the grammar so that the empty
string in an ixml grammar cannot be used to match the empty string in
the input string.

Michael

--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
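
For readers who would rather check the []? puzzle mechanically, here is a minimal sketch in Python. It is not ixml and not the API of any ixml processor: the helpers charset, option, seq, alts and matches are hypothetical names invented for this illustration, and treating () as a sequence of zero terms is an assumption about how the proposed syntax would be read. The point it illustrates is that an empty character set can never consume a character, so making it optional leaves the empty string as the only possible match.

```python
from typing import Callable, Set

# A matcher maps (text, start position) to the set of end positions it can reach.
Matcher = Callable[[str, int], Set[int]]


def charset(chars: str) -> Matcher:
    """Match exactly one character drawn from `chars` (models ixml's [...])."""
    def m(text: str, pos: int) -> Set[int]:
        if pos < len(text) and text[pos] in chars:
            return {pos + 1}
        return set()  # an empty `chars` always ends up here
    return m


def option(inner: Matcher) -> Matcher:
    """Match `inner` or nothing at all (models ixml's `?`)."""
    def m(text: str, pos: int) -> Set[int]:
        return {pos} | inner(text, pos)
    return m


def seq(*parts: Matcher) -> Matcher:
    """Match each part in order; with zero parts it matches the empty string."""
    def m(text: str, pos: int) -> Set[int]:
        ends = {pos}
        for part in parts:
            ends = {e for p in ends for e in part(text, p)}
        return ends
    return m


def alts(*choices: Matcher) -> Matcher:
    """Match any one of the alternatives (models ixml's `;`)."""
    def m(text: str, pos: int) -> Set[int]:
        result: Set[int] = set()
        for choice in choices:
            result |= choice(text, pos)
        return result
    return m


def matches(m: Matcher, text: str) -> bool:
    """True if `m` can consume the whole of `text`."""
    return len(text) in m(text, 0)


# Models []? : an optional occurrence of the empty character set.
empty_via_charset = option(charset(""))

# Models  rule: name, "=", value; () .  with a one-letter name and a
# one-digit value (simplifications made up for this sketch).
rule = alts(
    seq(charset("abc"), charset("="), charset("0123456789")),
    seq(),  # assumption: () is read as a sequence of zero terms
)

print(matches(empty_via_charset, ""))   # True
print(matches(empty_via_charset, "a"))  # False
print(matches(rule, "a=1"))             # True
print(matches(rule, ""))                # True
```

Returning a set of end positions rather than a single position keeps the sketch honest about ambiguity, which is the normal situation for ixml grammars, without needing any backtracking machinery.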
Received on Sunday, 27 August 2023 18:18:06 UTC