what does <range from="0" to="9"/> mean?

[This issue was just raised on github as issue 17, and I repeat the
message here in order to get it into the email archive.]

Either I am missing something, or an ambiguity has been introduced in
the most recent revision of the ixml grammar.  At one point, the
definition of character read

    -character: -'"', dchar, -'"', S;
                -"'", schar, -"'", S;
                "#", hex, S.

In the revision of 12 November 2021, it reads

    -character: -'"', dchar, -'"', s;
                -"'", schar, -"'", s;
                -"#", hex, s.

The nonterminal *character* is used in the *from* and *to* values of a
*range*.

In the ixml form of a grammar, it will always be clear whether a
character range is written using hex values or quoted strings.  You
will have, for example, ["0" - "9"] or [#0 - #9], which mean
different things.

But given the tmark of "-" on the hash mark in the third right-hand
side of *character*, I think these will both turn into `<range
from="0" to="9"/>`, which may be interpreted as meaning any character
between U+0030 and U+0039, inclusive (the conventional Indo-Arabic
decimal numerals of ASCII and other seven- and eight-bit character
sets), or any character between U+0000 and U+0009, inclusive, meaning
a range of C0 characters.
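
To make the collision concrete, here is a toy sketch in Python. It is
not an ixml processor: it assumes only that a terminal marked with "-"
contributes nothing to the serialized value, and that *from* and *to*
carry whatever text remains. The helper names (`serialize`, `quoted`,
`hex_char`, `range_xml`) are invented for illustration.

```python
# Toy model of serialization, under the assumptions stated above.
# A character is a list of (text, deleted) pairs; deleted=True stands
# for a terminal carrying the "-" mark.

def serialize(parts):
    """Concatenate the text of every terminal that is not deleted."""
    return "".join(text for text, deleted in parts if not deleted)

def quoted(ch):
    """Models  -'"', dchar, -'"'  (quotes are deleted in both revisions)."""
    return [('"', True), (ch, False), ('"', True)]

def hex_char(digits, hash_deleted):
    """Models  "#", hex  (hash_deleted=False, the earlier definition)
    or  -"#", hex  (hash_deleted=True, the 12 November 2021 revision)."""
    return [("#", hash_deleted), (digits, False)]

def range_xml(frm, to):
    """Build the range element from two modelled characters."""
    return '<range from="%s" to="%s"/>' % (serialize(frm), serialize(to))

# ["0" - "9"] looks the same under either revision.
quoted_range = range_xml(quoted("0"), quoted("9"))

# [#0 - #9] with the hash mark deleted (12 November 2021 revision).
hex_range_new = range_xml(hex_char("0", True), hex_char("9", True))

# [#0 - #9] with the hash mark kept (earlier definition).
hex_range_old = range_xml(hex_char("0", False), hex_char("9", False))

print(quoted_range)
print(hex_range_new)
print(hex_range_old)
```

Under these assumptions the first two results are the same string,
while the third keeps the hash mark and so stays distinguishable from
the quoted form.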

If I'm missing something here, I hope someone will explain it to me.
Otherwise, I think the simple fix is to drop the "-" from the hash
mark in the third right-hand side of *character*.

-CMSMcQ


********************************************
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
cmsmcq@blackmesatech.com
http://www.blackmesatech.com
********************************************

Received on Monday, 22 November 2021 17:57:19 UTC