- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
 - Date: Mon, 22 Nov 2021 10:57:38 -0700
 - To: ixml <public-ixml@w3.org>
 - Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
 
[This issue was just raised on github as issue 17, and I repeat the
message here in order to get it into the email archive.]
Either I am missing something, or an ambiguity has been introduced in
the most recent revision of the ixml grammar.  At one point, the
definition of *character* read
    -character: -'"', dchar, -'"', S;
                -"'", schar, -"'", S;
                "#", hex, S.
In the revision of 12 November 2021, it reads
    -character: -'"', dchar, -'"', s;
                -"'", schar, -"'", s;
                -"#", hex, s.
The nonterminal *character* is used in the *from* and *to* values of a
*range*.
In the ixml form of a grammar, it will always be clear whether a
character range is written using hex values or quoted strings.  You
will have, for example, ["0" - "9"] or [#0 - #9], which mean
different things.
But given the tmark of "-" on the hash mark in the third right-hand
side of *character*, I think these will both turn into `<range
from="0" to="9"/>`, which may be interpreted as meaning any character
between U+0030 and U+0039, inclusive (the conventional Indo-Arabic
decimal numerals of ASCII and other seven- and eight-bit character
sets), or any character between U+0000 and U+0009, inclusive, meaning
a range of C0 characters.
If I'm missing something here, I hope someone will explain it to me.
Otherwise, I think the simple fix is to drop the "-" from the hash
mark in the third right-hand side of *character*.
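The collapse can be sketched in a few lines of Python. This is only an illustration, not an ixml processor: the helper names `attribute_value` and `range_element` and the `hash_deleted` flag are hypothetical, and the sketch assumes that a deleted terminal contributes nothing to the attribute value while a retained "#" is serialized as part of it. It ignores doubled-quote escapes and whitespace.

```python
def attribute_value(literal: str, hash_deleted: bool) -> str:
    """Text that a *character* literal contributes to a from/to attribute."""
    if literal[0] in "\"'":
        # Quoted form: the quotes are marked "-" and are never serialized.
        return literal[1:-1]
    if literal[0] == "#":
        # Hex form: the hash survives only if it is not marked "-".
        return literal[1:] if hash_deleted else literal
    raise ValueError(f"not a character literal: {literal!r}")


def range_element(frm: str, to: str, hash_deleted: bool) -> str:
    """Serialize a range the way the XML form of a grammar would show it."""
    return '<range from="{}" to="{}"/>'.format(
        attribute_value(frm, hash_deleted),
        attribute_value(to, hash_deleted),
    )


# 12 November 2021 revision: -"#", hex, s
quoted_new = range_element('"0"', '"9"', hash_deleted=True)
hex_new = range_element("#0", "#9", hash_deleted=True)

# Proposed fix: "#", hex, s
quoted_old = range_element('"0"', '"9"', hash_deleted=False)
hex_old = range_element("#0", "#9", hash_deleted=False)

print(quoted_new, hex_new)  # identical under the revised rule
print(quoted_old, hex_old)  # distinguishable once the hash is kept
```

Under these assumptions the two ranges are indistinguishable in the revised grammar, and keeping the hash mark is enough to tell them apart again.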
-CMSMcQ
********************************************
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
cmsmcq@blackmesatech.com
http://www.blackmesatech.com
********************************************
Received on Monday, 22 November 2021 17:57:19 UTC