- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Sat, 20 Aug 2022 16:07:43 +0000
- To: ixml <public-ixml@w3.org>
- Message-Id: <1661010295946.893633420.1772228940@cwi.nl>
Ugh. I was making a test to see which class a character is in:
input: char*.
-char: {C;} Cc; Cf; Cn; Co; Cs; {L;} {LC;} Ll; Lm; Lo; Lt; Lu; {M;} Mc; Me; Mn; {N;} Nd; Nl; No; {P;} Pc; Pd; Pe; Pf; Pi; Po; Ps; {S;} Sc; Sk; Sm; So; {Z;} Zl; Zp; Zs; Unassigned.
Unassigned: ~[{C;} Cc; Cf; Cn; Co; Cs; {L;} {LC;} Ll; Lm; Lo; Lt; Lu; {M;} Mc; Me; Mn; {N;} Nd; Nl; No; {P;} Pc; Pd; Pe; Pf; Pi; Po; Ps; {S;} Sc; Sk; Sm; So; {Z;} Zl; Zp; Zs].
{ C: [ C].}
Cc: [Cc].
Cf: [Cf].
Cn: [Cn].
Co: [Co].
Cs: [Cs].
etc.
For instance, the input (which starts with a space)
!"#$%&'()*+,-./0:;<=>?@A[\]^_`a{|}~á×
gives
<input>
<Zs> </Zs>
<Po>!</Po>
<Po>"</Po>
<Po>#</Po>
<Sc>$</Sc>
<Po>%</Po>
<Po>&</Po>
<Po>'</Po>
<Ps>(</Ps>
<Pe>)</Pe>
<Po>*</Po>
<Sm>+</Sm>
<Po>,</Po>
<Pd>-</Pd>
<Po>.</Po>
<Po>/</Po>
<Nd>0</Nd>
<Po>:</Po>
<Po>;</Po>
<Sm><</Sm>
<Sm>=</Sm>
<Sm>></Sm>
<Po>?</Po>
<Po>@</Po>
<Lu>A</Lu>
<Ps>[</Ps>
<Po>\</Po>
<Pe>]</Pe>
<Sk>^</Sk>
<Pc>_</Pc>
<Sk>`</Sk>
<Ll>a</Ll>
<Ps>{</Ps>
<Sm>|</Sm>
<Pe>}</Pe>
<Sm>~</Sm>
<Ll>á</Ll>
<Unassigned>×</Unassigned>
</input>
This was to test my growing set of Unicode classes, and to spot mistakes.
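One way to cross-check output like this is against an independent Unicode table. The following is a minimal sketch in Python, not part of the ixml test itself: it assumes Python's standard `unicodedata` module as the reference, and the helper name `classify` is made up for illustration. It prints the same element-per-character shape as the output above, so the two can be compared by eye or with a diff.

```python
import unicodedata


def classify(text):
    """Return (character, general category) pairs, one per character."""
    return [(ch, unicodedata.category(ch)) for ch in text]


# Same sample as above, with the leading space; the last two characters
# are written as escapes (U+00E1 and U+00D7) to avoid encoding surprises.
sample = " !\"#$%&'()*+,-./0:;<=>?@A[\\]^_`a{|}~\u00e1\u00d7"

for ch, cat in classify(sample):
    # Mimic the element-per-character layout of the ixml output.
    print(f"<{cat}>{ch}</{cat}>")
```

In Python's tables U+00D7 (×) comes out as Sm, so a comparison of this kind would flag the `Unassigned` result for it above as worth a second look.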
Alas! I had completely failed to see in the past that there is a class LC! And the ixml rule for a class is:
-class: code.
@code: capital, letter?.
-capital: ["A"-"Z"].
-letter: ["a"-"z"].
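Thus, our first bug: a code is one capital optionally followed by one lowercase letter, so LC, with its two capitals, cannot be written.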
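The mismatch can be checked mechanically. This is a minimal sketch in Python, assuming the `code` rule above can be transcribed as the regular expression `[A-Z][a-z]?`; the names `CODE_RULE` and `is_valid_class_name` are made up for illustration and are not part of ixml.

```python
import re

# Assumed regex transcription of:  @code: capital, letter?.
# capital = ["A"-"Z"], letter = ["a"-"z"]
CODE_RULE = re.compile(r"[A-Z][a-z]?")


def is_valid_class_name(name):
    """True if the whole name matches the transcribed `code` rule."""
    return CODE_RULE.fullmatch(name) is not None


for name in ["L", "Lu", "Zs", "LC"]:
    print(name, is_valid_class_name(name))
```

Under this transcription the one-letter and capital-plus-lowercase names are accepted, while `LC` is rejected because its second character is a capital.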
Steven
Received on Saturday, 20 August 2022 16:08:01 UTC