- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Wed, 14 Dec 2022 16:26:39 +0000
- To: ixml <public-ixml@w3.org>
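Here are two useful tests for checking an implementation's character classes. Each grammar just takes input and classifies every character according to the class the implementation thinks it belongs to. The first uses the one-letter classes (plus LC); the second uses the two-letter classes.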
==========
input: char*.
-char: C; L; LC; M; N; P; S; Z; Unassigned.
Unassigned: ~[C; L; LC; M; N; P; S; Z].
C: [ C].
L: [L].
LC: [LC].
M: [M].
N: [N].
P: [P].
S: [S].
Z: [Z].
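
To cross-check the results of this first grammar outside ixml, here is a minimal sketch in Python. It assumes that the general categories reported by Python's standard unicodedata module correspond to the ixml class names; the function name major_classes is made up for illustration. unicodedata follows the Unicode version bundled with the Python build, so recently assigned characters may be classified differently than by a given ixml implementation.

```python
import unicodedata

def major_classes(ch):
    """Return the one-letter class of ch, plus LC when ch is a cased letter."""
    cat = unicodedata.category(ch)    # two-letter general category, e.g. "Lu"
    classes = [cat[0]]                # the major class is the first letter
    if cat in ("Lu", "Ll", "Lt"):     # LC (cased letter) groups Lu, Ll and Lt
        classes.append("LC")
    return classes

if __name__ == "__main__":
    for ch in " A0智":
        print(repr(ch), major_classes(ch))
```

A cased letter such as A falls under both L and LC, so with the grammar above an implementation may well report such input as ambiguous.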
=========
input: char*.
-char: {C;} Cc; Cf; Cn; Co; Cs; {L;} {LC;} Ll; Lm; Lo; Lt; Lu; {M;} Mc; Me; Mn; {N;} Nd; Nl; No; {P;} Pc; Pd; Pe; Pf; Pi; Po; Ps; {S;} Sc; Sk; Sm; So; {Z;} Zl; Zp; Zs; Unassigned.
Unassigned: ~[{C;} Cc; Cf; Cn; Co; Cs; {L;} {LC;} Ll; Lm; Lo; Lt; Lu; {M;} Mc; Me; Mn; {N;} Nd; Nl; No; {P;} Pc; Pd; Pe; Pf; Pi; Po; Ps; {S;} Sc; Sk; Sm; So; {Z;} Zl; Zp; Zs].
{ C: [ C].}
Cc: [Cc].
Cf: [Cf].
Cn: [Cn].
Co: [Co].
Cs: [Cs].
{ L: [L].}
{ LC: [LC].}
Ll: [Ll].
Lm: [Lm].
Lo: [Lo].
Lt: [Lt].
Lu: [Lu].
{ M: [M].}
Mc: [Mc].
Me: [Me].
Mn: [Mn].
{ N: [N].}
Nd: [Nd].
Nl: [Nl].
No: [No].
{ P: [P].}
Pc: [Pc].
Pd: [Pd].
Pe: [Pe].
Pf: [Pf].
Pi: [Pi].
Po: [Po].
Ps: [Ps].
{ S: [S].}
Sc: [Sc].
Sk: [Sk].
Sm: [Sm].
So: [So].
{ Z: [Z].}
Zl: [Zl].
Zp: [Zp].
Zs: [Zs].
===============
Example input: !"#$%&'()*+,-./0:;<=>?@A[\]^_`a{|}~á×÷ΛЮԱאا智取威虎山→DŽDždžſμµ
Output:
<input>
<Zs> </Zs>
<Po>!</Po>
<Po>"</Po>
<Po>#</Po>
<Sc>$</Sc>
<Po>%</Po>
<Po>&amp;</Po>
<Po>'</Po>
<Ps>(</Ps>
<Pe>)</Pe>
<Po>*</Po>
<Sm>+</Sm>
<Po>,</Po>
<Pd>-</Pd>
<Po>.</Po>
<Po>/</Po>
<Nd>0</Nd>
<Po>:</Po>
<Po>;</Po>
<Sm>&lt;</Sm>
<Sm>=</Sm>
<Sm>></Sm>
<Po>?</Po>
<Po>@</Po>
<Lu>A</Lu>
<Ps>[</Ps>
<Po>\</Po>
<Pe>]</Pe>
<Sk>^</Sk>
<Pc>_</Pc>
<Sk>`</Sk>
<Ll>a</Ll>
<Ps>{</Ps>
<Sm>|</Sm>
<Pe>}</Pe>
<Sm>~</Sm>
<Ll>á</Ll>
<Sm>×</Sm>
<Sm>÷</Sm>
<Lu>Λ</Lu>
<Lu>Ю</Lu>
<Lu>Ա</Lu>
<Lo>א</Lo>
<Lo>ا</Lo>
<Lo>智</Lo>
<Lo>取</Lo>
<Lo>威</Lo>
<Lo>虎</Lo>
<Lo>山</Lo>
<Sm>→</Sm>
<Lu>DŽ</Lu>
<Lt>Dž</Lt>
<Ll>dž</Ll>
<Ll>ſ</Ll>
<Ll>μ</Ll>
<Ll>µ</Ll>
<Cc>
</Cc>
</input>
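
For comparison, output in this shape can be generated independently of any ixml implementation. The following is a rough sketch in Python, assuming that the standard unicodedata module's two-letter general categories match the ixml class names used in the second grammar; the function name classify is hypothetical. Results depend on the Unicode version shipped with the Python build, and characters that are not legal in XML (most control characters) are not treated specially here.

```python
import unicodedata
from xml.sax.saxutils import escape

def classify(text):
    """Build XML in the shape shown above: one element per character,
    named after that character's two-letter general category."""
    lines = ["<input>"]
    for ch in text:
        cat = unicodedata.category(ch)   # e.g. "Po", "Lu", "Zs"; "Cn" if unassigned
        # escape() turns &, < and > into entities so the result stays well-formed
        lines.append(f"<{cat}>{escape(ch)}</{cat}>")
    lines.append("</input>")
    return "\n".join(lines)

if __name__ == "__main__":
    print(classify(" !$(A→\n"))
```

Differences between this sketch and an implementation's output are not necessarily bugs in the implementation; they may simply reflect different Unicode versions.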
Received on Wednesday, 14 December 2022 16:26:55 UTC