- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Wed, 14 Dec 2022 16:26:39 +0000
- To: ixml <public-ixml@w3.org>
Here are two useful tests to check an implementation's character classes. Each one just takes input and classifies every character according to the class the implementation thinks it belongs to. The first uses the major classes; the second uses the two-letter subclasses, with the major classes commented out.

==========
input: char*.
-char: C; L; LC; M; N; P; S; Z; Unassigned.
Unassigned: ~[C; L; LC; M; N; P; S; Z].
C: [ C].
L: [L].
LC: [LC].
M: [M].
N: [N].
P: [P].
S: [S].
Z: [Z].
=========
input: char*.
-char: {C;} Cc; Cf; Cn; Co; Cs;
       {L;} {LC;} Ll; Lm; Lo; Lt; Lu;
       {M;} Mc; Me; Mn;
       {N;} Nd; Nl; No;
       {P;} Pc; Pd; Pe; Pf; Pi; Po; Ps;
       {S;} Sc; Sk; Sm; So;
       {Z;} Zl; Zp; Zs;
       Unassigned.
Unassigned: ~[{C;} Cc; Cf; Cn; Co; Cs;
              {L;} {LC;} Ll; Lm; Lo; Lt; Lu;
              {M;} Mc; Me; Mn;
              {N;} Nd; Nl; No;
              {P;} Pc; Pd; Pe; Pf; Pi; Po; Ps;
              {S;} Sc; Sk; Sm; So;
              {Z;} Zl; Zp; Zs].
{ C: [ C].}
Cc: [Cc].
Cf: [Cf].
Cn: [Cn].
Co: [Co].
Cs: [Cs].
{ L: [L].}
{ LC: [LC].}
Ll: [Ll].
Lm: [Lm].
Lo: [Lo].
Lt: [Lt].
Lu: [Lu].
{ M: [M].}
Mc: [Mc].
Me: [Me].
Mn: [Mn].
{ N: [N].}
Nd: [Nd].
Nl: [Nl].
No: [No].
{ P: [P].}
Pc: [Pc].
Pd: [Pd].
Pe: [Pe].
Pf: [Pf].
Pi: [Pi].
Po: [Po].
Ps: [Ps].
{ S: [S].}
Sc: [Sc].
Sk: [Sk].
Sm: [Sm].
So: [So].
{ Z: [Z].}
Zl: [Zl].
Zp: [Zp].
Zs: [Zs].
===============

Example input:
 !"#$%&'()*+,-./0:;<=>?@A[\]^_`a{|}~á×÷ΛЮԱאا智取威虎山→DŽDždžſμµ

Output:
<input>
   <Zs> </Zs>
   <Po>!</Po>
   <Po>"</Po>
   <Po>#</Po>
   <Sc>$</Sc>
   <Po>%</Po>
   <Po>&</Po>
   <Po>'</Po>
   <Ps>(</Ps>
   <Pe>)</Pe>
   <Po>*</Po>
   <Sm>+</Sm>
   <Po>,</Po>
   <Pd>-</Pd>
   <Po>.</Po>
   <Po>/</Po>
   <Nd>0</Nd>
   <Po>:</Po>
   <Po>;</Po>
   <Sm><</Sm>
   <Sm>=</Sm>
   <Sm>></Sm>
   <Po>?</Po>
   <Po>@</Po>
   <Lu>A</Lu>
   <Ps>[</Ps>
   <Po>\</Po>
   <Pe>]</Pe>
   <Sk>^</Sk>
   <Pc>_</Pc>
   <Sk>`</Sk>
   <Ll>a</Ll>
   <Ps>{</Ps>
   <Sm>|</Sm>
   <Pe>}</Pe>
   <Sm>~</Sm>
   <Ll>á</Ll>
   <Sm>×</Sm>
   <Sm>÷</Sm>
   <Lu>Λ</Lu>
   <Lu>Ю</Lu>
   <Lu>Ա</Lu>
   <Lo>א</Lo>
   <Lo>ا</Lo>
   <Lo>智</Lo>
   <Lo>取</Lo>
   <Lo>威</Lo>
   <Lo>虎</Lo>
   <Lo>山</Lo>
   <Sm>→</Sm>
   <Lu>DŽ</Lu>
   <Lt>Dž</Lt>
   <Ll>dž</Ll>
   <Ll>ſ</Ll>
   <Ll>μ</Ll>
   <Ll>µ</Ll>
   <Cc>
</Cc>
</input>
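For anyone who wants to cross-check the expected output without relying on an ixml implementation, the following is a minimal sketch in Python, not part of the tests themselves. It assumes the standard `unicodedata` module, whose answers depend on the Unicode version bundled with the interpreter; the helper names `classify` and `to_xml` are made up for illustration. Unlike the output shown above, it escapes `&`, `<` and `>` in element content, and Python reports unassigned code points as `Cn`, so nothing here corresponds to the `Unassigned` fallback rule.

```python
import unicodedata
from xml.sax.saxutils import escape


def classify(text):
    """Return a (general category, character) pair for every character."""
    return [(unicodedata.category(ch), ch) for ch in text]


def to_xml(text):
    """Render the classification in roughly the shape of the ixml output."""
    lines = ["<input>"]
    for category, ch in classify(text):
        # escape() replaces &, < and > so the result is well-formed XML.
        lines.append(f"   <{category}>{escape(ch)}</{category}>")
    lines.append("</input>")
    return "\n".join(lines)


if __name__ == "__main__":
    # A few characters from the example input: the three forms of the
    # DZ-with-caron digraph (U+01C4 to U+01C6) and a final newline.
    sample = " !$(+-0A^_a\u01c4\u01c5\u01c6\n"
    print(to_xml(sample))
```

Any character where an ixml processor and this sketch disagree is worth a closer look, although a disagreement may only mean that the two were built against different Unicode versions.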
Received on Wednesday, 14 December 2022 16:26:55 UTC