Unassigned characters

 > Unassigned: ~[C; L; LC; M; N; P; S; Z].

It occurred to me that Unicode has a class Cn "Unassigned". Unsurprisingly, 
there are no characters in the Unicode database with this class.

So presumably if we had a grammar

input: char*.
char: assigned; unassigned.
-assigned: -~[Cn].
unassigned: [Cn].

this should output only those characters in the input that are not assigned
Unicode characters, since the - marks suppress everything matched by assigned.
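
As a rough cross-check of what such a grammar would be expected to keep, here is a minimal Python sketch. It is not ixml and not part of any ixml implementation; the function name unassigned_chars is made up for illustration. It assumes Python's standard unicodedata module, which reports general category Cn for code points with no entry in its bundled copy of the Unicode database (this includes noncharacters such as U+FFFF), so results for reserved code points can vary with the Unicode version that ships with the interpreter.

```python
import unicodedata


def unassigned_chars(text: str) -> list[str]:
    """Return the characters of text whose general category is Cn.

    Hypothetical helper: mirrors the intent of the 'unassigned: [Cn].'
    rule by keeping Cn characters and dropping everything else.
    """
    return [ch for ch in text if unicodedata.category(ch) == "Cn"]


# U+FFFF is a noncharacter, which Unicode classifies as Cn.
sample = "a\uffffb"
print([f"U+{ord(ch):04X}" for ch in unassigned_chars(sample)])
# prints ['U+FFFF']
```

The letters a and b are dropped and only U+FFFF survives, which is the behaviour the grammar above is presumed to have.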

For discussion at a call sometime.


Received on Thursday, 15 December 2022 10:41:00 UTC