- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Thu, 15 Dec 2022 10:40:46 +0000
- To: ixml <public-ixml@w3.org>
Received on Thursday, 15 December 2022 10:41:00 UTC
> Unassigned: ~[C; L; LC; M; N; P; S; Z].

It occurred to me that Unicode has a class Cn "Unassigned". Unsurprisingly, there are no characters in the Unicode database with this class.

http://www.unicode.org/reports/tr44/#General_Category_Values

So presumably if we had a grammar

    input: char*.
    char: assigned; unassigned.
    -assigned: -~[Cn].
    unassigned: [Cn].

this should only output characters in the input that are not assigned Unicode characters.

For discussion at a call sometime.

Steven
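
For illustration only, the behaviour the grammar above is expected to have can be sketched outside ixml. The snippet below is a rough Python approximation, not an ixml implementation: it assumes Python's standard `unicodedata` module, which reports `Cn` for code points that have no entry in the Unicode database, and the helper name `unassigned_chars` is hypothetical. Results depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def unassigned_chars(text):
    """Return the characters of `text` whose general category is Cn.

    Mirrors the intent of the grammar: assigned characters are dropped,
    unassigned ones are kept, in input order.
    """
    return [ch for ch in text if unicodedata.category(ch) == "Cn"]


# U+0378 is a reserved (unassigned) code point in the Greek block and
# U+FFFF is a noncharacter; both are reported as Cn.
sample = "a\u0378b\uffff"
print([f"U+{ord(ch):04X}" for ch in unassigned_chars(sample)])
```

A code point that is unassigned today may be assigned in a later Unicode version, so the same input can give different results over time, in Python as in any ixml processor.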