- From: Addison Phillips via GitHub <sysbot+gh@w3.org>
- Date: Sat, 28 Jan 2017 00:05:58 +0000
- To: public-i18n-archive@w3.org
aphillips has just created a new issue for https://github.com/w3c/charmod-norm:

== which characters, exactly, should be removed in the matching algorithm? ==

In the latest version of the matching algorithm, I noticed that the advice is to "remove Unicode controls", but this was non-specific and linked to the section on invisibles. I created the following list and also had a question about whether this was complete or correct:

> Issue 1
>
> What to do about non-breaking space and other space characters? Is this the full list? What about the Mongolian characters?
> Remove all of the following invisible Unicode characters:
>
> ZWJ, ZWNJ
> Variation Selectors (FE00..FE0F)
> COMBINING GRAPHEME JOINER 034F
> SOFT HYPHEN 00AD
> ZERO WIDTH SPACE 200B
> Bidi controls

Please view or discuss this issue at https://github.com/w3c/charmod-norm/issues/117 using your GitHub account
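
For illustration only, the removal step quoted in the issue could be sketched in Python roughly as follows. The names `REMOVABLE` and `strip_invisibles` are hypothetical and do not come from the specification. The expansion of "Bidi controls" into specific code points is an assumption based on the Unicode Bidi_Control property, since the issue does not enumerate them. Non-breaking space, other space characters, and the Mongolian characters are deliberately left untouched, because the issue leaves those questions open.

```python
import re

# Hypothetical sketch: one character class covering the list quoted in the issue.
# The issue itself asks whether this list is complete or correct.
REMOVABLE = re.compile(
    "["
    "\u200D\u200C"   # ZWJ, ZWNJ
    "\uFE00-\uFE0F"  # Variation Selectors (FE00..FE0F)
    "\u034F"         # COMBINING GRAPHEME JOINER
    "\u00AD"         # SOFT HYPHEN
    "\u200B"         # ZERO WIDTH SPACE
    # Assumed expansion of "Bidi controls" (Unicode Bidi_Control property):
    "\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069"
    "]"
)


def strip_invisibles(text: str) -> str:
    """Remove the invisible characters listed above before matching."""
    return REMOVABLE.sub("", text)
```

Under these assumptions, `strip_invisibles("co\u00ADoperate")` yields `"cooperate"`, while a string containing U+00A0 NO-BREAK SPACE is returned unchanged.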
Received on Saturday, 28 January 2017 00:06:04 UTC