- From: asmusf via GitHub <sysbot+gh@w3.org>
- Date: Sun, 29 Jan 2017 00:40:52 +0000
- To: public-i18n-archive@w3.org
The section states "In almost all of these cases, users may not be aware of or cannot be sure if a given document or text string has included or omitted one of these characters."

This is only partially true. While users may not be aware of the presence of these characters, when they do affect the layout of a word, that difference is certainly visible to a user. That's why IDNA2008 has context rules for ZWJ / ZWNJ that attempt to distinguish cases where the joiners have no effect (or only an optional effect) from cases where they do. For example, adding a non-joiner where two characters would have joined makes a clear visual difference; doing the same thing between two characters that wouldn't have joined anyway remains undetectable. The rules also treat conjunct formation as making these characters' effect visible; that is, their presence is allowed in those locations in an IDNA2008 identifier (which makes them meaningful for matching).

SHY is an example of a character which is always invisible _*except*_ when it causes the word to break across a line. There's a much stronger case for filtering it unconditionally in matching.

Variation selectors are more akin to font selection; if no suitable glyphs exist, they have no visible effect. Removal seems fine.

ZWSP and WJ (the replacement for ZWNBSP, aka BOM) are like SHY: they have no visible effect except in line breaking or segmentation. Removal seems fine.

And so on.

ZWJ for conjunc

-- 
GitHub Notification of comment by asmusf
Please view or discuss this issue at https://github.com/w3c/charmod-norm/issues/117#issuecomment-275885538 using your GitHub account
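
The distinction the comment draws, between characters that could be filtered unconditionally before matching (SHY, ZWSP, WJ, variation selectors) and joiners whose effect depends on context (ZWJ, ZWNJ), might be sketched roughly as follows. This is only an illustration under stated assumptions: the names `ALWAYS_IGNORABLE`, `is_variation_selector`, and `strip_for_matching` are hypothetical, the character set is not a normative list, and including U+FEFF alongside WJ is an assumption based on the comment's mention of ZWNBSP. The IDNA2008 context rules for joiners are not implemented here; ZWJ and ZWNJ are simply left in place.

```python
# Characters with no visible effect outside line breaking or segmentation.
# Assumption: an illustrative set, not a normative list.
ALWAYS_IGNORABLE = {
    "\u00AD",  # SOFT HYPHEN (SHY)
    "\u200B",  # ZERO WIDTH SPACE (ZWSP)
    "\u2060",  # WORD JOINER (WJ)
    "\uFEFF",  # ZERO WIDTH NO-BREAK SPACE (ZWNBSP, aka BOM); assumed here
}


def is_variation_selector(ch: str) -> bool:
    """True for VS1..VS16 (U+FE00..U+FE0F) and VS17..VS256 (U+E0100..U+E01EF)."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def strip_for_matching(text: str) -> str:
    """Drop unconditionally ignorable characters; keep ZWJ (U+200D) and ZWNJ (U+200C).

    The joiners are kept because their effect can be visible, so deciding
    whether to ignore them would need context rules not modelled here.
    """
    return "".join(
        ch
        for ch in text
        if ch not in ALWAYS_IGNORABLE and not is_variation_selector(ch)
    )
```

With this sketch, a soft hyphen inside a word is removed before comparison, while a ZWNJ between two Arabic-script letters survives, so the two strings would still compare as different.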
Received on Sunday, 29 January 2017 00:40:58 UTC