[charmod-norm] Function of WJ

Richard57 has just created a new issue for https://github.com/w3c/charmod-norm:

== Function of WJ ==
Section 2.5 says, "The Word Joiner is used to separate words in languages that do not use explicit spacing. An example would be the Thai language."

Where is the evidence that U+2060 is used to *separate* words? TUS states that U+2060 WORD JOINER and U+FEFF ZWNBSP should be used only to control line breaking; this excludes any role in determining word boundaries.

The text was different in earlier versions. For example, TUS 7.0 Section 23.2 says, 'U+2060 WORD JOINER behaves like U+00A0 NO-BREAK SPACE in that it indicates the absence of word boundaries; however _word joiner_ has no width'. That wording encouraged the use of WJ to indicate the absence of a word boundary in Thai, but this capability was withdrawn in Unicode 8.0.0. Implementers beware!
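
For anyone checking how this interacts with string matching, here is a minimal Python sketch. It assumes only the standard `unicodedata` module, which exposes character names, general categories and normalization but not the Line_Break or Word_Break properties, so it cannot demonstrate the line-breaking behaviour itself. The sample string and the helper names `describe` and `survives` are made up for illustration. It shows that WJ and ZWNBSP are format characters that NFC and NFKC leave in place, whereas NO-BREAK SPACE is changed by NFKC.

```python
import unicodedata

WJ = "\u2060"      # WORD JOINER
ZWNBSP = "\ufeff"  # ZERO WIDTH NO-BREAK SPACE
NBSP = "\u00a0"    # NO-BREAK SPACE

def describe(ch):
    """Return (character name, general category) for one character."""
    return unicodedata.name(ch), unicodedata.category(ch)

def survives(form, ch):
    """True if normalizing ch to the given form leaves it unchanged."""
    return unicodedata.normalize(form, ch) == ch

# Hypothetical sample: a short Thai string, with and without an inserted WJ.
plain = "ไทย"
joined = plain[0] + WJ + plain[1:]

for ch in (WJ, ZWNBSP, NBSP):
    print(f"U+{ord(ch):04X}", describe(ch),
          "NFC unchanged:", survives("NFC", ch),
          "NFKC unchanged:", survives("NFKC", ch))

# The two strings differ by one code point, and normalization does not remove it.
print(len(plain), len(joined))
print(unicodedata.normalize("NFKC", joined) == plain)
```

Under these assumptions, a string containing WJ does not compare equal to the same string without it, even after normalization, so any matching process would have to handle WJ explicitly.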


Please view or discuss this issue at https://github.com/w3c/charmod-norm/issues/170 using your GitHub account

Received on Friday, 11 May 2018 18:59:33 UTC