- From: Addison Phillips <addison@yahoo-inc.com>
- Date: Fri, 07 Mar 2008 14:04:08 -0800
- To: ishida@w3.org
- CC: public-i18n-core@w3.org
ishida@w3.org wrote:
> Comment from the i18n review of:
> http://www.unicode.org/reports/tr29/tr29-12.html
>
> Comment 18
> At http://www.w3.org/International/reviews/0801-uax29/
> Editorial/substantive: E
> Tracked by: AP
>
> Location in reviewed document:
> 4 [http://www.unicode.org/reports/tr29/tr29-12.html#Word_Boundaries]
>
> Comment:
> All of the examples use space-separated languages. No mention is made of the fact that some languages don't use spaces between words, which we think is an extremely important point to make. It should be explicitly mentioned here and possibly an example given.
>

In reviewing the text, this note seems to address this comment:

--
For Thai, Lao, Khmer, Myanmar, and other scripts that do not typically use spaces between words, a good implementation should not just depend on the default word boundary specification, but should use a more sophisticated mechanism, as is also required for line breaking. Ideographic scripts such as Japanese and Chinese are even more complex. Where Hangul text is written without spaces, the same applies. However, in the absence of such a more sophisticated mechanism, the rules specified in this annex at least supply a well-defined default.
--

On the other hand, I think it would be useful to make this point somewhere in the introductory area: too many programmers make assumptions about word-breaking behavior. So I would suggest adding something like the sentence marked ** to the second paragraph in Section 4 so that it reads like:

--
Word boundaries can also be used in intelligent cut and paste. With this feature, if the user cuts a selection of text on word boundaries, adjacent spaces are collapsed to a single space. For example, cutting “quick” from “The_quick_fox” would leave “The_ _fox”. Intelligent cut and paste collapses this text to “The_fox”. **Note that word break boundaries are not restricted to whitespace and punctuation. Indeed, some languages do not use spaces at all.** Figure 1 gives an example of word boundaries.
--

Addison

--
Addison Phillips
Globalization Architect -- Yahoo! Inc.
Chair -- W3C Internationalization Core WG

Internationalization is an architecture.
It is not a feature.
Received on Friday, 7 March 2008 22:04:30 UTC