[UAX29] i18n comment 19: Word break algorithm

From: <ishida@w3.org>
Date: Fri, 07 Mar 2008 11:35:41 +0000
To: public-i18n-core@w3.org
Message-Id: <20080307113214.0A3C24F118@homer.w3.org>

Comment from the i18n review of:
http://www.unicode.org/reports/tr29/tr29-12.html

Comment 19
At http://www.w3.org/International/reviews/0801-uax29/
Editorial/substantive: E
Tracked by: AP

Location in reviewed document:
4 [http://www.unicode.org/reports/tr29/tr29-12.html#Word_Boundaries]

Comment: The problem with spaces in tailored word breaking should probably be added to the text. In particular, the text should point out (as it does for the Southeast Asian languages above) that the word break algorithm provides a "pretty good" default, but that more complex mechanisms may be needed to do a perfect job, for example with numbers such as 1_234,56, where _ represents a space-type character.
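
To illustrate the 1_234,56 case, here is a minimal Python sketch. It is not an implementation of the UAX #29 rules: DEFAULT_TOKEN is a simplified regex stand-in that keeps digits joined by "," or "." together but breaks at every space, and TAILORED_TOKEN is a hypothetical tailoring that also accepts a space-type character between groups of three digits. The names DEFAULT_TOKEN, TAILORED_TOKEN, SPACE_SEPARATORS and segment, and the choice of separator characters, are assumptions made for this example only.

```python
import re

# Simplified stand-in for default word segmentation (NOT the full UAX #29 rules):
# digits joined by "," or "." stay in one token, but any space is its own token.
DEFAULT_TOKEN = re.compile(r"\d+(?:[.,]\d+)*|\w+|\s|[^\w\s]")

# Assumed set of space-type grouping separators:
# SPACE, NO-BREAK SPACE, NARROW NO-BREAK SPACE.
SPACE_SEPARATORS = " \u00A0\u202F"

# Hypothetical tailoring: a number may contain one space-type character
# between groups of exactly three digits, e.g. "1 234,56".
TAILORED_TOKEN = re.compile(
    r"\d{1,3}(?:[" + SPACE_SEPARATORS + r"]\d{3})+(?:[.,]\d+)?|"
    + DEFAULT_TOKEN.pattern
)


def segment(text, pattern):
    """Return the tokens found by scanning text left to right with pattern."""
    return [m.group(0) for m in pattern.finditer(text)]


print(segment("1 234,56", DEFAULT_TOKEN))   # ['1', ' ', '234,56']
print(segment("1 234,56", TAILORED_TOKEN))  # ['1 234,56']
```

Note that the tailored pattern would also merge "1 234" when those are two separate numbers, which is one reason a perfect job may need more context than a simple rule can provide.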
Received on Friday, 7 March 2008 11:32:21 GMT
