- From: Kang-Hao (Kenny) Lu <kennyluck@w3.org>
- Date: Mon, 07 Feb 2011 15:30:08 +0900
- To: WWW International <www-international@w3.org>
- Message-ID: <4D4F9170.6030404@w3.org>
-------- Original Message --------
Subject: [css3-text] Thai line breaking rules
Resent-Date: Mon, 07 Feb 2011 03:47:45 +0000
Resent-From: www-style@w3.org
Date: Sun, 6 Feb 2011 22:47:21 -0500
From: Koji Ishii <kojiishi@gluesoft.co.jp>
To: www-style@w3.org <www-style@w3.org>

I had a meeting with ILCAA, the Research Institute for Languages and Cultures of Asia and Africa[1], in Tokyo. Minegishi-san at ILCAA presented his idea for the issue currently mentioned in the CSS3 Text spec[2]:

> Additionally, some guidance should be provided on how to break
> or not break Southeast Asian in the absence of a dictionary.

Here's his draft of simple line breaking rules for the Thai script in the absence of a dictionary. Any corrections, and/or opinions on whether to include this in the spec, would be appreciated.

Thai character groups are based on TIS 620-2553 as written in the Unicode spec[3].

Consonants: U+0E01-0E2E

Line breaks are prohibited between:

* Any and U+0E2F
* <Consonants> and [U+0E30-0E3A]
* [U+0E31, U+0E3A, U+0E40-0E44] and <Consonants>
* U+0E3F THAI CURRENCY SYMBOL BAHT and digits
* [U+0E24, U+0E26] and U+0E45
* [U+0E50-0E59] and [U+0E50-0E59]
* Any and U+0E5A

The following rules were also presented, but the characters involved are in the Unicode Lm or Mn category, and therefore I suspect that UAX#29 Unicode Text Segmentation[4] should cover these rules.

* Any and U+0E46
* <Consonants> and [U+0E47]
* (<Consonants> or [U+0E34-0E39]) and [U+0E48-0E4B]
* (<Consonants> or [U+0E34-0E39]) and U+0E4C
* <Consonants> and [U+0E4D-0E4E]

[1] http://www.aa.tufs.ac.jp/en
[2] http://dev.w3.org/csswg/css3-text/#line-breaking
[3] http://unicode.org/charts/PDF/U0E00.pdf
[4] http://unicode.org/reports/tr29/

Regards,
Koji
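
As a rough illustration of how the pair rules in the message above could be applied mechanically, here is a minimal Python sketch. It is not part of the proposal: every name in it (cps, MAIN_RULES, MARK_RULES, break_prohibited, unprohibited_positions) is hypothetical, "Any" is taken literally as any preceding character, and "digits" after U+0E3F is assumed to mean Thai digits or ASCII digits, which the message does not specify.

```python
# Illustrative sketch only. All names are hypothetical, not from the message.

def cps(first, last):
    """Return the set of characters in the inclusive code point range."""
    return {chr(cp) for cp in range(first, last + 1)}

ANY = None  # wildcard: matches any preceding character
CONSONANTS = cps(0x0E01, 0x0E2E)
THAI_DIGITS = cps(0x0E50, 0x0E59)
# Assumption: "digits" after BAHT means Thai digits or ASCII digits.
DIGITS = THAI_DIGITS | set("0123456789")
CONSONANT_OR_VOWEL = CONSONANTS | cps(0x0E34, 0x0E39)

# (before, after) pairs from the first list: no break between them.
MAIN_RULES = [
    (ANY, {"\u0E2F"}),
    (CONSONANTS, cps(0x0E30, 0x0E3A)),
    ({"\u0E31", "\u0E3A"} | cps(0x0E40, 0x0E44), CONSONANTS),
    ({"\u0E3F"}, DIGITS),
    ({"\u0E24", "\u0E26"}, {"\u0E45"}),
    (THAI_DIGITS, THAI_DIGITS),
    (ANY, {"\u0E5A"}),
]

# Pairs from the second list (Lm/Mn characters), kept separate because
# the message suspects they are already covered elsewhere.
MARK_RULES = [
    (ANY, {"\u0E46"}),
    (CONSONANTS, {"\u0E47"}),
    (CONSONANT_OR_VOWEL, cps(0x0E48, 0x0E4B)),
    (CONSONANT_OR_VOWEL, {"\u0E4C"}),
    (CONSONANTS, cps(0x0E4D, 0x0E4E)),
]

ALL_RULES = MAIN_RULES + MARK_RULES

def break_prohibited(before, after, rules=ALL_RULES):
    """True if some rule forbids a break between the two characters."""
    for left, right in rules:
        if (left is ANY or before in left) and after in right:
            return True
    return False

def unprohibited_positions(text, rules=ALL_RULES):
    """Indices i where no rule forbids a break between text[i-1] and text[i]."""
    return [i for i in range(1, len(text))
            if not break_prohibited(text[i - 1], text[i], rules)]
```

For example, unprohibited_positions("\u0E01\u0E32\u0E40\u0E01") leaves only index 2, between U+0E32 and U+0E40. The positions returned are merely those that no rule forbids; the sketch says nothing about whether they are linguistically good break points.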
Received on Monday, 7 February 2011 06:29:16 UTC