- From: Uma Umamaheswaran <umavs@ca.ibm.com>
- Date: Fri, 11 Mar 2011 10:59:49 -0500
- To: WWW International <www-international@w3.org>
- Cc: kennyluck@w3.org, kojiishi@gluesoft.co.jp
Hello:

In response to proposed Thai line breaking rules from Koji Ishii <kojiishi@gluesoft.co.jp>, I have added some feedback from Nattapong Sirilappanich (natta@th.ibm.com) of IBM Thailand:

>>> (From Koji Ishii)

Here's his draft of the simple line breaking rules in the absence of a dictionary for Thai scripts. Any corrections, and/or opinions whether to include this in the spec or not would be appreciated.

Thai character groups are based on TIS 620-2553 as written in Unicode spec[3].

Consonants: U+0E01-0E2E

Line breaks are prohibited between:

* Any and U+0E2F
* <Consonants> and [U+0E30-0E3A]
* [U+0E31, U+0E3A, U+0E40-0E44] and <Consonants>
* U+0E3F THAI Currency Symbol BAHT and digits
* [U+0E24, U+0E26] and U+0E45
* [U+0E50-0E59] and [U+0E50-0E59]
* Any and U+0E5A

Following rules are also presented, but they are Unicode Lm or Mn category and therefore I suspect that UAX#29 Unicode Text Segmentation should cover these rules.

* Any and U+0E46
* <Consonants> and [U+0E47]
* (<Consonants> or [U+0E34-0E39]) and [U+0E48-0E4B]
* (<Consonants> or [U+0E34-0E39]) and U+0E4C
* <Consonants> and [U+0E4D-0E4E]

[1] http://www.aa.tufs.ac.jp/en
[2] http://dev.w3.org/csswg/css3-text/#line-breaking
[3] http://unicode.org/charts/PDF/U0E00.pdf
[4] http://unicode.org/reports/tr29/

<<<<

>>> Feedback from Nattapong Sirilappanich

"I am agreed with all your rules and I have additional rules for you. Let's me define additional non-terminal symbol.

Tone: 0E48-0E4B.
AD (Above Diacritic): 0E4C and 0E4E.

The additional rules are.

* 0E31 and <Tone>
* (0E34 or 0E38) and <AD>
* <Tone> and (0E30, 0E32 or 0E33)"

Best regards,

Uma V.S. UMAmaheswaran, Ph.D.
Globalization Centre of Competency, IBM Toronto Lab
A3/SZ8, 8200 Warden Avenue, Markham, ON, Canada, L6G1C7
+1 905 413 3474; Fax: +1 905 413 4751; TieLine 313-3474
email: umavs@ca.ibm.com
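
For readers who want to experiment with the no-break pairs listed in this message, here is a minimal Python sketch that encodes them as a lookup table. It is an illustration only, not part of the proposal. The names `NO_BREAK_PAIRS`, `break_prohibited`, and `break_opportunities` are hypothetical. The message does not say which "digits" the BAHT rule means, so the sketch assumes both Thai digits (U+0E50-0E59) and ASCII digits. It also treats every position that no rule prohibits as a possible break, which a real implementation would refine further.

```python
def _range(lo, hi):
    """Set of characters for the inclusive code point range lo..hi."""
    return frozenset(chr(c) for c in range(lo, hi + 1))


def _set(*code_points):
    """Set of characters for the given individual code points."""
    return frozenset(chr(c) for c in code_points)


ANY = None  # wildcard: matches any preceding character

CONSONANTS = _range(0x0E01, 0x0E2E)
THAI_DIGITS = _range(0x0E50, 0x0E59)
ASCII_DIGITS = frozenset("0123456789")  # assumption, see lead-in
VOWELS_34_39 = _range(0x0E34, 0x0E39)
TONE = _range(0x0E48, 0x0E4B)           # <Tone> from the feedback
AD = _set(0x0E4C, 0x0E4E)               # <AD> (Above Diacritic)

# Each entry is (before, after): no break between a character in
# `before` and an immediately following character in `after`.
NO_BREAK_PAIRS = [
    # Rules from the draft
    (ANY, _set(0x0E2F)),
    (CONSONANTS, _range(0x0E30, 0x0E3A)),
    (_set(0x0E31, 0x0E3A) | _range(0x0E40, 0x0E44), CONSONANTS),
    (_set(0x0E3F), THAI_DIGITS | ASCII_DIGITS),
    (_set(0x0E24, 0x0E26), _set(0x0E45)),
    (THAI_DIGITS, THAI_DIGITS),
    (ANY, _set(0x0E5A)),
    # Rules the draft expects UAX#29 to cover
    (ANY, _set(0x0E46)),
    (CONSONANTS, _set(0x0E47)),
    (CONSONANTS | VOWELS_34_39, TONE),
    (CONSONANTS | VOWELS_34_39, _set(0x0E4C)),
    (CONSONANTS, _range(0x0E4D, 0x0E4E)),
    # Additional rules from the feedback
    (_set(0x0E31), TONE),
    (_set(0x0E34, 0x0E38), AD),
    (TONE, _set(0x0E30, 0x0E32, 0x0E33)),
]


def break_prohibited(before, after):
    """True if some rule forbids a break between the two characters."""
    for left, right in NO_BREAK_PAIRS:
        if (left is ANY or before in left) and after in right:
            return True
    return False


def break_opportunities(text):
    """Indexes i where no rule forbids a break between text[i-1] and text[i]."""
    return [
        i for i in range(1, len(text))
        if not break_prohibited(text[i - 1], text[i])
    ]
```

Calling `break_prohibited("\u0E01", "\u0E32")` checks a consonant followed by U+0E32, which falls under the second rule. Keeping the rules in a flat table makes it straightforward to add or remove a pair as the discussion continues.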
Received on Friday, 11 March 2011 16:00:36 UTC