Hi Mark,

Here is another comment on UAX 29 that just occurred to me.

The rules are extended to allow prepended characters to form a grapheme cluster with the following consonant (+ any combining characters). It is not clear to me that this will produce the right effect in cases where there is an initial cluster in Thai or Lao.

For example, a silent U+0EAB ຫ LAO LETTER HO SUNG is sometimes used in Lao before another consonant to change the default tonal behaviour of that consonant, e.g. ແຫວນ (ring):

0EC1: ແ LAO VOWEL SIGN EI (Lao)
0EAB: ຫ LAO LETTER HO SUNG (Lao)
0EA7: ວ LAO LETTER WO (Lao)
0E99: ນ LAO LETTER NO (Lao)

Here the EI is pronounced after the WO, not after the HO SUNG. Is it OK to split boundaries in this way?

I wasn't able to find a better Lao example with a quick search, but presumably there can also be words with prepended parts of a complex vowel, e.g.

0EC0: ເ LAO VOWEL SIGN E (Lao)
[syllable-initial consonant(s) here]
0EB0: ະ LAO VOWEL SIGN A (Lao)
= /e/

where this would split the parts of a single vowel too. Is that OK? Is it something we should add a note about in UAX 29?

Another example, in Thai, would be โปรแกรม (program):

0E42: โ THAI CHARACTER SARA O (Thai)
0E1B: ป THAI CHARACTER PO PLA (Thai)
0E23: ร THAI CHARACTER RO RUA (Thai)
0E41: แ THAI CHARACTER SARA AE (Thai)
0E01: ก THAI CHARACTER KO KAI (Thai)
0E23: ร THAI CHARACTER RO RUA (Thai)
0E21: ม THAI CHARACTER MO MA (Thai)

Here there are two prepended vowel signs that occur before a two-consonant cluster.

RI

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/blog/
http://rishida.net/

Received on Friday, 7 March 2008 15:14:12 GMT
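To make the concern in the message above concrete, here is a minimal Python sketch of the two behaviours it describes: a prepended character attaches to whatever follows it, and combining characters attach to whatever precedes them. It is not the UAX 29 algorithm. The `PREPEND` set (U+0E40..U+0E44 and U+0EC0..U+0EC4), the `is_extend` test based on general category, and the function name `clusters` are all assumptions made for illustration.

```python
import unicodedata

# Assumption: the "prepended characters" are the Thai and Lao preposed
# vowel signs, U+0E40..U+0E44 and U+0EC0..U+0EC4.
PREPEND = set(range(0x0E40, 0x0E45)) | set(range(0x0EC0, 0x0EC5))


def is_extend(ch):
    # Rough stand-in for "combining characters": any mark category.
    return unicodedata.category(ch) in ("Mn", "Me", "Mc")


def clusters(text):
    """Split text using only the prepend and combining-mark rules."""
    out = []
    for ch in text:
        # Do not break before a combining mark or after a prepended character.
        if out and (is_extend(ch) or ord(out[-1][-1]) in PREPEND):
            out[-1] += ch
        else:
            out.append(ch)
    return out


# The Lao word from the message: EI groups with HO SUNG, not with WO.
print(clusters("\u0ec1\u0eab\u0ea7\u0e99"))
# The Thai word from the message: each vowel sign groups with the first
# consonant of its cluster only.
print(clusters("\u0e42\u0e1b\u0e23\u0e41\u0e01\u0e23\u0e21"))
```

Under these assumptions the Lao word splits as ແຫ | ວ | ນ and the Thai word as โป | ร | แก | ร | ม, which is the boundary placement the message asks about.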