- From: Richard Ishida <ishida@w3.org>
- Date: Fri, 7 Mar 2008 15:17:29 -0000
- To: "'Mark Davis'" <mark.davis@icu-project.org>
- Cc: <public-i18n-core@w3.org>
Hi Mark,
Here is another comment on UAX 29 that just occurred to me.
The rules are extended to allow prepended characters to form a grapheme cluster with the following consonant (plus any combining characters).
It is not clear to me that this will produce the right effect in cases where there is an initial cluster in Thai or Lao.
For example, a silent U+0EAB ຫ LAO LETTER HO SUNG is sometimes used in Lao before another consonant to change the default tonal behaviour of that consonant, e.g. ແຫວນ (ring)
0EC1: ແ LAO VOWEL SIGN EI (Lao)
0EAB: ຫ LAO LETTER HO SUNG (Lao)
0EA7: ວ LAO LETTER WO (Lao)
0E99: ນ LAO LETTER NO (Lao)
Here the EI is pronounced after the WO, not after the HO SUNG.
Is it OK to split boundaries in this way? I wasn't able to find a better Lao example with a quick search, but presumably there can also be words with prepended parts of a complex vowel (e.g.
0EC0: ເ LAO VOWEL SIGN E (Lao)
Syllable-initial consonant(s) here.
0EB0: ະ LAO VOWEL SIGN A (Lao)
= /e/) where this would split the parts of a single vowel too.
Is that OK? Is it something we should add a note about in UAX 29?
Another example, in Thai, would be โปรแกรม (program).
0E42: โ THAI CHARACTER SARA O (Thai)
0E1B: ป THAI CHARACTER PO PLA (Thai)
0E23: ร THAI CHARACTER RO RUA (Thai)
0E41: แ THAI CHARACTER SARA AE (Thai)
0E01: ก THAI CHARACTER KO KAI (Thai)
0E23: ร THAI CHARACTER RO RUA (Thai)
0E21: ม THAI CHARACTER MO MA (Thai)
Here there are two prepended vowel signs, each of which occurs before a two-consonant cluster.
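
To make the concern concrete, here is a minimal Python sketch of how such a prepend rule would segment the two example words. It is a deliberate simplification, not the full UAX 29 rule set: the PREPEND set (Thai U+0E40..U+0E44 and Lao U+0EC0..U+0EC4) is my assumption about which characters would be treated as prepended, combining marks are approximated by general category, and the function name clusters is purely illustrative.

```python
import unicodedata

# Assumption: the prepended vowel signs are Thai U+0E40..U+0E44
# and Lao U+0EC0..U+0EC4. This set is illustrative only.
PREPEND = set(range(0x0E40, 0x0E45)) | set(range(0x0EC0, 0x0EC5))


def is_extend(ch):
    """Rough stand-in for 'combining character': any mark category."""
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")


def clusters(text):
    """Split text into simplified grapheme clusters.

    No break before a combining mark, and no break after a
    prepended vowel sign; break everywhere else.
    """
    out = []
    for ch in text:
        if out and (is_extend(ch) or ord(out[-1][-1]) in PREPEND):
            out[-1] += ch
        else:
            out.append(ch)
    return out


LAO_RING = "\u0ec1\u0eab\u0ea7\u0e99"                    # the Lao example
THAI_PROGRAM = "\u0e42\u0e1b\u0e23\u0e41\u0e01\u0e23\u0e21"  # the Thai example

for word in (LAO_RING, THAI_PROGRAM):
    print([" ".join(f"{ord(c):04X}" for c in g) for g in clusters(word)])
# ['0EC1 0EAB', '0EA7', '0E99']
# ['0E42 0E1B', '0E23', '0E41 0E01', '0E23', '0E21']
```

Under these assumptions the Lao EI is grouped with the silent HO SUNG rather than with the WO it is pronounced after, and each Thai vowel sign is grouped with only the first consonant of its cluster.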
RI
============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)
http://www.w3.org/International/
http://rishida.net/blog/
http://rishida.net/
Received on Friday, 7 March 2008 15:14:12 UTC