
UAX29, Prepended characters

From: Richard Ishida <ishida@w3.org>
Date: Fri, 7 Mar 2008 15:17:29 -0000
To: "'Mark Davis'" <mark.davis@icu-project.org>
Cc: <public-i18n-core@w3.org>
Message-ID: <007101c88066$5bf8f140$13ead3c0$@org>

Hi Mark,

Here is another comment on UAX29 that just occurred to me.

The rules are extended to allow prepended characters to form a grapheme cluster with the following consonant (plus any combining characters).

It is not clear to me that this will produce the right effect in cases where there is an initial cluster in Thai or Lao.  

For example, a silent U+0EAB ຫ LAO LETTER HO SUNG is sometimes used in Lao before another consonant to change the default tonal behaviour of that consonant, e.g. ແຫວນ (ring):

    0EC1:   ແ   LAO VOWEL SIGN EI (Lao)

    0EAB:   ຫ   LAO LETTER HO SUNG (Lao)

    0EA7:   ວ   LAO LETTER WO (Lao)

    0E99:   ນ   LAO LETTER NO (Lao)

Here the EI is pronounced after the WO, not after the HO SUNG.  
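To make the concern concrete, here is a minimal Python sketch of the rule as I describe it above. It is a toy model, not the UAX29 algorithm: the PREPENDED set (the Lao and Thai preposed vowel signs) and the function names are my own assumptions for illustration, and "combining character" is approximated by the general categories Mn, Mc and Me.

```python
import unicodedata

# Assumption: treat the Lao (U+0EC0..U+0EC4) and Thai (U+0E40..U+0E44)
# preposed vowel signs as the "prepended" characters.
PREPENDED = {chr(cp) for cp in range(0x0EC0, 0x0EC5)} | \
            {chr(cp) for cp in range(0x0E40, 0x0E45)}


def is_combining(ch):
    """Approximate 'combining character' by general category."""
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")


def toy_clusters(text):
    """Split text so that a prepended character joins the character that
    follows it, and combining characters join the preceding cluster."""
    clusters = []
    for ch in text:
        if clusters and (is_combining(ch) or clusters[-1][-1] in PREPENDED):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


word = "\u0ec1\u0eab\u0ea7\u0e99"  # EI, HO SUNG, WO, NO
print([[f"{ord(c):04X}" for c in cl] for cl in toy_clusters(word)])
```

Under these assumptions the EI ends up in a cluster with the silent HO SUNG, and the WO that it is actually pronounced with starts a separate cluster.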

Is it OK to split boundaries in this way?  I wasn't able to find a better Lao example with a quick search, but presumably there can also be words with prepended parts of a complex vowel (e.g.

    0EC0:   ເ   LAO VOWEL SIGN E (Lao)

    Syllable-initial consonant(s) here.

    0EB0:   ະ   LAO VOWEL SIGN A (Lao)

= /e/) where this would split the parts of a single vowel too.

Is that OK?  Is it something we should add a note about in UAX 29?

Another example, in Thai, would be โปรแกรม (program).  

    0E42:   โ   THAI CHARACTER SARA O (Thai)

    0E1B:   ป   THAI CHARACTER PO PLA (Thai)

    0E23:   ร   THAI CHARACTER RO RUA (Thai)

    0E41:   แ   THAI CHARACTER SARA AE (Thai)

    0E01:   ก   THAI CHARACTER KO KAI (Thai)

    0E23:   ร   THAI CHARACTER RO RUA (Thai)

    0E21:   ม   THAI CHARACTER MO MA (Thai)

Here there are two prepended vowel signs, each occurring before a two-consonant cluster.
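The following Python sketch checks where boundaries would fall in the Thai word under a simplified reading of the rule. It is an illustration only, not the UAX29 algorithm: THAI_PREPOSED, toy_boundaries and the onset spans are names and values I have assumed for the purpose, and combining characters are approximated by the general categories Mn, Mc and Me.

```python
import unicodedata

# Assumption: the Thai preposed vowel signs U+0E40..U+0E44 are "prepended".
THAI_PREPOSED = {chr(cp) for cp in range(0x0E40, 0x0E45)}


def toy_boundaries(text):
    """Return the interior offsets where the toy rule permits a boundary:
    anywhere except after a prepended character or before a combining mark."""
    offsets = []
    for i in range(1, len(text)):
        after_prepended = text[i - 1] in THAI_PREPOSED
        before_mark = unicodedata.category(text[i]) in ("Mn", "Mc", "Me")
        if not (after_prepended or before_mark):
            offsets.append(i)
    return offsets


word = "\u0e42\u0e1b\u0e23\u0e41\u0e01\u0e23\u0e21"  # the word listed above
onsets = [(1, 3), (4, 6)]  # assumed spans of the two consonant clusters

for start, end in onsets:
    split = any(start < b < end for b in toy_boundaries(word))
    print(word[start:end], "split" if split else "kept together")
```

With these assumptions, each vowel sign attaches only to the first consonant of its cluster, so a boundary falls inside both PO PLA + RO RUA and KO KAI + RO RUA.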


RI


============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)
 
http://www.w3.org/International/
http://rishida.net/blog/
http://rishida.net/

 
Received on Friday, 7 March 2008 15:14:12 GMT
