UAX29, Prepended characters

Hi Mark,

Here is another comment on UAX29 that just occurred to me.

The rules are extended to allow prepended characters to form a grapheme cluster with the following consonant (plus any combining characters).

It is not clear to me that this will produce the right effect in cases where there is an initial cluster in Thai or Lao.  

For example, a silent U+0EAB ຫ   LAO LETTER HO SUNG is sometimes used in Lao before another consonant to change the default tonal behaviour of that consonant, e.g. ແຫວນ (ring)

    0EC1:   ແ   LAO VOWEL SIGN EI (Lao)

    0EAB:   ຫ   LAO LETTER HO SUNG (Lao)

    0EA7:   ວ   LAO LETTER WO (Lao)

    0E99:   ນ   LAO LETTER NO (Lao)

Here the EI is pronounced after the WO, not after the HO SUNG.  
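
To make the concern concrete, here is a rough Python sketch of the effect. It is not the UAX 29 algorithm: the PREPENDED set (the Lao preposed vowel signs U+0EC0..U+0EC4) and the function name sketch_clusters are my own assumptions for illustration, and the only rules modelled are "no break after a prepended character" and "no break before a combining mark".

```python
import unicodedata

# Assumption: treat the Lao preposed vowel signs U+0EC0..U+0EC4 as "prepended".
# This set is illustrative only; it is not taken from the UAX #29 data files.
PREPENDED = {chr(cp) for cp in range(0x0EC0, 0x0EC5)}


def sketch_clusters(text):
    """Group text so that a prepended character joins the character after it,
    and a combining mark joins the character before it."""
    clusters = []
    for ch in text:
        joins_previous = bool(clusters) and (
            clusters[-1][-1] in PREPENDED
            or unicodedata.category(ch).startswith("M")
        )
        if joins_previous:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


word = "\u0ec1\u0eab\u0ea7\u0e99"  # the four code points of the word above
print(sketch_clusters(word))
```

Under these assumptions the EI is grouped with the silent HO SUNG, and the WO that it is actually pronounced after ends up in a separate cluster.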

Is it ok to split boundaries in this way?  I wasn't able to find a better Lao example with a quick search, but presumably there can also be words with prepended parts of a complex vowel (e.g.

    0EC0:   ເ   LAO VOWEL SIGN E (Lao)

    Syllable-initial consonant(s) here.

    0EB0:   ະ   LAO VOWEL SIGN A (Lao)

= /e/) where this would split the parts of a single vowel too.

Is that ok?  Is it something we should add a note about in UAX 29?

Another example, in Thai, would be โปรแกรม (program).  

    0E42:   โ   THAI CHARACTER SARA O (Thai)

    0E1B:   ป   THAI CHARACTER PO PLA (Thai)

    0E23:   ร   THAI CHARACTER RO RUA (Thai)

    0E41:   แ   THAI CHARACTER SARA AE (Thai)

    0E01:   ก   THAI CHARACTER KO KAI (Thai)

    0E23:   ร   THAI CHARACTER RO RUA (Thai)

    0E21:   ม   THAI CHARACTER MO MA (Thai)

Here there are two prepended vowel signs, each of which occurs before a two-consonant cluster.
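
As a rough illustration of where the boundaries would fall in this word, here is a small Python sketch. It assumes, purely for illustration, that the Thai preposed vowels U+0E40..U+0E44 are the prepended characters and that the only rule in play is "no break after a prepended character"; the names PREPENDED_THAI, break_offsets and split_at are hypothetical. Combining marks are ignored, which is harmless here because this word has none.

```python
# Assumption: the Thai preposed vowels U+0E40..U+0E44 count as "prepended".
# Illustrative only; not the UAX #29 property data.
PREPENDED_THAI = {chr(cp) for cp in range(0x0E40, 0x0E45)}


def break_offsets(text):
    """Offsets where a break is allowed when the only rule is
    'do not break after a prepended character'."""
    return [i for i in range(1, len(text)) if text[i - 1] not in PREPENDED_THAI]


def split_at(text, offsets):
    """Cut text into pieces at the given offsets."""
    edges = [0, *offsets, len(text)]
    return [text[a:b] for a, b in zip(edges, edges[1:])]


# SARA O, PO PLA, RO RUA, SARA AE, KO KAI, RO RUA, MO MA
word = "\u0e42\u0e1b\u0e23\u0e41\u0e01\u0e23\u0e21"
offsets = break_offsets(word)
print(offsets)
print(split_at(word, offsets))
```

With these assumptions each vowel sign attaches only to the first consonant of its cluster (SARA O with PO PLA, SARA AE with KO KAI), and the RO RUA that completes each cluster is left on its own.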


Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)


Received on Friday, 7 March 2008 15:14:12 UTC