
[UAX29] i18n comment 10: Different types of grapheme cluster?

From: <ishida@w3.org>
Date: Fri, 07 Mar 2008 11:34:32 +0000
To: public-i18n-core@w3.org
Message-Id: <20080307113105.E019E4F118@homer.w3.org>

Comment from the i18n review of:
http://www.unicode.org/reports/tr29/tr29-12.html

Comment 10
At http://www.w3.org/International/reviews/0801-uax29/
Editorial/substantive: S
Tracked by: RI

Location in reviewed document:
3 [http://www.unicode.org/reports/tr29/tr29-12.html#Grapheme_Cluster_Boundaries]

Comment: 
We feel that the current definition of default grapheme clusters envisages only one way in which operations interact with grapheme clusters, whereas we probably require at least two different types of behaviour.

 
For example, in the case of Khmer, the subscript consonants are viewed as distinct letters by Cambodians.

 
On the one hand, we suspect that it would make sense to delete the subjoined consonants separately from the 'base' character above them. This may not, however, be a matter of deleting one character at a time, since it may be appropriate to delete vowel signs along with the subjoined consonant.

 
On the other hand, we do not expect that it would make sense to highlight the subjoined character and its vowel sign separately from the rest of the syllable, especially since there could be some discontinuity between the subscript consonant and the following vowel sign. Nor would you expect to see parts of these clusters wrapping separately, especially since vowels can appear to the left or on both sides of the stack produced by coeng combinations.

 
1780: ក KHMER LETTER KA 

 
17D2: ្ KHMER SIGN COENG 

 
179B: ល KHMER LETTER LO 

 
17B8: ី KHMER VOWEL SIGN II

 
See this as it would be rendered [http://www.w3.org/International/reviews/0801-uax29/khmerexample.gif]. 
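
For readers who want to reproduce the sequence, the following is a minimal Python sketch that builds the four code points listed above in logical order and prints their names and general categories from the standard unicodedata module. The variable name is ours, chosen only for illustration.

```python
import unicodedata

# The four code points from the example, in logical (storage) order:
# KA, COENG, LO, VOWEL SIGN II.
syllable = "\u1780\u17D2\u179B\u17B8"

for ch in syllable:
    # Print code point, character name and general category for inspection.
    print(f"{ord(ch):04X}: {unicodedata.name(ch)} ({unicodedata.category(ch)})")
```
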

 
We find ourselves wondering whether there may be two different types of grapheme cluster rules, one that produces the correct behaviour for wrapping or highlighting and another to produce correct behaviour for backspace deletion. 
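
To make the distinction concrete, here is a rough Python sketch of the two behaviours we have in mind. It is not taken from UAX 29 and is not a proposal for actual rules: the function names selection_units and backspace are hypothetical, and the logic is deliberately simplified to cover only the Khmer example above (it treats any combining mark, or any character following a coeng, as part of the current unit).

```python
import unicodedata

COENG = "\u17D2"  # KHMER SIGN COENG

def is_mark(ch):
    # General categories Mn, Mc and Me all start with "M".
    return unicodedata.category(ch).startswith("M")

def selection_units(text):
    """Hypothetical units for highlighting or wrapping: a base character
    plus every following mark and every character subjoined by a coeng."""
    units = []
    for i, ch in enumerate(text):
        after_coeng = i > 0 and text[i - 1] == COENG
        if units and (is_mark(ch) or after_coeng):
            units[-1] += ch       # extend the current unit
        else:
            units.append(ch)      # start a new unit
    return units

def backspace(text):
    """Hypothetical backspace: remove trailing vowel signs, then one
    consonant, then the coeng that subjoined it, if there is one."""
    i = len(text)
    while i > 0 and text[i - 1] != COENG and is_mark(text[i - 1]):
        i -= 1                    # trailing vowel signs and other marks
    if i > 0:
        i -= 1                    # the consonant (or other character) itself
    if i > 0 and text[i - 1] == COENG:
        i -= 1                    # the coeng that introduced it
    return text[:i]

syllable = "\u1780\u17D2\u179B\u17B8"  # KA, COENG, LO, VOWEL SIGN II
print(selection_units(syllable))       # the whole syllable as a single unit
print(backspace(syllable) == "\u1780") # only the base KA remains
```

Under these assumptions the whole syllable is a single unit for highlighting and wrapping, while one backspace removes the subjoined LO together with its vowel sign and leaves the base KA in place.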

 
We would appreciate it if the authors of UAX 29 could point us to some discussions about this, or engage in some if they have not yet taken place. 

 
Received on Friday, 7 March 2008 11:31:21 GMT
