
RE: [UAX29] i18n comment 10: Different types of grapheme cluster?

From: Richard Ishida <ishida@w3.org>
Date: Fri, 7 Mar 2008 14:22:11 -0000
To: <public-i18n-core@w3.org>
Message-ID: <006101c8805e$a253ef30$e6fbcd90$@org>

Seems to be addressed a little more directly in the discussion about
tailoring of grapheme clusters, but still left up to the implementer.

RI



============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)
 
http://www.w3.org/International/
http://rishida.net/blog/
http://rishida.net/

 

> -----Original Message-----
> From: public-i18n-core-request@w3.org [mailto:public-i18n-core-request@w3.org] On Behalf Of ishida@w3.org
> Sent: 07 March 2008 11:35
> To: public-i18n-core@w3.org
> Subject: [UAX29] i18n comment 10: Different types of grapheme cluster?
> 
> 
> Comment from the i18n review of:
> http://www.unicode.org/reports/tr29/tr29-12.html
> 
> Comment 10
> At http://www.w3.org/International/reviews/0801-uax29/
> Editorial/substantive: S
> Tracked by: RI
> 
> Location in reviewed document:
> 3 [http://www.unicode.org/reports/tr29/tr29-12.html#Grapheme_Cluster_Boundaries]
> 
> Comment:
> We feel that the current definition of default grapheme clusters envisages
> only one way in which operations interact with grapheme clusters, whereas
> we probably require at least two different types of behaviour.
> 
> 
> For example, in the case of Khmer, the subscript consonants are viewed as
> distinct letters by Cambodians.
> 
> 
> On the one hand we suspect that it would make sense to delete the
> subjoined consonants separately from the 'base' character above them. This
> may not, however, be a question of deleting a character at a time - since
> it may be appropriate to delete vowel signs with the subjoined consonant.
> 
> 
> On the other hand, we do not expect that it would make sense to highlight
> the subjoined character and its vowel sign separately from the rest of the
> syllable, especially since there could be some discontinuity between the
> subscript consonant and the following vowel sign. Nor would you expect to
> see parts of these clusters wrapping separately either. (Especially since
> vowels can appear to the left or on both sides of the stack produced by
> coeng combinations.)
> 
> 
> 1780: ក KHMER LETTER KA
> 
> 
> 17D2: ្ KHMER SIGN COENG
> 
> 
> 179B: ល KHMER LETTER LO
> 
> 
> 17B8: ី KHMER VOWEL SIGN II
> 
> 
> See this as it would be rendered
> [http://www.w3.org/International/reviews/0801-uax29/khmerexample.gif].
> 
> 
> We find ourselves wondering whether there may be two different types of
> grapheme cluster rules, one that produces the correct behaviour for
> wrapping or highlighting and another to produce correct behaviour for
> backspace deletion.
> 
> 
> We would appreciate it if the authors of UAX 29 could point us to some
> discussions about this, or engage in some if they have not yet taken
> place.
> 
> 
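To make the two behaviours in the comment concrete, here is a minimal Python sketch using the Khmer sequence quoted above. The function name `backspace` and the deletion rule it applies (remove a subjoined consonant together with any vowel signs that follow it) are assumptions made for illustration only. They are not rules defined by UAX 29, and the character ranges used are a simplification of the Khmer block.

```python
import unicodedata

COENG = "\u17D2"  # KHMER SIGN COENG

def is_consonant(ch):
    # Khmer consonant letters, U+1780..U+17A2 (simplified range check).
    return "\u1780" <= ch <= "\u17A2"

def is_vowel_sign(ch):
    # Khmer dependent vowel signs, U+17B6..U+17C5 (simplified range check).
    return "\u17B6" <= ch <= "\u17C5"

def backspace(text):
    """Hypothetical backspace rule, not taken from UAX 29.

    If the text ends in COENG + consonant (optionally followed by vowel
    signs), delete that whole subjoined unit and leave the base letter.
    Otherwise delete a single code point.
    """
    i = len(text)
    # Skip back over any trailing vowel signs.
    while i > 0 and is_vowel_sign(text[i - 1]):
        i -= 1
    # A subjoined consonant is encoded as COENG followed by a consonant.
    if i >= 2 and is_consonant(text[i - 1]) and text[i - 2] == COENG:
        return text[: i - 2]
    return text[:-1]

# The example syllable from the comment: KA, COENG, LO, VOWEL SIGN II.
syllable = "\u1780\u17D2\u179B\u17B8"

for ch in syllable:
    print(f"{ord(ch):04X}: {unicodedata.name(ch)}")

# Under the assumed rule, one backspace leaves only the base letter KA,
# whereas highlighting or line wrapping would treat `syllable` as one unit.
remaining = backspace(syllable)
```

Under these assumptions a single backspace on the example syllable removes COENG, LO and VOWEL SIGN II and leaves KA, while selection and line breaking would still keep all four code points together. That is the distinction between the two kinds of cluster the comment asks about.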
Received on Friday, 7 March 2008 14:18:52 GMT
