- From: John Cowan <cowan@mercury.ccil.org>
- Date: Fri, 24 Jan 2014 16:46:57 -0500
- To: "Phillips, Addison" <addison@lab126.com>
- Cc: "CSS WWW Style (www-style@w3.org)" <www-style@w3.org>, www International <www-international@w3.org>
Phillips, Addison scripsit:

> "A grapheme cluster is what a language user considers to be a
> character or a basic unit of the script."
> "The UA may further tailor the definition as required by
> typographical tradition."
> Example 1
>
> I think a grapheme cluster should be defined in the CSS spec as
> follows: A grapheme cluster is a sequence of characters as defined
> by the Unicode specification that should be treated as a unit
> for typographic processing. This generally approximates to what a
> language user considers to be a letter or basic unit of the script.
>
> I don't think applications should redefine what a grapheme cluster
> is; that definition is established by the Unicode standard. Rather,
> we should say that applications sometimes require additional
> rules beyond the use of 'grapheme clusters' in order to handle
> the typographic traditions of particular scripts.

The definition of "grapheme cluster" in the Unicode Glossary defers
to UAX 29, but the current revision (23) of that UAX doesn't actually
have a formal definition of "grapheme cluster", except as a cover term
for default grapheme clusters, extended grapheme clusters, and tailored
grapheme clusters, which *are* defined.

It does, however, introduce the informal term "user-perceived
character", and says that grapheme clusters (by implication, of one
of the above varieties) are an approximation to user-perceived
characters. This seems to me like good terminology to follow.

-- 
I could dance with you till the cows            John Cowan
come home.  On second thought, I'd              http://www.ccil.org/~cowan
rather dance with the cows when you             cowan@ccil.org
come home.  --Rufus T. Firefly
Received on Friday, 24 January 2014 21:47:25 UTC