Re: [css-text] I18N-ISSUE-308: Definition of 'grapheme cluster'

On Mon, Apr 21, 2014 at 4:41 AM, Phillips, Addison <addison@lab126.com> wrote:

> > Referring to UAX#29 here is a good idea, but could you confirm your
> > intention of the suggested change?
>
> The concern here was that the statement as written is exceedingly vague.
> There are many "typographic traditions" as there are many languages and
> scripts. Some guidance on what to do seemed warranted.
>
> > * “further tailor” to “extend grapheme cluster boundaries” looks like
> > you’re suggesting to prohibit shrinking grapheme cluster boundaries, but
> > I suppose it’s not your intention, is it? Isn’t “tailor” more appropriate
> > word to use here, in terms of giving more flexibilities to implementers,
> > and it’s the word widely used in UAX#29?
>
> In the main, we do mean "extend", since that's what usually needs to happen.
> I can't, offhand, think of a case where the cluster is reduced in size,
> but that doesn't mean there isn't one. Tailor, as a result, is probably the
> better word choice.


I believe this wording came about because of an issue I raised for Thai.
See:

http://lists.w3.org/Archives/Public/www-style/2013Sep/0542.html
http://lists.w3.org/Archives/Public/www-style/2013Sep/0632.html

Fundamentally, the issue is that for Thai (and almost certainly Lao) there
are two distinct concepts which do not always coincide (though they often
do):

a) positions between *characters* that are possible caret positions

b) positions between *glyphs* where it is typographically conventional to
insert letter-spacing

The Unicode grapheme cluster concept is closest to (a).

(a) is a good starting point for (b), but in my view it is rather confusing
to treat (b) as a 'tailoring' of (a): they are different concepts operating
in different realms (characters vs glyphs).
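
To make the characters-vs-glyphs distinction concrete, here is a rough
sketch in Python. It is not a UAX#29 implementation: approx_clusters() is
a hypothetical helper that only attaches combining marks (categories Mn
and Me) to the preceding base character, and NFKD decomposition is used
merely as a stand-in for the glyph sequence a font might produce. The
sample word and the helper name are my own assumptions for illustration.

```python
import unicodedata


def approx_clusters(text):
    """Group each base character with the combining marks that follow it.

    A deliberately simplified approximation, not the UAX#29 rules.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch) in ("Mn", "Me"):
            clusters[-1] += ch  # attach the mark to the current unit
        else:
            clusters.append(ch)  # start a new unit
    return clusters


# NO NU, MAI THO, SARA AM
word = "\u0e19\u0e49\u0e33"

# (a) character realm: units computed over the stored characters
char_units = approx_clusters(word)

# Stand-in for the glyph realm: SARA AM has a compatibility decomposition
# into NIKHAHIT (a combining mark) followed by SARA AA.
glyph_like = unicodedata.normalize("NFKD", word)
glyph_units = approx_clusters(glyph_like)

print([[f"U+{ord(c):04X}" for c in unit] for unit in char_units])
print([[f"U+{ord(c):04X}" for c in unit] for unit in glyph_units])
```

The two lists do not line up: once SARA AM is split, part of it joins the
preceding unit and part stands alone, so a boundary in the second list
falls inside what is a single character in the first. The sketch makes no
claim about where letter-spacing ought to go; it only shows that the two
realms need not share the same set of positions.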

James

Received on Monday, 21 April 2014 05:56:48 UTC