
Re: [css-text] I18N-ISSUE-308: Definition of 'grapheme cluster'

From: James Clark <jjc@jclark.com>
Date: Mon, 21 Apr 2014 12:55:59 +0700
Message-ID: <CANz3_EZWjx6qrZUGzLCRfqHWrphm_gBNpZpsYgNoug0bv0g74Q@mail.gmail.com>
To: "Phillips, Addison" <addison@lab126.com>
Cc: Koji Ishii <kojiishi@gluesoft.co.jp>, "CSS WWW Style (www-style@w3.org)" <www-style@w3.org>, www International <www-international@w3.org>
On Mon, Apr 21, 2014 at 4:41 AM, Phillips, Addison <addison@lab126.com> wrote:

> > Referring to UAX#29 here is a good idea, but could you confirm your
> > intention of the suggested change?
> The concern here was that the statement as written is exceedingly vague.
> There are many "typographic traditions" as there are many languages and
> scripts. Some guidance on what to do seemed warranted.
> > * “further tailor” to “extend grapheme cluster boundaries” looks like
> > you’re suggesting to prohibit shrinking grapheme cluster boundaries, but I
> > suppose it’s not your intention, is it? Isn’t “tailor” more appropriate word
> > to use here, in terms of giving more flexibilities to implementers, and it’s
> > the word widely used in UAX#29?
> In the main, we do mean "extend", since that's what usually needs to happen.
> I can't, off hand, think of a case where the cluster is reduced in size,
> but that doesn't mean there isn't one. Tailor, as a result, is probably the
> better word choice.

I believe this wording came about because of an issue I raised for Thai.


Fundamentally, the issue is that for Thai (and almost certainly Lao) there
are two distinct concepts which do not always coincide (though they often
do):

a) positions between *characters* that are possible caret positions

b) positions between *glyphs* where it is typographically conventional to
insert letter-spacing

The Unicode grapheme cluster concept is closest to (a).
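
As a rough illustration of (a), here is a minimal sketch that lists candidate
caret positions by attaching combining marks to the preceding base character.
It is a deliberately simplified approximation, not the UAX#29 algorithm: it
ignores Hangul, emoji sequences, and characters such as U+0E33 THAI CHARACTER
SARA AM that UAX#29 handles by special rule. The function name
approx_caret_positions is hypothetical, chosen only for this example.

```python
import unicodedata

# General categories treated as "attaches to the preceding character".
# Assumption: this stands in for the much richer UAX#29 rule set.
MARK_CATEGORIES = ("Mn", "Me", "Mc")


def approx_caret_positions(text):
    """Return code point offsets that are candidate caret positions.

    A position is a candidate if it is the start of the text, the end of
    the text, or immediately before a character that is not a mark.
    """
    positions = []
    for i, ch in enumerate(text):
        if i == 0 or unicodedata.category(ch) not in MARK_CATEGORIES:
            positions.append(i)
    positions.append(len(text))
    return positions


# KO KAI, SARA I (a nonspacing mark), NO NU
word = "\u0e01\u0e34\u0e19"
print(approx_caret_positions(word))  # [0, 2, 3]
```

Note that this works purely on characters. Positions of kind (b) would have to
be determined on the shaped glyph run, which depends on the font and is outside
what this sketch attempts.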

(a) is a good starting point for (b), but in my view it is rather confusing
to treat (b) as a 'tailoring' of (a): they are different concepts operating
in different realms (characters vs glyphs).

Received on Monday, 21 April 2014 05:56:47 UTC
