[css-text] Clusters for letter spacing in Thai and other complex scripts

From: James Clark <jjc@jclark.com>
Date: Sun, 22 Sep 2013 12:37:53 +0700
Message-ID: <CANz3_EY_=0WV1YVVyH_KviNGUW2C0m_NL9eZUfkREt8YviTjKA@mail.gmail.com>
To: www-style@w3.org
The Editor's Draft for CSS Text says that letter-spacing is applied between
adjacent "characters", where a "character" is defined as a UAX29 extended
grapheme cluster. This doesn't do the right thing in Thai in the case of
SARA AM (U+0E33).

For example, to properly letter-space the word คำ (0E04 + 0E33), the 0E33
needs to be decomposed into 0E4D + 0E32, and then the extra letter-space
inserted before the 0E32: คํ า.  A slightly more complex example is น้ำ
(0E19 + 0E49 + 0E33).  In this case, normal Thai shaping will first
decompose the 0E33 into 0E4D + 0E32 and then swap the 0E4D with the 0E49,
giving 0E19 + 0E4D + 0E49 + 0E32. As before, the extra letter-space is then
inserted before the 0E32: นํ้ า.  However, UAX29 treats each of these words
(คำ and น้ำ) as a single extended grapheme cluster, and so CSS Text would
not properly insert the extra letter-space.
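
The decomposition and reordering described above can be sketched in Python
as follows. This is only an illustration, not what any shaping engine or the
CSS Text draft actually specifies: the function names are hypothetical, the
set of tone marks (0E48 to 0E4B) is my assumption about which marks get
swapped, and treating every non-Mn character as the start of a new spacing
unit is a simplification that ignores cases such as SARA AE.

```python
import unicodedata

SARA_AM = "\u0E33"   # THAI CHARACTER SARA AM
NIKHAHIT = "\u0E4D"  # THAI CHARACTER NIKHAHIT (the ring above)
SARA_AA = "\u0E32"   # THAI CHARACTER SARA AA (the spacing part)
# Assumption: these four tone marks are the ones reordered after NIKHAHIT.
TONE_MARKS = {"\u0E48", "\u0E49", "\u0E4A", "\u0E4B"}


def decompose_sara_am(text):
    """Replace SARA AM with NIKHAHIT + SARA AA, moving NIKHAHIT
    in front of an immediately preceding tone mark."""
    out = []
    for ch in text:
        if ch != SARA_AM:
            out.append(ch)
        elif out and out[-1] in TONE_MARKS:
            tone = out.pop()
            out.extend([NIKHAHIT, tone, SARA_AA])
        else:
            out.extend([NIKHAHIT, SARA_AA])
    return "".join(out)


def spacing_units(text):
    """Split text into units between which extra letter-space would go.
    Simplification: a nonspacing mark (category Mn) attaches to the
    preceding unit; every other character starts a new unit."""
    units = []
    for ch in decompose_sara_am(text):
        if units and unicodedata.category(ch) == "Mn":
            units[-1] += ch
        else:
            units.append(ch)
    return units


def letter_space(text, gap=" "):
    """Join the spacing units with a visible gap, for inspection only."""
    return gap.join(spacing_units(text))
```

With this sketch, `spacing_units("\u0E04\u0E33")` yields two units
(0E04 + 0E4D, then 0E32) and `spacing_units("\u0E19\u0E49\u0E33")` yields
0E19 + 0E4D + 0E49, then 0E32, whereas segmentation by extended grapheme
cluster would yield a single unit in both cases.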

There is a Thai book with a clear example here:

  http://ftp.opentle.org/pub/national-fonts/FONTBOOK.PDF

If you look at the first line of page 4, you will see a word containing
SARA AM letter-spaced with the extra space inserted before the 0E32 part.
CSS Text, as currently specified, would not insert that space.

Lao is almost certain to be the same.  The question then arises of whether
there are any other scripts where CSS Text does the wrong thing. The cases
I wonder about are where you have an extended grapheme cluster that
includes a dependent vowel represented by a spacing combining mark, and

- the mark sits on the baseline like consonants;
- the mark glyph is of a similar height to the consonant glyphs; and
- there is a gap between the mark glyph and the base glyph it is applied to.

In such a case, if you don't add letter spacing between the mark and the
base, it seems to me that the spacing of letter-spaced text will appear
uneven.  The question then is whether the typographical tradition for a
particular script gives more weight to making spacing even or preserving
the perceived unity of the grapheme cluster (if the typographical tradition
uses letter-spacing in the first place).
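
As a rough starting point for finding candidate scripts, one could list the
characters whose general category is Spacing_Mark (Mc) and group them by
script. The Python sketch below does that using the first word of each
character name as a stand-in for the script, which is an assumption of
convenience rather than the real Script property. It cannot check the three
visual criteria above, which depend on the font and the typographic
tradition, and it misses SARA AM itself, whose general category is Lo even
though UAX29 lists it as a SpacingMark for grapheme cluster purposes.

```python
import unicodedata
from collections import defaultdict


def spacing_marks_by_name_prefix():
    """Group all code points with general category Mc by the first
    word of their character name (a crude proxy for the script)."""
    groups = defaultdict(list)
    for cp in range(0x110000):
        ch = chr(cp)
        if unicodedata.category(ch) == "Mc":
            name = unicodedata.name(ch, "")
            if name:
                groups[name.split()[0]].append(cp)
    return dict(groups)


if __name__ == "__main__":
    for prefix, cps in sorted(spacing_marks_by_name_prefix().items()):
        print(prefix, len(cps), " ".join(f"{cp:04X}" for cp in cps[:8]))
```

The output is only a list of characters to examine by eye against the
criteria above; the exact contents depend on the Unicode version bundled
with the Python interpreter.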

There is also an example of the latter in Thai: with SARA AE (U+0E41,
which is graphically identical to a double SARA E), letter spacing does
not traditionally affect the spacing between the two components of the SARA
AE glyph (there's an example on the first page of the above-mentioned book).

James
Received on Sunday, 22 September 2013 05:38:40 UTC
