- From: John Hudson <tiro@tiro.com>
- Date: Thu, 12 Sep 2013 11:56:19 -0700
- CC: W3C Style <www-style@w3.org>, www International <www-international@w3.org>
Richard Ishida wrote:

> Or is the meaning that if the font has a glyph for the precomposed
> character that is canonically equivalent to the sequence of characters,
> then that glyph should be used (without changing the sequence of
> characters itself). That would seem to make more sense.

Yes, that does make more sense, and should probably be spelled out. It is also what at least some layout engines do regularly. MS Uniscribe will perform a cmap check for a precomposed glyph representing a canonical composition of a cluster, and use that glyph in preference to the decomposed glyph sequence. The reasoning for this is that a) many fonts may support the precomposed character but not have GPOS mark positioning (especially true for European diacritic characters in a huge number of fonts), and b) character-level substitution is faster than glyph-level GSUB composition. I presume the same operations would apply directly in the CSS cluster matching model.

[Because of such layout engine operations, on the font side the OpenType Layout tables are generally built around an assumption of buffered NFC-like input from the cmap, regardless of the original text string. This means, of course, that in some fonts <ccmp> will be used to decompose the initial glyph strings that the layout engine has composed at the cmap level from originally decomposed character strings -- thereby demolishing the presumed time saving of the cmap composition operation. That's a choice the font developer makes based on whether he or she wants to work, during glyph processing, with precomposed diacritic glyphs, decomposed bases plus marks, or -- most awkwardly -- a mix of the two.]

JH
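As a rough illustration of the cmap check described above, here is a minimal Python sketch. It is not Uniscribe's actual code: the function name `map_cluster`, the dict-based `cmap` (code point to glyph name), and the `.notdef` fallback are assumptions made for the example, and NFC from the standard `unicodedata` module stands in for "canonical composition of a cluster".

```python
import unicodedata


def map_cluster(cluster, cmap):
    """Map one character cluster to glyph names (hypothetical helper).

    `cmap` is an assumed stand-in for a font's cmap table: a dict from
    Unicode code point to glyph name. The character string itself is
    never modified; only the glyph choice changes.
    """
    # Canonically compose the cluster (NFC approximates the composition step).
    composed = unicodedata.normalize("NFC", cluster)

    # If the whole cluster composes to one character that the font maps,
    # prefer that precomposed glyph over the decomposed sequence.
    if len(composed) == 1 and ord(composed) in cmap:
        return [cmap[ord(composed)]]

    # Otherwise map the original characters one by one; mark positioning
    # is then left to GPOS (or the font's own GSUB composition).
    return [cmap.get(ord(ch), ".notdef") for ch in cluster]


# Toy cmaps: one font with a precomposed eacute, one without.
font_with_eacute = {0x0065: "e", 0x0301: "acutecomb", 0x00E9: "eacute"}
font_without_eacute = {0x0065: "e", 0x0301: "acutecomb"}

decomposed = "e\u0301"  # U+0065 U+0301
print(map_cluster(decomposed, font_with_eacute))
print(map_cluster(decomposed, font_without_eacute))
```

With the first toy font the decomposed input resolves to the single precomposed glyph; with the second it falls back to base plus mark. A font whose <ccmp> feature decomposes glyphs would then undo the first result during glyph processing, which is the round trip noted in the bracketed paragraph.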
Received on Thursday, 12 September 2013 18:57:00 UTC