Re: Feedback Required on CSS 2.1 & CSS 3.0 Issues in Indian Languages from John Hudson on 2013-07-11 (public-i18n-indic@w3.org from July to September 2013)

From: John Hudson <tiro@tiro.com>
Date: Wed, 10 Jul 2013 17:27:48 -0700
CC: public-i18n-indic@w3.org, Andrew Glass <Andrew.Glass@microsoft.com>
Message-ID: <51DDFC04.3010001@tiro.com>

On 10/07/13 5:03 PM, Andrew Cunningham wrote:

> At the moment I'm slowly gathering information on various aspects of web
> typography and typesetting with Myanmar script, including how it should
> work, and existing workarounds ... I'll be in Myanmar in November and
> hope to progress it much further by then.

> But will compile some information on S'gaw versus Burmese syllables over
> next couple of days.

That would be very helpful. I'm cc'ing Andrew Glass, with whom I worked 
on the Myanmar Text font for Microsoft. This shipped with Windows 8, 
along with Burmese script support, and MS have published their Myanmar 
font and layout spec:
http://www.microsoft.com/typography/OpenTypeDev/myanmar/intro.htm

The 'Shaping Engine' section provides a good overview of the kind of 
analysis performed to identify orthographic syllables (what the text 
refers to as 'syllable clusters').

[Subscribers to this list who are more strictly focused on Indic scripts 
might want to refer to the Devanagari spec or the other individual 
script specs available here:
http://www.microsoft.com/typography/SpecificationsOverview.mspx ]

In the Myanmar case, the layout engine cluster rules allow for massive 
complexity, beyond what actually occurs in normal writing. That is, the 
cluster model describes everything that can happen in the script to form 
a valid cluster, only parts of which occur in natural language clusters. 
How this model of orthographic syllable analysis adapts to something 
like CSS 'first-letter' selectors isn't immediately clear to me, but 
conceptually it seems quite close: if you can accurately identify 
individual orthographic syllables for glyph processing, then you can 
identify them for other kinds of display considerations (separate 
colouring, scaling, annotation, etc.).
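
As a rough illustration of the point above, the following Python sketch 
splits Myanmar text into approximate clusters and picks the first one, 
much as a 'first-letter'-style selector might. The pattern and the names 
`clusters` and `first_cluster` are hypothetical and heavily simplified; 
they are an assumption for illustration, not the rules in the Microsoft 
spec or in any shipping shaping engine.

```python
import re

# Simplified, hypothetical cluster pattern for Myanmar text.
# This is an illustrative assumption, NOT a real shaping engine's rule set.
KINZI = r"(?:\u1004\u103A\u1039)?"        # optional kinzi prefix (nga + asat + virama)
BASE = r"[\u1000-\u102A\u103F]"           # consonants, independent vowels, great sa
STACK = r"(?:\u1039[\u1000-\u1021])*"     # virama + subjoined consonant, repeated
MARKS = r"[\u102B-\u1038\u103A-\u103E]*"  # vowel signs, tone marks, asat, medials
CLUSTER = re.compile(KINZI + BASE + STACK + MARKS)


def clusters(text):
    """Split text into approximate clusters; unmatched characters stand alone."""
    out = []
    pos = 0
    while pos < len(text):
        m = CLUSTER.match(text, pos)
        end = m.end() if m else pos + 1
        out.append(text[pos:end])
        pos = end
    return out


def first_cluster(text):
    """Return the first cluster, the unit a 'first-letter' style might target."""
    parts = clusters(text)
    return parts[0] if parts else ""
```

For the sequence U+1019 U+103C U+1014 U+103A U+1019 U+102C, this sketch 
yields three pieces, with the final consonant plus asat (U+1014 U+103A) 
forming its own piece. Whether such a final should instead belong to the 
preceding unit for selector purposes is the kind of question the sketch 
does not settle.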

There is, by the way, at least one open source OpenType text shaping 
engine, HarfBuzz, which implements Indic shaping.*
http://www.freedesktop.org/wiki/Software/HarfBuzz/

It seems likely that the syllable cluster analysis code in HarfBuzz 
would provide a model for improved CSS selector behaviour for Indic scripts.

JH

*In some respects more successfully than Microsoft's and Adobe's engines, 
in my opinion. I have recently tested a HarfBuzz build that addresses 
the issues raised in this white paper:
http://www.tiro.com/John/Problems_for_Indic_Typography.pdf

Received on Thursday, 11 July 2013 00:28:23 UTC