Re: css3-text- Indic Inputs

On 11 October 2010 16:22, Jungshik SHIN (신정식) <> wrote:
> On Sat, Oct 9, 2010 at 3:55 AM, Ed <> wrote:

>>  This will be true not only for the Indic scripts, but also for
>> Indic-derived scripts of Southeast Asia like Thai, Laos, Myanmar,
>> Khmer, and Tai Tham, inter alia.
> Well, it's not limited to South and SE Asian scripts. Even
> Latin/Cyrillic/Greek and Korean scripts need the same treatment either when
> decomposed forms are used (although W3C CHARMOD assumes NFC, there's nothing
> to prevent web authors from using decomposed forms) or characters/letters in
> question can only be represented with multiple unicode characters (usually
> base + diacritics, but not always as is the case of archaic Korean).  And,
> it also has to be applied to Hebrew, Arabic, Syriac and Thaana.
> As for Indic scripts, we need to agree on what makes up a grapheme cluster
> (when implementing 'first-letter'). Below is what UAX #29 has to say about
> that:

There are many Latin script languages that require using one or more
combining diacritics even in NFC.
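
A quick way to see this is to normalize a sample to NFC and count the
code points that remain. The sketch below uses Python's standard
unicodedata module. The sample letters are my own choice of
illustration, not taken from the thread: e with acute accent (which has
a precomposed form) and open e with diaeresis (which, as far as I know,
has none and so keeps its combining mark).

```python
import unicodedata

# "e" + COMBINING ACUTE ACCENT: NFC composes this to U+00E9.
e_acute = "e\u0301"

# LATIN SMALL LETTER OPEN E + COMBINING DIAERESIS: Unicode has no
# precomposed character for this pair, so the mark survives NFC.
open_e_diaeresis = "\u025b\u0308"


def nfc_length(text):
    """Return the number of code points in the NFC form of text."""
    return len(unicodedata.normalize("NFC", text))


print(nfc_length(e_acute))           # one code point after composition
print(nfc_length(open_e_diaeresis))  # still base + combining mark
```

So a 'first-letter' rule that takes only the first code point would
split the second letter from its diacritic even when the text is in NFC.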

One problem with Latin script languages will be those that have
digraphs as part of the alphabet. If typesetting Dinka or Nuer in
print, and wanting a different style or presentation for the first
letter, I'd be thinking in terms of letters of the alphabet (which
includes digraphs).

Not sure how that would translate to a CSS rule. For instance, the word
for Dinka in the Dinka language starts with the letter "Th", not the
letter "T", which is a different letter.
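
To make the distinction concrete, here is a rough sketch of what
"first letter of the alphabet" selection might look like outside CSS.
It is not a proposal for how CSS should do it. The digraph list and the
sample word are assumptions for illustration only, and the handling of
combining marks is a simplification of the grapheme cluster rules in
UAX #29.

```python
import unicodedata

# Hypothetical digraph list for illustration; real use would need
# per-language data keyed on the content language.
DINKA_DIGRAPHS = ("dh", "nh", "ny", "th")


def first_letter(word, digraphs=DINKA_DIGRAPHS):
    """Return the first alphabet letter of word, treating digraphs as one letter."""
    word = unicodedata.normalize("NFC", word)
    if not word:
        return ""
    end = 1
    for digraph in digraphs:
        # Case-insensitive match so that "Th" and "th" both count.
        if word[:len(digraph)].lower() == digraph:
            end = len(digraph)
            break
    # Keep any combining marks attached to the selected letter.
    while end < len(word) and unicodedata.combining(word[end]):
        end += 1
    return word[:end]


print(first_letter("Thuɔŋjäŋ"))  # sample word, assumed spelling
print(first_letter("Tim"))
```

The point is only that the answer depends on the language: the same
leading "T" is a whole letter in one word and half of a letter in
another.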

Whether such distinctions can or should be included ... not sure ...


Andrew Cunningham
Senior Project Manager, Research and Development
State Library of Victoria

Received on Monday, 11 October 2010 05:51:47 UTC