Re: Role of the font in deciding the cluster boundaries

One of my concerns is whether some features should be applied to grapheme
clusters or to syllables.

If you start looking at SE Asian Indic scripts, things become complicated.
The preference in Burmese seems to be for syllable boundaries, where a
syllable can be one or more grapheme clusters.

It gets more complicated when kinzi is involved, since the syllable boundary
would then fall within the grapheme cluster.
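
To make the kinzi case concrete, here is a minimal Python sketch. The helper
name, the rough consonant range and the sample word (the Burmese word for
"English") are my own choices for illustration, not anything from a spec. It
only locates the kinzi sequence (NGA U+1004, ASAT U+103A, VIRAMA U+1039)
followed by a consonant, and reports where a syllable break would fall
relative to the stack the kinzi is drawn on.

```python
import re

# Kinzi is encoded as NGA + ASAT + VIRAMA and is drawn above the next consonant.
KINZI = "\u1004\u103A\u1039"

# Myanmar consonants U+1000..U+1021: a rough range, assumed for illustration only.
KINZI_STACK = re.compile(KINZI + "[\u1000-\u1021]")

def kinzi_syllable_breaks(text):
    """Return (stack_start, stack_end, syllable_break) offsets per kinzi stack.

    The syllable break is placed right after the kinzi, i.e. before the base
    consonant, which is inside the visually stacked unit.
    """
    return [(m.start(), m.end(), m.start() + len(KINZI))
            for m in KINZI_STACK.finditer(text)]

# Sample word chosen for illustration: "English" in Burmese.
word = "\u1021\u1004\u103A\u1039\u1002\u101C\u102D\u1015\u103A"
for start, end, brk in kinzi_syllable_breaks(word):
    print(f"stack {start}..{end}, syllable break at {brk}")
```

The reported break offset lies strictly between the start and end of the
stack, which is the situation I mean: the syllable ends with the kinzi, but
the kinzi is displayed as part of the following consonant's cluster.
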
On 13/03/2014 5:46 PM, "Richard Ishida" <ishida@w3.org> wrote:

> On 13/03/2014 06:14, Cibu Johny (സിബു) wrote:
>
>> One quick feedback on the regular expression approach to decide a
>> grapheme cluster. In many Indic scripts, whether to display a sequence
>> containing <..consonant, virama, consonant..> as a single cluster or to
>> split them with explicit visible virama is font dependent.
>>
>> For example, in Malayalam, the sequence S-KHA (സ്ഖ) would be displayed
>> with an explicit virama in a reformed script font and as a single unit
>> in a traditional script font.
>>
>
> I think the important question is whether the whole conjunct should
> continue to be treated as a unit for first-letter styling, line breaking,
> vertical arrangements, etc, whether or not the conjunct is expressed using
> a visible virama (actually, in fact, whether the orthographic syllable
> continues to be the unit, since it may also include vowel signs and such).
>
> Are there any cases where it would not?
>
> RI
>
>

Received on Thursday, 13 March 2014 08:48:53 UTC