[iip] Letter-spacing splits conjuncts (#117)

r12a has just created a new issue for https://github.com/w3c/iip:

== Letter-spacing splits conjuncts ==
<i class="meta">This issue is applicable to most languages that form conjuncts from consonant clusters using an invisible virama.</i>

A consonant cluster that uses a conjunct (rather than visible virama) should not be split when letter-spacing is applied.

Relying on grapheme clusters as the main segmentation approach fails here because conjuncts are composed of multiple grapheme clusters, and should be kept together as a unit.

For these situations it is necessary to tailor the segmentation algorithm, so that it recognises the whole consonant cluster plus any attached vowel-signs or combining characters as a single unit.

For examples, see [Typographic character units in complex scripts](https://www.w3.org/International/questions/qa-indic-graphemes).
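
As a rough illustration of the tailoring described above, the following Python sketch groups a consonant cluster, plus any attached vowel-signs or combining marks, into a single unit. It is a simplified heuristic, not the algorithm of any spec or browser: the function name `segment_units` is hypothetical, and the sketch assumes that every virama followed directly by a letter forms a conjunct, whereas in practice that depends on the font. It also assumes that a virama followed by ZWNJ is meant to stay visible, so the cluster may be split there.

```python
import unicodedata

ZWNJ = "\u200c"  # zero width non-joiner: assumed to request a visible virama
ZWJ = "\u200d"   # zero width joiner: assumed to keep the join request alive


def is_virama(ch: str) -> bool:
    # Canonical combining class 9 is the Unicode "Virama" class.
    return unicodedata.combining(ch) == 9


def segment_units(text: str) -> list[str]:
    """Split text into units that letter-spacing should not break (heuristic)."""
    units: list[str] = []
    joins_next = False  # True while a virama is waiting for the next letter
    for ch in text:
        category = unicodedata.category(ch)
        is_mark = category.startswith("M") or ch in (ZWJ, ZWNJ)
        is_letter = category.startswith("L")
        if units and (is_mark or (joins_next and is_letter)):
            units[-1] += ch   # extend the current unit
        else:
            units.append(ch)  # start a new unit
        if is_virama(ch):
            joins_next = True
        elif ch != ZWJ:
            joins_next = False
    return units


# Devanagari: KA + VIRAMA + SSA, TA + VIRAMA + RA + VOWEL SIGN I, YA
word = "\u0915\u094d\u0937\u0924\u094d\u0930\u093f\u092f"
print(segment_units(word))
```

Under these assumptions the eight code points above are grouped into three units (the two conjuncts, the second with its vowel-sign, and the final consonant), and spacing would be applied only between those units.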



<b class="subhead">Specs:</b>

[css-text-3](https://drafts.csswg.org/css-text-3/#typographic-character-unit) CSS uses the concept of <a href="https://drafts.csswg.org/css-text-3/#typographic-character-unit">'typographic character unit'</a>, rather than grapheme cluster, in its specs with the explanation that the cases just described go beyond the scope of the grapheme cluster concept and that implementations should provide appropriate support. The spec doesn't provide details about the support needed for each language.

The Unicode Consortium has made some attempts to address this issue, but they have so far not yielded results. CLDR now flags up a few scripts for which conjuncts are common.


<b class="subhead">Tests & results:</b>
<i>Interactive test</i>, [When letter-spacing is applied to Devanagari the browser will not split conjuncts](https://github.com/w3c/line_paragraph_tests/issues/73)<br>
<i>Interactive test</i>, [When letter-spacing is applied to Bengali the browser will not split conjuncts)<br>
<span class="fail">Gecko</span>, <span class="pass">Blink</span>, and <span class="pass">WebKit</span> all produce the same result. Most half-form conjuncts (which are the large majority of all conjuncts) have space inserted between the glyphs that make up the conjunct (i.e. they are not split into consonants with visible viramas). Vertically-combined glyphs tend not to be split.




<b class="subhead">Priority:</b>
Keeping conjuncts together is a pretty basic requirement. It is not possible to work around this problem.

That said, letter-spacing is not relied on for essential content authoring, so the priority was set to advanced.


Please view or discuss this issue at https://github.com/w3c/iip/issues/117 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Wednesday, 31 March 2021 12:50:06 UTC