- From: Richard Ishida <ishida@w3.org>
- Date: Mon, 26 Jan 2015 14:16:05 +0000
- To: indic <public-i18n-indic@w3.org>
5.1 First Letter
http://www.w3.org/TR/2014/WD-ilreq-20141216/#first-letter
"Note how the vowel sign appears to the left of the first character, not
the third. There are three grapheme clusters here. The first includes
the SA+VIRAMA,THA+I and T+II. We see that the styling is done on the
basis of the syllable, not the first character. A syllable includes a
base consonant and any combination of the following characters in the
text stream:"
This text is misleading when paired with figure 4, because it talks
about 3 graphemes and the figure has 3 red circles. The figure also
doesn't show first letter styling, as the text says it does, which is
confusing. There is also an error in the romanization.
How about the following wording, based around the example at
https://www.flickr.com/photos/ishida/16084553630/
I also suggest renaming the section to Initial Letter Styling, to match
the CSS Inline spec.
---------
Indic script behavior in initial letter styling is based on syllables,
rather than individual letter forms.
Figure 4 shows an example of a drop initial in Hindi. In the first word
of the paragraph, स्कूल ('skūl'), the sequence of characters stored in
memory is as follows:
स U+0938 DEVANAGARI LETTER SA
् U+094D DEVANAGARI SIGN VIRAMA
क U+0915 DEVANAGARI LETTER KA
ू U+0942 DEVANAGARI VOWEL SIGN UU
ल U+0932 DEVANAGARI LETTER LA
There are two syllables in this word: SA+VIRAMA+KA+UU and LA. Note,
however, that there are three Unicode grapheme clusters here: SA+VIRAMA,
KA+UU and LA.
Styling is done on the basis of the whole orthographic syllable, not the
first character, nor even the first grapheme.
A syllable includes a base consonant and any combination of the
following characters in the text stream:
- sequences of consonants preceded by virama (i.e. conjuncts).
- vowel signs
- visarga, anusvara or candrabindu.
NOTE: The detailed definition of Indic syllables is given in section 2.
Here are some further examples of initial letter styling based on the
Indic syllable definition.
...
---------
An alternative would be to take the above text and put it at the bottom
of section 3 Text Segmentation, as an illustration of the point made in
the last paragraph ("text segmentation should be done as Indic
syllable"). This is useful because it clearly distinguishes between
grapheme cluster and syllabic units, and could be referred to from other
sections, too, such as the section on vertical text.
And then simply say, at the start of section 5.1, that selection of
initial letters uses the orthographic syllable as the unit, as
illustrated in section 2, and then give some examples. The majority of
section 5.1 could then focus on more specific requirements, such as
which styles of highlighting are common, what the alignment points
are, etc.
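
To make the difference between grapheme clusters and orthographic
syllables in the proposed wording concrete, here is a minimal Python
sketch. It is only an illustration under simplifying assumptions: the
function names (is_consonant, clusters, syllables) are hypothetical,
the consonant ranges cover Devanagari only, clusters() merely
approximates the grapheme clusters as described above (a base character
plus its following combining marks), and syllables() implements only
the simplified rule from the list above, not the full definition in
the document.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA


def is_consonant(ch):
    # Assumption: Devanagari only (KA..HA, plus QA..YYA).
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def is_mark(ch):
    # Vowel signs, virama, nukta, anusvara, visarga and candrabindu
    # all have general category Mn or Mc.
    return unicodedata.category(ch) in ("Mn", "Mc")


def clusters(text):
    # Approximation of the grapheme clusters described above:
    # a base character followed by any combining marks.
    out = []
    for ch in text:
        if out and is_mark(ch):
            out[-1] += ch
        else:
            out.append(ch)
    return out


def syllables(text):
    # Simplified orthographic syllables: like clusters(), but a
    # consonant that directly follows a virama joins the previous
    # unit, so a conjunct stays together with its base.
    out = []
    for ch in text:
        joins_conjunct = (
            bool(out) and is_consonant(ch) and out[-1].endswith(VIRAMA)
        )
        if out and (is_mark(ch) or joins_conjunct):
            out[-1] += ch
        else:
            out.append(ch)
    return out


# SA + VIRAMA + KA + VOWEL SIGN UU + LA, the word from Figure 4
word = "\u0938\u094D\u0915\u0942\u0932"

print(len(clusters(word)))   # 3: SA+VIRAMA, KA+UU, LA
print(len(syllables(word)))  # 2: SA+VIRAMA+KA+UU, LA

# The unit an initial-letter style would select under this rule
initial_unit = syllables(word)[0]
```

With this rule the unit picked for initial letter styling is the whole
SA+VIRAMA+KA+UU sequence, rather than SA alone or SA+VIRAMA.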
Received on Monday, 26 January 2015 14:16:13 UTC