- From: 신정식 <jshin1987@gmail.com>
- Date: Sun, 10 Oct 2010 22:22:23 -0700
- To: Ed <ed.trager@gmail.com>
- Cc: fantasai <fantasai.lists@inkedblade.net>, Cibu Johny <cibu@google.com>, Somnath Chandra <schandra@mit.gov.in>, style <www-style@w3.org>, wwwintl <www-international@w3.org>, intlcore <public-i18n-core@w3.org>, indic <public-i18n-indic@w3.org>, Richard Ishida <ishida@w3.org>, Andrew Cunningham <lang.support@gmail.com>
- Message-ID: <AANLkTi=F7cMSEhZUya=ckvT2bJvRgny7d6pfTNeAPB=F@mail.gmail.com>
On Sat, Oct 9, 2010 at 3:55 AM, Ed <ed.trager@gmail.com> wrote:

> I agree with Andrew: there needs to be wording making it clear that
> the first-letter pseudo-element applies to the first grapheme cluster.

I fully agree.

> This will be true not only for the Indic scripts, but also for
> Indic-derived scripts of Southeast Asia like Thai, Laos, Myanmar,
> Khmer, and Tai Tham, inter alia.

Well, it's not limited to South and SE Asian scripts. Even Latin/Cyrillic/Greek and Korean scripts need the same treatment, either when decomposed forms are used (although W3C CHARMOD assumes NFC, there's nothing to prevent web authors from using decomposed forms) or when the characters/letters in question can only be represented with multiple Unicode characters (usually base + diacritics, but not always, as is the case with archaic Korean). And it also has to be applied to Hebrew, Arabic, Syriac and Thaana.

As for Indic scripts, we need to agree on what makes up a grapheme cluster (when implementing 'first-letter'). Below is what UAX #29 has to say about that:

    Grapheme clusters can be tailored to meet further requirements. Such tailoring is permitted, but the possible rules are outside of the scope of this document. One example of such a tailoring would be for the *aksaras*, or *orthographic syllables*, used in many Indic scripts. Aksaras usually consist of a consonant, sometimes with an inherent vowel and sometimes followed by an explicit, dependent vowel whose rendering may end up on any side of the consonant letter base. Extended grapheme clusters include such simple combinations.

    However, aksaras may also include one or more additional prefixed consonants, typically with a *virama* (halant) character between each consonant in the sequence. Such consonant cluster aksaras are not incorporated into the default rules for extended grapheme clusters, in part because not all such sequences are considered to be single "characters" by users.

    Indic scripts vary considerably in how they handle the rendering of such aksaras—in some cases stacking them up into combined forms known as consonant conjuncts, and in other cases stringing them out horizontally, with visible renditions of the halant on each consonant in the sequence. There is even greater variability in how the typical liquid consonants (or "medials"), *ya, ra, la,* and *wa*, are handled for display in combinations in aksaras. So tailorings for aksaras may need to be script-, language-, font-, or context-specific to be useful.

    *Note: Font-based information may be required to determine the appropriate unit to use for UI purposes, such as identification of boundaries for first-letter paragraph styling. For example, such a unit could be a ligature formed of two grapheme clusters, such as لا (Arabic *

    The Unicode definitions of grapheme clusters are defaults: not meant to exclude the use of more sophisticated definitions of tailored grapheme clusters where appropriate. Such definitions may more precisely match the user expectations within individual languages for given processes. For example, “ch” may be considered a grapheme cluster in Slovak, for processes such as collation. The default definitions are, however, designed to provide a much more accurate match to overall user expectations for what the user perceives of as *characters* than is provided by individual Unicode code points.

Jungshik

> It will still be a very long time before browsers actually provide
> adequate support: but in any case it will be very nice if the specs
> have adequate wording so implementors will have a better clue about
> what might actually be required to support complex scripts.
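
To make the decomposed-form and virama points concrete, here is a minimal Python sketch. It is only a rough approximation, not a UAX #29 implementation: it groups a base character with whatever combining marks (general category M*) follow it, and the helper name first_cluster is made up for this illustration. Real segmentation also has to handle Hangul jamo sequences, ZWJ, CR LF and more.

```python
import unicodedata


def first_cluster(text: str) -> str:
    """Return the first code point plus any combining marks that follow it.

    Hypothetical helper for illustration; it only approximates the default
    grapheme cluster rules by looking at general categories Mn/Mc/Me.
    """
    if not text:
        return ""
    end = 1
    while end < len(text) and unicodedata.category(text[end]).startswith("M"):
        end += 1
    return text[:end]


# Latin in decomposed form: "É" becomes E + U+0301 COMBINING ACUTE ACCENT.
decomposed = unicodedata.normalize("NFD", "\u00C9cole")
latin = first_cluster(decomposed)

# Devanagari KA + VOWEL SIGN I: the dependent vowel stays with its consonant.
simple = first_cluster("\u0915\u093F\u0924\u093E\u092C")

# KA + VIRAMA + SSA: the conjunct is split after the virama, so the whole
# aksara is not returned without some tailoring.
conjunct = first_cluster("\u0915\u094D\u0937")

print([hex(ord(c)) for c in latin])
print([hex(ord(c)) for c in simple])
print([hex(ord(c)) for c in conjunct])
```

Taking only the first code point would strip the accent from the decomposed "É" and the vowel sign from the Devanagari syllable. The last case shows why the conjunct question still needs an agreed tailoring.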
>
> On Sat, Oct 9, 2010 at 3:47 AM, Andrew Cunningham <andrewc@vicnet.net.au> wrote:
> >
> > On Sat, October 9, 2010 13:11, fantasai wrote:
> > > On 10/08/2010 10:26 AM, Cibu Johny (സിബു) wrote:
> > > >
> > > > No part of the document gives me enough information to implement
> > > any of
> > > - first-letter
> >
> > this may be a browser bug issue as well.
> >
> > CSS3 Selectors module has a note that ::first-letter pseudo-element should
> > at least apply to the default grapheme cluster.
> >
> > Maybe rather than as a note , this might be included using stronger wording?
> >
> > --
> > Andrew Cunningham
> > Research and Development Coordinator
> > Vicnet
> > State Library of Victoria
> > Australia
> >
> > andrewc@vicnet.net.au
Received on Monday, 11 October 2010 06:48:09 UTC