- From: John Hudson <tiro@tiro.com>
- Date: Mon, 21 Jul 2014 14:27:40 -0700
- To: Hariraam <hariraama@gmail.com>
- CC: indic <public-i18n-indic@w3.org>
Hariraam wrote:

> "if instead of using U+094E for left-side kana part of ikar,
> using a private glyph (not encoded) for it and using top-cap of ikar
> after the base consonants,

Let's think this through. It might be a useful exercise.

You have a syllable that ends with a short i vowel. That means you want to have an ikar character (U+093F) encoded in the text string after the consonant(s) [and before optional anusvara]. Any other location for this character or any other character would not be a correct encoding of the vowel in this syllable.

That U+093F character has to map to a glyph in the font cmap table. If it doesn't then software will display a .notdef glyph, which is the standard indication that a character lacks a cmap entry.

The glyph mapped to U+093F in the cmap table, no matter what that glyph is (full ikar, part of an ikar, just the stem, just the hook, whatever) is going to be reordered leftward to the beginning of the syllable [or until it encounters an explicit halant, but let's ignore that case for simplicity's sake] after cluster segmentation and application of basic shaping features. The reordering has to happen at this stage because until the cluster has been shaped the final location of the ikar cannot be determined, but the reordering of the ikar glyph is not triggered by a feature in the font but by association with the character code U+093F, which is a left vowel sign.

It seems to me that here is your fundamental problem: that any instance of U+093F will result in a reordered glyph whenever a shaping engine is active, and there's no reliable means to provide for clean encoding of text that will work both when a shaping engine is active and when it isn't. The insertion of a left-side stem glyph is not the central issue (although I wonder how a 'private glyph (not encoded)' would be inserted at the desired location). The normative reordering of the ikar as encoded is.
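
The point that reordering follows the character code rather than the glyph can be sketched in a few lines of Python. This is a deliberately simplified model and not how any real shaping engine is implemented: `shape_syllable`, the toy `cmap` dictionary, and the glyph names are all hypothetical, the input is assumed to be a single already-segmented syllable, and the explicit-halant case is ignored, as above.

```python
IKAR = 0x093F  # DEVANAGARI VOWEL SIGN I, a left-side vowel sign


def shape_syllable(codepoints, cmap):
    """Map one pre-segmented syllable to glyph names, then move the ikar glyph first.

    Toy model only: `cmap` is a plain dict standing in for the font cmap table.
    """
    # Each glyph remembers the character it came from; unmapped characters get .notdef.
    glyphs = [(cp, cmap.get(cp, ".notdef")) for cp in codepoints]
    # The reordering decision looks at the source character, never at the glyph name.
    left_side = [g for g in glyphs if g[0] == IKAR]
    rest = [g for g in glyphs if g[0] != IKAR]
    return [name for _, name in left_side + rest]


# Hypothetical font that maps U+093F to a partial glyph (just the hook).
toy_cmap = {0x0938: "sa", 0x094D: "halant", 0x0925: "tha", 0x093F: "ikar_hook_only"}

# sa + halant + tha + ikar, in logical (encoded) order
print(shape_syllable([0x0938, 0x094D, 0x0925, 0x093F], toy_cmap))
```

Whatever glyph name the toy cmap assigns to U+093F, that glyph ends up at the front of the output, which is the behaviour described above.
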

I can imagine several very complex and convoluted workarounds using OpenType GSUB lookups to handle reordering at the glyph level (essentially attempting to express the cluster model at the glyph lookup context string level), but the OpenType Layout support necessary to apply such workarounds is now more than likely to be accompanied by an Indic shaping engine. Where a shaping engine is active, it is very difficult to bypass, because the first activity of such an engine is script itemisation. The only way to have Devanagari text avoid being passed to a Devanagari-capable shaping engine is to not encode it as Devanagari, which defeats the whole purpose and benefit of Unicode.

It seems to me that you are trying to come up with a solution to display Devanagari text in a (thankfully) diminishing technological circumstance.

JH

Received on Monday, 21 July 2014 21:28:17 UTC