Re: Reference Scheme for Mongolian Rendering

On Fri, 14 Aug 2015 11:54:09 +0100
Andrew West wrote:

> On 14 August 2015 at 11:10, Greg Eck wrote:
> >
> > 2.) I am not sure I would say "normally".

What I wrote, "This is normally implemented via the cmap table in
OpenType fonts.", is literally true.  Perhaps someone can find a less
misunderstandable way of saying one normally uses the 'cmap' table in
OpenType fonts. One might do it via a 'cmap' in an AAT font, or
conceivably via some Type 1 font mechanism.

> > Baiti and probably most
> > of the other fonts use direct OT substitutions.

The 'cmap' table is used to generate the input for the substitutions.

> > I have experimented
> > with the Type 14 Encoding Table successfully, but don't know that
> > it is needed in every case.

I didn't say that it was needed.  Paragraph 2 says, "The result of
using a font should be the same as if the font followed the scheme."
That clearly needs greater emphasis.

When testing the HarfBuzz implementation of Mongolian shaping, Behdad
Esfahbod was surprised that Format 14 ("Type 14") look-ups weren't
being used.

> Personally, I do not think it makes much sense to use the Format 14
> subtable for Mongolian FVS.  As the final glyph is determined by
> context, you would have to substitute a special temporary glyph (one
> unique glyph for each base character + FVS combination) in the Format
> 14 table, and still apply all the contextual substitution rules in the
> GSUB table to transform the temporary glyph to the correct glyph.
> Adding the Format 14 table has no advantages that I can see, and
> seriously complicates the process (having implemented Format 14 for my
> Phags-pa fonts I can say that I really loathe Format 14, and think it
> was an awful mistake).

I think it simplifies the expression of the rules.  Having worked out
the sequence of substitutions, one can then merge glyph
identities. How easy that is depends on one's font compiler.  Mine
allows synonyms for glyphs.

For example, one could express the transforms applied to U+1820 A as
follows.  First, note that prior to the stage at which obligatory
ligatures are considered, U+1820 has 8 glyphs, which we may name

gA0s gA0i gA0m gA0f   gA1s gA1i gA1m gA1f

(s = solitary / i*s*olated, i = *i*nitial, m = *m*edial, f = *f*inal) 

I believe that we should not invalidate displays containing <ZWJ, A,
FVS2, ZWJ>, so I would support that combination and set up gA2m as a
synonym of gA1i.

For Stage B:

For the cmap, I would have the mapping:

    1820 180B > gA1m

    1820 180C > gA2m # As Unicode 8.0.0.  Remember that gA2m = gA1i

    1820 180D > gA0s # Invalid variation selector. Redundant

    1820      > gA0s # Show the expected shape to character pickers
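
The Stage B mapping above can be modelled as a toy look-up in Python.
This is only a sketch of the behaviour, not of the binary 'cmap'
subtable layout; the names CMAP, FVS and map_chars are invented for
illustration, and ".notdef" is assumed for unmapped characters.

```python
# Toy model of the Stage B cmap mapping for U+1820 A.
# Keys are (base, variation selector or None); values are glyph names.
# All names here are illustrative assumptions, not a real font API.
FVS = {0x180B, 0x180C, 0x180D}  # FVS1, FVS2, FVS3

CMAP = {
    (0x1820, 0x180B): "gA1m",
    (0x1820, 0x180C): "gA2m",  # gA2m is a synonym of gA1i
    (0x1820, 0x180D): "gA0s",  # invalid variation selector
    (0x1820, None): "gA0s",    # shape shown to character pickers
}


def map_chars(codepoints):
    """Map code points to glyph names, consuming a following FVS if mapped."""
    glyphs = []
    i = 0
    while i < len(codepoints):
        cp = codepoints[i]
        nxt = codepoints[i + 1] if i + 1 < len(codepoints) else None
        if nxt in FVS and (cp, nxt) in CMAP:
            glyphs.append(CMAP[(cp, nxt)])
            i += 2  # base and selector map to a single glyph
        else:
            glyphs.append(CMAP.get((cp, None), ".notdef"))
            i += 1
    return glyphs
```

For instance, map_chars([0x1820, 0x180C, 0x1820]) pairs the first A
with FVS2 and maps the second A on its own.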

For Stage C:

The isol feature would implement:

    gA1m > gA1s

    gA2m > gA0s # Invalid variation selector

The init feature would implement:

    gA0s > gA0i

    gA1m > gA1i # For Unicode 8.0.0, this should be to gA0i

    gA2m > gA0i # Invalid VS

The medi feature would implement:

    gA0s > gA0m # The price of a nice pick-list.

The fini feature would implement:

    gA0s > gA0f

    gA1m > gA1f

    gA2m > gA0f # Invalid VS
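
As a rough check of the Stage C rules, here is a small Python sketch
that applies one positional feature per glyph.  It assumes, purely for
illustration, that the input list is a single run of joining letters,
so that position in the list decides which feature applies; a real
shaping engine decides this from joining behaviour.  The table and
function names are hypothetical.

```python
# Stage C substitutions as given in the rules above; glyphs with no
# entry in a table are left unchanged by that feature.
ISOL = {"gA1m": "gA1s", "gA2m": "gA0s"}
INIT = {"gA0s": "gA0i", "gA1m": "gA1i", "gA2m": "gA0i"}
MEDI = {"gA0s": "gA0m"}
FINI = {"gA0s": "gA0f", "gA1m": "gA1f", "gA2m": "gA0f"}


def apply_positional(word):
    """Apply isol/init/medi/fini by position within one joined run (assumed)."""
    n = len(word)
    out = []
    for i, glyph in enumerate(word):
        if n == 1:
            table = ISOL
        elif i == 0:
            table = INIT
        elif i == n - 1:
            table = FINI
        else:
            table = MEDI
        out.append(table.get(glyph, glyph))
    return out
```

Note that in medial position gA1m and gA2m pass through untouched,
which is why the cmap can map straight to them.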

For Stage D:

Some invoked feature (perhaps rclt) would implement:

    gA0s > gA1f / gMVS _

I can use gA1f in the Stage D rule because it will not be further
modified in Stage D.

I don't think any of these glyphs can be called temporary.

A test input to ponder is <MVS, A, FVS1>.  How should that be
rendered?  Unicode 8.0.0 calls for <gMVS, gA1s>.  
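
To see what the rules given earlier do with that input, here is a
compact Python trace of Stages B to D.  It is a sketch under
assumptions: the A after MVS is treated as isolated in Stage C (which
is what the Stage D rule acting on gA0s implies), only the entries
needed for this input are included, and the function names are made
up for illustration.

```python
MVS, A, FVS1 = 0x180E, 0x1820, 0x180B

# Stage B: only the cmap entries needed for this trace.
CMAP = {(A, None): "gA0s", (A, FVS1): "gA1m", (MVS, None): "gMVS"}
# Stage C: isol rule relevant here (assumes A after MVS is isolated).
ISOL = {"gA1m": "gA1s"}


def stage_b(cps):
    """Map code points to glyphs, consuming a mapped variation selector."""
    glyphs, i = [], 0
    while i < len(cps):
        nxt = cps[i + 1] if i + 1 < len(cps) else None
        if nxt is not None and (cps[i], nxt) in CMAP:
            glyphs.append(CMAP[(cps[i], nxt)])
            i += 2
        else:
            glyphs.append(CMAP.get((cps[i], None), ".notdef"))
            i += 1
    return glyphs


def stage_c(glyphs):
    """Apply the isol substitutions; other glyphs pass through."""
    return [ISOL.get(g, g) for g in glyphs]


def stage_d(glyphs):
    """Apply gA0s > gA1f / gMVS _ (contextual, after MVS only)."""
    out = list(glyphs)
    for i in range(1, len(out)):
        if out[i] == "gA0s" and out[i - 1] == "gMVS":
            out[i] = "gA1f"
    return out


def render(cps):
    return stage_d(stage_c(stage_b(cps)))
```

Under these assumptions render([MVS, A, FVS1]) yields gMVS followed by
gA1s, because the Stage D rule only matches gA0s, while
render([MVS, A]) yields gMVS followed by gA1f.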

U+1829 ANG would not be mentioned in the Format 14 ("Type 14") table;
it does not take variation selectors.

> I seem to recall that Peter Constable once (on the unicore list?)
> stated that Format 14 was not intended for use with Mongolian (and
> certainly Microsoft has not implemented Format 14 for its Mongolian
> font), but I cannot find this email any longer.

Nor, alas, can I.


Received on Friday, 14 August 2015 19:21:55 UTC