RE: Mongolian NNBSP [I18N-ACTION-458]

Hi Richard,

Thanks for the reference to WordBreakProperty.txt. I understand it's function much better now.

I am attaching a file from some of professor Quejingzhabu's notes dealing with the U+1885-Baluda and the U+1886-TripleBaluda.
The instruction from the professor is to attach the baluda glyphs to the right side (vertical) of the top of the word.
The image is admittedly to be interpreted, but it is all that I have - I have never seen an original source image.
I don't know of any font that has successfully implemented the baluda as the professor describes it's shaping behavior.
As I see it, we implement either baluda as a diacritic on the right side of the stem just as the U+18A9-Dagalga is attached on the left.
Comments are welcome.

Most Mongolian suffixes have both a masculine form and a feminine form to allow for vowel harmony.
Some suffixes have T/D-initial variants allowing for a match between the stem and suffix.
Some have a vowel-initial form and a Y+vowel initial form, again to allow for  an appropriate match between stem and suffix.
I imagine the import of gender runs all the way from initial rendering up the chain to upper-level layout processing such as spell-checkers.
Badral's work with spell-checking would probably give special insight here.


-----Original Message-----
From: Richard Wordingham []
Sent: Saturday, August 1, 2015 6:15 PM
Subject: Re: Mongolian NNBSP [I18N-ACTION-458]

On Sat, 1 Aug 2015 09:33:36 +0000
Greg Eck <<>> wrote:

> If we went with the ExtendNumLet, is there a normative file that holds
> the data values - or are these changes that need to be implemented by
> the rendering engines themselves - such as MS Universal Shaping Engine
> or Harfbuzz?

There is a normative file, auxiliary/WordBreakProperty.txt in the Unicode Character Database.

The Word_Break property should have nothing to do with rendering, and I would not expect the rendering engines to pay any attention to them.
Is this a problem for gender-sensitive shaping?

The Word_Break property is for splitting text into word, such as advancing through text word-by-word and spell-checking. I get the impression that the Mongolian script is one where there is great need of a spell-checker, because with a normal font one can't check what one wrote.


Received on Saturday, 1 August 2015 13:49:22 UTC