FVS for NA (was: FVS Assignment MisMatch)

On Mon, 3 Aug 2015 09:40:13 +0200
Erdenechimeg Myatav <erdeely@gmail.com> wrote:

> ---------- Forwarded message ----------
> From: Erdenechimeg Myatav <erdeely@gmail.com>
> Date: Sun, Aug 2, 2015 at 9:16 PM
> Subject: Re: New Thread - FVS Assignment MisMatch
> To: Greg Eck <greck@postone.net>
> 
> 
> Hi Greg,
> 
> A few comments on some of the issues you raise and some of the
> discrepancies between the various fonts.
> 
> U+1820 - A
> 
> Re point 3, the glyph at I+FVS1 (and M+FVS2 in NSM font)  is only used
> after NNBSP, i.e. it is always the first letter of a word suffix/case.
> Teachers of Mongolian script always refer to this as initial form,
> which is the basis for the coding at I+FVS1. However, in the Unicode
> standard the decision was made to code it as a middle form (basically
> because it appears in the middle of the word), which is why it
> appears at M+FVS2 in some fonts.
> 
> I think the M+FVS2 combination could be omitted, as is done in BS.
> 
> U+1828 - NA

Please point me to the relevant thread if I have missed an
important discussion about shaping.

My understanding is that the dotting of NA is controlled by its
context.  The rules seem slightly complicated, but they appear
to be as follows:

1) Initial NA has a dot unless it is followed by a consonant.  (The
example I saw was the surprising spelling of the Cantonese name Ng,
with NA and, I think, GA.)

2) Medial NA has a dot before a Mongolian vowel, and not otherwise.

3) Before MVS, NA has a final form, but with a dot.

4) The final form has no dot.

(I don't know the rule before NNBSP.)

I believe the rule is that it is dotted if the next base
character is a Mongolian script vowel, and is undotted otherwise.
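
As a rough illustration of that one-line rule (a sketch only, not what
any font or shaping engine actually does), the check could be written
as below.  The function names are made up for this example.  It
assumes that "Mongolian script vowel" means just the basic letters
U+1820..U+1827, and that FVS1-3, MVS, ZWJ and ZWNJ are skipped when
looking for the next base character; skipping MVS is my reading of
rule 3, not something I have seen specified.  NNBSP is not treated
specially.

```python
# Sketch of the simplified dotting rule for U+1828 MONGOLIAN LETTER NA.
# Assumptions (not taken from any specification):
#   * vowels are only the basic Mongolian letters U+1820..U+1827;
#   * FVS1-3, MVS, ZWJ and ZWNJ are skipped when searching for the
#     next base character.

NA = "\u1828"
MVS = "\u180E"
IGNORED = {"\u180B", "\u180C", "\u180D", MVS, "\u200C", "\u200D"}


def is_mongolian_vowel(ch):
    """True for the basic Mongolian vowels A, E, I, O, U, OE, UE, EE."""
    return "\u1820" <= ch <= "\u1827"


def na_is_dotted(text, index):
    """Return True if the NA at text[index] would be dotted under the rule."""
    if text[index] != NA:
        raise ValueError("character at index is not U+1828 NA")
    for ch in text[index + 1:]:
        if ch in IGNORED:
            continue  # format controls and variation selectors are not bases
        return is_mongolian_vowel(ch)
    return False  # nothing follows: final (or isolated) NA, no dot
```

With these assumptions, NA followed by GA (the "Ng" spelling) comes out
undotted, and NA followed by MVS and A comes out dotted, in line with
rules 1 and 3.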

> Regarding the forms at M and M+FVS1, the dotted form is used before a
> vowel, whereas the non-dotted form is used before a consonant. The
> letter NA is almost always followed by a vowel, so the dotted form is
> by far the more common. So I think it makes most sense for the dotted
> form to be the basic form and the non-dotted form to be the variant,
> as in NSM and BS.

I believe the rules above cover medial NA within words.  What
exceptions, if any, are there?  The Uyghur script apparently normally
dotted NA; does this have the consequence that in old writings one
might occasionally encounter words with medial NA dotted before a
consonant?

Now, if one writes medial NA in isolation - <ZWJ, NA, ZWJ>, by Rule 2
there will be no dot.  To force the dot, we then need a variation
selector. M + FVS1 does the required job.

> The glyph at M+FVS2 was defined as a middle form in Unicode for the
> same reason as given above for U+1820-A - i.e. because it appears in
> the middle of the word. However, this form only occurs immediately
> before MVS, and as such teachers of Mongolian script refer to it as
> the "dotted final form". So in my view it makes more sense for it to
> be coded as F+FVS1 rather than as M+FVS2.

I believe there is text in which <ZWJ, NA, FVS2, ZWJ> or, archaically,
<ZWJ, NA, ZWJ, FVS2> is used to force the display of 'final NA' with a
dot.  Moreover, this combination works with most fonts.  We should
therefore continue to allow the existing definition of M+FVS2,
alongside the new logic, which is approximately:

<ZWNJ, NA, FVS1, ZWJ> (undotted) is the opposite of <ZWNJ, NA, ZWJ>
(dotted).

<ZWJ,  NA, FVS1, ZWJ> (dotted) is the opposite of <ZWJ, NA, ZWJ>
(undotted). 

<ZWJ,  NA, FVS1, ZWNJ> (dotted) is the opposite of <ZWJ, NA, ZWNJ>
(undotted). 

(I am assuming the context has no effect; I presume <NA, ZWJ, vowel>
renders the same as <NA, vowel>.)
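
For anyone who wants to try these sequences against a font, here is a
small sketch that builds them as strings and records the dotting
proposed above.  The table and helper names are hypothetical, and the
"dotted"/"undotted" labels are just the proposal restated, not
observed font behaviour.

```python
# The six sequences from the proposed logic, with the dotting each is
# meant to produce.  Labels restate the proposal; they are not test
# results from any font.

ZWJ, ZWNJ = "\u200D", "\u200C"
NA, FVS1 = "\u1828", "\u180B"

PROPOSED = [
    (ZWNJ + NA + ZWJ, "dotted"),
    (ZWNJ + NA + FVS1 + ZWJ, "undotted"),
    (ZWJ + NA + ZWJ, "undotted"),
    (ZWJ + NA + FVS1 + ZWJ, "dotted"),
    (ZWJ + NA + ZWNJ, "undotted"),
    (ZWJ + NA + FVS1 + ZWNJ, "dotted"),
]


def code_points(s):
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


def without_fvs1(s):
    """Return the same sequence with FVS1 removed."""
    return s.replace(FVS1, "")


if __name__ == "__main__":
    for seq, dotting in PROPOSED:
        print(f"{code_points(seq):32} {dotting}")
```

Here `code_points(ZWJ + NA + FVS1 + ZWJ)` gives
"U+200D U+1828 U+180B U+200D", which is convenient for pasting into a
test page or bug report.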

Spoofing would only be an issue if variation selectors were allowed in
domain names; currently, they're banned.

Richard.

Received on Thursday, 6 August 2015 00:17:33 UTC