RE: New Thread - FVS Assignment MisMatch

From: Greg Eck <greck@postone.net>
Date: Thu, 6 Aug 2015 16:12:55 +0000
To: Erdenechimeg Myatav <erdeely@gmail.com>
CC: "public-i18n-mongolian@w3.org" <public-i18n-mongolian@w3.org>
Message-ID: <BN3PR10MB0321B05932DFCFE5688A93E2AF740@BN3PR10MB0321.namprd10.prod.outlook.com>

Hi Erdenechimeg,

Sorry for taking so long to make my comment.
This is an excellent write-up and follows just what we are saying in the MVS/NNBSP model.
I have taken the liberty of adding in the comments you made later on U+1835/U+1836.
I have made some comments below in yellow.
Again, thank you for the detailed write-up that you gave us.
It will go a long way in helping the UTC to understand the situation more clearly.

Greg



From: Erdenechimeg Myatav [mailto:erdeely@gmail.com]
Sent: Monday, August 3, 2015 3:17 AM
To: Greg Eck <greck@postone.net>
Subject: Re: New Thread - FVS Assignment MisMatch

Hi Greg,

A few comments on some of the issues you raise and some of the discrepancies between the various fonts.

U+1820 - A

Re point 3, the glyph at I+FVS1 (and M+FVS2 in the NSM font) is only used after NNBSP, i.e. it is always the first letter of a word suffix/case. Teachers of Mongolian script always refer to this as the initial form, which is the basis for the coding at I+FVS1. However, in the Unicode standard the decision was made to code it as a middle form (basically because it appears in the middle of the word), which is why it appears at M+FVS2 in some fonts.

I think the M+FVS2 combination could be omitted, as is done in BS.

We can pass this situation on to the UTC, make a recommendation for change as fonts are split in implementation, and see what they say. Basis for the change is that 5 out of 6 fonts implement the initial A of the ACA suffix as an initial+FVS1. If they say no, then we accept it, make a notation in the StandardizedVariants.xxx document and go on. Suggested change is to delete the current <U+1820-Medial,U+180C> and add the <U+1820-Initial,U+180B> specification.

U+1828 - NA

The glyph at M+FVS2 was defined as a middle form in Unicode for the same reason as given above for U+1820-A - i.e. because it appears in the middle of the word. However, this form only occurs immediately before MVS, and as such teachers of Mongolian script refer to it as the "dotted final form". So in my view it makes more sense for it to be coded as F+FVS1 rather than as M+FVS2.
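
The teaching convention described for U+1820 and U+1828 can be sketched in a few lines of Python. This is only an illustration, not how any font or shaping engine actually works: the helper names are hypothetical, the letter range is a simplification, and the spelling of the ACA suffix as A + CHA + A is an assumption. The idea is that NNBSP and MVS break the joining context, so a letter right after NNBSP counts as initial and a letter right before MVS counts as final.

```python
NNBSP = "\u202F"                      # NARROW NO-BREAK SPACE
MVS = "\u180E"                        # MONGOLIAN VOWEL SEPARATOR
FVS = {"\u180B", "\u180C", "\u180D"}  # FVS1, FVS2, FVS3


def is_letter(ch):
    # Simplification: only Mongolian-block letters U+1820..U+1877 are considered.
    return "\u1820" <= ch <= "\u1877"


def neighbor_is_letter(text, i, step):
    """True if the nearest non-FVS character in direction `step` is a letter."""
    j = i + step
    while 0 <= j < len(text) and text[j] in FVS:
        j += step
    return 0 <= j < len(text) and is_letter(text[j])


def teaching_position(text, i):
    """Positional form of text[i] as a teacher would name it (hypothetical helper)."""
    joined_before = neighbor_is_letter(text, i, -1)
    joined_after = neighbor_is_letter(text, i, +1)
    if joined_before and joined_after:
        return "medial"
    if joined_before:
        return "final"
    if joined_after:
        return "initial"
    return "isolate"


suffix = NNBSP + "\u1820\u1834\u1820"          # assumed spelling of the ACA suffix
word = "\u1835\u1820\u1836" + MVS + "\u1820"   # one of the YA examples in this thread

print(teaching_position(suffix, 1))  # A right after NNBSP: initial
print(teaching_position(word, 2))    # YA right before MVS: final
```

Under this view the same glyph gets a different positional label than under the "middle of the word" reading, which is the whole source of the I+FVS1 versus M+FVS2 split.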

A further problem is the one Siqin is bringing in: the need for an undotted 1828 at the medial position which can over-ride the default. This of course suggests a new FVS assignment at 1828-Medial. This is the contention for space at the 1828-Medial position as mentioned in my earlier article.

Current specification                        Suggested specification
<U+1828-Medial,NoFVS>  (medial undotted)     Unchanged
<U+1828-Medial,U+180B> (medial dotted)       Unchanged
<U+1828-Medial,U+180C> (final dotted)        <U+1828-Medial,U+180C> (medial undotted)
<U+1828-Medial,U+180D> (Todo medial dotted)  Unchanged
<U+1828-Final,NoFVS>   (final undotted)      Unchanged
(none)                                       <U+1828-Final,U+180B> (final dotted)

We can pass this situation on to the UTC and make a recommendation for the change. The suggested change is listed above. The other option would be to add FVS4, which is not so desirable.
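
To make the size of the proposed U+1828 change easy to see, here is a small Python sketch that holds the current and suggested assignments as dictionaries and computes the difference. The data is transcribed from the list above; the dictionary layout and the `diff_specs` and `label` helpers are my own illustration, not the format of the StandardizedVariants file.

```python
FVS1, FVS2, FVS3 = 0x180B, 0x180C, 0x180D

# (position, selector) -> glyph description; None means no FVS.
current = {
    ("medial", None): "medial undotted",
    ("medial", FVS1): "medial dotted",
    ("medial", FVS2): "final dotted",
    ("medial", FVS3): "Todo medial dotted",
    ("final", None): "final undotted",
}

suggested = dict(current)
suggested[("medial", FVS2)] = "medial undotted"  # over-ride of the default
suggested[("final", FVS1)] = "final dotted"      # moved out of the medial slot


def diff_specs(old, new):
    """Return (changed, added, removed) between two assignment tables."""
    changed = {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}
    added = {k: new[k] for k in new if k not in old}
    removed = {k: old[k] for k in old if k not in new}
    return changed, added, removed


def label(key):
    """Format a key in the notation used in this thread (hypothetical helper)."""
    position, fvs = key
    selector = "NoFVS" if fvs is None else f"U+{fvs:04X}"
    return f"<U+1828-{position.capitalize()},{selector}>"


changed, added, removed = diff_specs(current, suggested)
for key, (before, after) in changed.items():
    print(f"{label(key)}: {before} -> {after}")
for key, glyph in added.items():
    print(f"{label(key)}: new, {glyph}")

# Each position has four slots (NoFVS, FVS1, FVS2, FVS3), so no FVS4 is needed.
medial_slots = sum(1 for position, _ in suggested if position == "medial")
print("medial slots used:", medial_slots)
```

The result is one reassigned medial slot and one new final slot, with nothing removed outright.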

Regarding the forms at M and M+FVS1, the dotted form is used before a vowel, whereas the non-dotted form is used before a consonant. The letter NA is almost always followed by a vowel, so the dotted form is by far the more common. So I think it makes most sense for the dotted form to be the basic form and the non-dotted form to be the variant, as in NSM and BS.

I whole-heartedly agree with the dotted form being the reasonable default. However, given the history of font development, I think we have no choice but to leave this specification as it is. I know of other fonts not represented here that also use the un-dotted form as the default. I realize that this makes it difficult for NSM and BS to consider the change because of the work involved in switching the default, but it is probably one of the hard calls that we have to make.

U+182C - QA

As far as I understand, the dotted forms are archaic forms and are not used in modern Mongolian. So it certainly makes sense for the non-dotted forms to be the default forms.

Yes.

Regarding the feminine forms at I+FVS1 and M+FVS1 in NSM and BS, the feminine form always forms a ligature with the following vowel. But there is (at least) one word in which the feminine initial QA is followed by a consonant - in fact by another QA in the example I can think of: U+182C U+182C U+1822 U+1837 (XXIR) - and in this word the glyph for the initial QA is the one shown at I+FVS1 in NSM and BS.
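
For anyone who wants to test the QA + QA word in a font, the sequence above can be turned into a string with a few lines of Python. This is just a convenience sketch; `parse_uplus` is a hypothetical helper name, and the character names come from Python's standard `unicodedata` module.

```python
import unicodedata


def parse_uplus(notation):
    """Convert 'U+182C U+1822 ...' notation into a string (hypothetical helper)."""
    chars = []
    for token in notation.split():
        if not token.upper().startswith("U+"):
            raise ValueError(f"not a U+XXXX token: {token!r}")
        chars.append(chr(int(token[2:], 16)))
    return "".join(chars)


word = parse_uplus("U+182C U+182C U+1822 U+1837")
names = [unicodedata.name(ch) for ch in word]

for ch, name in zip(word, names):
    print(f"U+{ord(ch):04X}  {name}")
```

The resulting `word` string can be pasted into a test page to compare how each font renders the initial QA before a consonant.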

We should also pass this situation on to the UTC, make a recommendation for change as fonts are unanimous in implementation at the final position, and see what they say. If they say no, then we accept it, make a notation in the StandardizedVariants.xxx document and go on. Suggested change is to delete the current specifications <U+182C-Medial,U+180C>/<U+182C-Medial,U+180D> and add the two replacement specifications <U+182C-Final,NoFVS> / <U+182C-Final,U+180B>.

U+182D - GA

Re point 1, this double dotted glyph only occurs immediately before MVS (analogously to the dotted glyph of NA; see comment above), and again teachers of Mongolian script refer to it as a final form rather than a middle form. So I think it makes more sense for it to be represented as a final form variant rather than a middle form variant.

There is a space contention issue here that I have not been able to put together yet. Does anyone have time to work on this one? I count 5 forms here including an over-ride FVSx for the default. We have space for four. I think we are going to be forced to move one medial form to the final slot – and that makes it reasonable to move the final form that is currently in a medial slot into a final slot.

Regarding the forms at M and M+FVS1, as for NA, the dotted form is used before a vowel, whereas the non-dotted form is used before a consonant. GA is almost always followed by a vowel, so the dotted form is by far the more common. So again I think it makes most sense for the dotted form to be the basic form and the non-dotted form to be the variant, as in NSM and BS.

Again, I agree with this in principle. History has already set this one in place, however. This position should remain as it is, with the original specification of the undotted GA as the default. Again, it is a hard call for NSM and BS.

U+1835 - JA

I can only think of one word which includes the 'loop' variant glyph, viz. \u182A\u1824\u1822\u202F\u1835\u180E\u1820
ᠪᠤᠢ ᠵ᠋ ᠠ ( (bui_j-a)  - I attach a picture of the written version)

Here again the J precedes MVS so this form would generally be referred to as a final form in teaching.

We can pass this situation on to the UTC also, make a recommendation for change as fonts are split in implementation, and see what they say. If they say no, then we accept it, make a notation in the StandardizedVariants.xxx document and go on. Suggested change is to delete the current <U+1835-Medial,U+180B> and add the <U+1835-Final,U+180B> specification.

U+1836 - YA

Here there are a number of words which include the 'loop' variant, for example

\u1826\u1836\u180E\u1821
\u1836\u1820\u182A\u1824\u1836\u180E\u1820
\u1835\u1820\u1836\u180E\u1820
ᠦᠶ ᠡ᠋ ᠂ ᠶᠠᠪᠤᠶ ᠠ᠂ ᠵᠠᠶ ᠠ ( (үе, uy-e ; явяа, yabvy-a; заяа, jay-a)   picture attached)

But the loop variant glyph is always immediately followed by MVS, so again the loop variant would generally be referred to as a final form in teaching.
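
The claim that the loop variant always sits immediately before MVS can be checked mechanically against the three example sequences above. The following Python sketch does that; `letters_before_mvs` is a hypothetical helper, and skipping any FVS between the letter and MVS is my own assumption about how such data might be typed.

```python
MVS = "\u180E"                        # MONGOLIAN VOWEL SEPARATOR
FVS = {"\u180B", "\u180C", "\u180D"}  # FVS1, FVS2, FVS3


def letters_before_mvs(text):
    """List the code points that stand directly before each MVS in text."""
    found = []
    for i, ch in enumerate(text):
        if ch != MVS:
            continue
        j = i - 1
        # Assumption: a variation selector may sit between the letter and MVS.
        while j >= 0 and text[j] in FVS:
            j -= 1
        if j >= 0:
            found.append(f"U+{ord(text[j]):04X}")
    return found


ya_words = [
    "\u1826\u1836\u180E\u1821",
    "\u1836\u1820\u182A\u1824\u1836\u180E\u1820",
    "\u1835\u1820\u1836\u180E\u1820",
]

for word in ya_words:
    print(letters_before_mvs(word))
```

In all three words the only letter found before MVS is U+1836, including the second word, where the word-initial YA is not followed by MVS and so is not reported.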

We should also pass this situation on to the UTC, make a recommendation for change as fonts are unanimous in implementation at the final position, and see what they say. If they say no, then we accept it, make a notation in the StandardizedVariants.xxx document and go on. Suggested change is to delete the current specification <U+1836-Medial,U+180C> and add <U+1836-Final,NoFVS>.

Thanks,
Erdenechimeg

bui_j-a.JPG
(image/jpeg attachment: bui_j-a.JPG)

yabui-a.JPG
(image/jpeg attachment: yabui-a.JPG)

Received on Thursday, 6 August 2015 16:14:09 UTC
