RE: New Thread - FVS Assignment MisMatch from jrmt@almas.co.jp on 2015-08-02 (public-i18n-mongolian@w3.org from July to September 2015)

From: <jrmt@almas.co.jp>
Date: Mon, 3 Aug 2015 05:52:01 +0900
To: "'Richard Wordingham'" <richard.wordingham@ntlworld.com>, <public-i18n-mongolian@w3.org>
Message-ID: <000601d0cd65$150a1e40$3f1e5ac0$@almas.co.jp>

Dear Mr. Richard,

> No!  The Unicode editing committee tried to choose a form that was unique
> to a particular character.
> That does not make it the appropriate default isolate.
That is what we did in the Unicode encoding chart: U1800 shows a distinct
display form for each character.
What is the problem? All I am saying here is that we should follow the
Unicode encoding chart U1800.pdf when selecting the default isolate variant
form.

> Remember, the basic character charts are not normative; they merely serve
> to tell the reader which character has a particular code.
> This can fail spectacularly when characters are distinguished by their
> sound rather than their shapes.
> (There are also a few Korean Chinese compatibility characters that are
> principally distinguished by sound.)
I raised this problem in 1999, when the Mongolian proposal was being
prepared.
But WG2 led us to the current version of the Mongolian Unicode chart.
As I remember, each Mongolian character has a different shape in the Unicode
basic character chart.
But do you know exactly how many words are indistinguishable in Mongolian?
According to our approximate statistics, almost 80% of words have more than
two spellings in the current Mongolian Unicode encoding.
We have no other choice; we have to use the current version of Unicode
Mongolian.

> A similar example is the pairs U+0061 LATIN SMALL LETTER A and
> U+0251 LATIN SMALL LETTER ALPHA and U+0067 LATIN SMALL LETTER G and
> U+0261 LATIN SMALL LETTER SCRIPT G.  A Unicode-compliant font for a
> children's book may render U+0061 and U+0067 like the reference glyphs for
> U+0251 and U+0261; it may even render each pair identically.
That does not apply to Mongolian, and it is not a valid objection to my
opinion on Mongolian.
In Mongolian, a first-year pupil's textbook only uses different fonts; it
does not contain different characters.

> I trust the following (points 2 to 5) are guiding principles for dealing
> with overlooked or definitely unclear combinations.
> Unicode might not take kindly to changing the existing assignments.
Thank you for your understanding. Other people may have other opinions on
points 2-5.
I would like to hear from all members.
It is fine with me if the principles for Mongolian variant form mapping turn
out quite different from my list.
But I hope there will be one set of principles of this kind.
Do you know that we are facing a big problem in Inner Mongolia? We have to
change some existing Mongolian grammar in the education system, from primary
to secondary school, because of some of the Unicode Mongolian variant mapping
definitions.
Do you agree that, because of the Unicode Mongolian encoding rules, users
should have to change the grammar they have learned to fit the Unicode rules?
Or should the Unicode rules fit the existing grammar knowledge of the
majority of people?
If you need detailed information on this, I can prepare it for the following
discussion.

> How much of the problem is due to unclear determination of whether the
> starting point is the isolated, initial, medial or final form?
> There may conceivably be an error in the 'joining type' of MVS and NNBSP.

> As far as the variation selectors are concerned, the Unicode standard
> rules that the preceding letter is final or isolated,
> and the following letter is initial or isolated.  Apart from any issues
> there, the definitions should be clear.
> I looked and saw no difference for the Mongolian script between
> StandardizedVariants.html in Versions 4.00 and 8.00 of Unicode.
StandardizedVariants.html is only a small part of the mapping rules.
And the NP at https://r12a.github.io/scripts/mongolian/variants does not yet
cover all the possibilities.
This is why we are having a discussion here.
MVS and NNBSP are only the starting point, but they have been the most
problematic points in Mongolian so far.

For example, I have amendments to propose for the first letter, U1820-A. I am
not sure that all members will agree with me, but I did receive these
requirements from users.
I will send the U1800-A related input in a separate mail.

Thanks and Regards,

Jirimutu
==========================================================
Almas Inc.
101-0021 601 Nitto-Bldg, 6-15-11, Soto-Kanda, Chiyoda-ku, Tokyo
E-Mail: jrmt@almas.co.jp   Mobile : 090-6174-6115
Phone : 03-5688-2081,   Fax : 03-5688-2082
http://www.almas.co.jp/   http://www.compiere-japan.com/
==========================================================




-----Original Message-----
From: Richard Wordingham [mailto:richard.wordingham@ntlworld.com] 
Sent: Monday, August 3, 2015 3:14 AM
To: public-i18n-mongolian@w3.org
Subject: Re: New Thread - FVS Assignment MisMatch

On Mon, 3 Aug 2015 00:56:29 +0900
<jrmt@almas.co.jp> wrote:


> For example, the following is my personal consideration.
> 
> 1. We select the most commonly used isolate, initial, medial, final 
> form of the character as the default variant form (no FVS1-3 needed).
> 
>    The variant form listed in the primary school first-year pupil's 
> textbook comes first (is the default form).

>    The default isolate form has to be the same as in the Unicode encoding 
> chart.

No!  The Unicode editing committee tried to choose a form that was unique to
a particular character.  That does not make it the appropriate default
isolate.  Remember, the basic character charts are not normative; they
merely serve to tell the reader which character has a particular code.  This
can fail spectacularly when characters are distinguished by their sound
rather than their shapes.  (There are also a few Korean Chinese
compatibility characters that are principally distinguished by sound.)

A similar example is the pairs U+0061 LATIN SMALL LETTER A and
U+0251 LATIN SMALL LETTER ALPHA and U+0067 LATIN SMALL LETTER G and
U+0261 LATIN SMALL LETTER SCRIPT G.  A Unicode-compliant font for a
children's book may render U+0061 and U+0067 like the reference glyphs for
U+0251 and U+0261; it may even render each pair identically.
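
The point about the two Latin pairs can be checked with a minimal sketch,
assuming Python and its standard unicodedata module. The helper name
describe and the PAIRS list are illustrative only. It shows that the members
of each pair remain distinct code points, whatever glyphs a font draws for
them.

```python
import unicodedata

# The two Latin pairs from the example above: distinct code points whose
# glyphs a font is free to draw alike.
PAIRS = [("\u0061", "\u0251"), ("\u0067", "\u0261")]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for plain, special in PAIRS:
    # The comparison is on code points, so it does not depend on any font.
    print(describe(plain), "|", describe(special),
          "| same code point:", plain == special)
```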

I trust the following (points 2 to 5) are guiding principles for dealing
with overlooked or definitely unclear combinations.  Unicode might not take
kindly to changing the existing assignments. 

> 2. To specify exactly the second most regularly used variant form, we 
> will use FVS1.
<snip>

> Because the existing Mongolian variant formatting rules have not 
> clearly and uniquely defined the form selection.

How much of the problem is due to unclear determination of whether the
starting point is the isolated, initial, medial or final form?
There may conceivably be an error in the 'joining type' of MVS and NNBSP.
As far as the variation selectors are concerned, the Unicode standard rules
that the preceding letter is final or isolated, and the following letter is
initial or isolated.  Apart from any issues there, the definitions should be
clear. I looked and saw no difference for the Mongolian script between
StandardizedVariants.html in Versions 4.00 and 8.00 of Unicode.
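
The characters under discussion can be inspected with a rough sketch,
assuming Python's standard unicodedata module. The function split_variants
and the sample string are hypothetical and only illustrate how a free
variation selector attaches to the letter before it; they say nothing about
which glyph a font should select.

```python
import unicodedata

# Mongolian free variation selectors, plus the two separators mentioned above.
FVS = {"\u180B": "FVS1", "\u180C": "FVS2", "\u180D": "FVS3"}
MVS = "\u180E"    # MONGOLIAN VOWEL SEPARATOR
NNBSP = "\u202F"  # NARROW NO-BREAK SPACE


def split_variants(text):
    """Pair each character with the FVS that immediately follows it, if any."""
    pairs = []
    for ch in text:
        if ch in FVS and pairs and pairs[-1][1] is None:
            # Attach the selector to the preceding character.
            base, _ = pairs[-1]
            pairs[-1] = (base, FVS[ch])
        else:
            pairs.append((ch, None))
    return pairs


# Hypothetical sample: MONGOLIAN LETTER A + FVS1, then MONGOLIAN LETTER NA.
sample = "\u1820\u180B\u1828"
for base, selector in split_variants(sample):
    print(f"U+{ord(base):04X}", unicodedata.name(base), selector or "(default)")
```

Keeping the selector next to its base letter makes it easy to see which
letters in a word carry an explicit variant request and which rely on the
default form.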

Richard.
Received on Sunday, 2 August 2015 20:52:28 UTC