RE: FVS Assignment MisMatch Winding Down

Yes, agreed we are not redesigning the encoding.



As Richard Ishida put it in the beginning “we are working on a page that records observable differences between, on the one hand, specs that indicate what forms the Unicode FVS characters should produce in Mongolian, and on the other hand, what major fonts actually produce with the FVS characters.” The objective will be a document, such as DS01, that shows all Unicode-approved FVS sequences. Our objective is not to start all over.



On the issue of spelling ABU “father” as mentioned below, there is no question in the Cyrillic as to the proper spelling. I do not have enough experience in Inner Mongolia to know why there is such a variety of opinions. I will leave spelling differences out of the discussion unless they are vitally important to the argument.



Greg




From: jrmt@almas.co.jp [mailto:jrmt@almas.co.jp]
Sent: Friday, August 7, 2015 5:32 AM
To: Greg Eck <greck@postone.net>; public-i18n-mongolian@w3.org
Subject: RE: FVS Assignment MisMatch Winding Down

Hi Greg,

> I am responding below in yellow.
> Thank you for not giving me 100 words as I asked for ☺.
> Sorry, I am not so sympathetic about the spelling problems.
> Every school lad in every country has to learn to spell correctly.
> While it is true that it takes some work to check a Mongolian document created in digital form, the first step is to fix what is going through the pupil’s mind.
> I had to learn the rules of when to place the I before the E and vice versa.
> Every German boy has to learn his declensions.
> The laws of vowel harmony are very strict and very consistent – there is breakage, but rarely.
> Mongol boys and girls must learn the rules of vowel harmony – there is no way of getting around it.
> Kids are smart and they will learn these things. How Chinese kids can learn a 10,000 character set by age 15 is harder for me to fathom!
> I don’t find it hard to see which words are correct and which are incorrect – but I do find it hard to tell which letter is actually lurking behind the black ink.
> Smart keyboards and other utilities will help that over time.

Actually, I am not willing to continue discussing these points.
I have already covered them in another mail. I made every effort to object strongly to this proposal at the preparation stage
because of these problems. But today I have to accept this encoding and work with it.
We cannot redesign the encoding; that would cost Mongolian another decade or so.
My effort is not directed at redesigning the encoding, but at optimizing it for all kinds of use.
We can work around it and find a more intelligent way to solve this problem.
But we need more time to solve these problems with a workaround.
Our solution will be to fold all of these possibilities, as Richard Wordingham suggested.
This means that in practice we will handle O and U, and OE and UE, as the same code.
That is, we will omit one letter of each pair in real-world handling and treat the two as the same,
even though both exist in the code table and in context.
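
A minimal sketch of that folding, assuming it is applied only as a comparison key (for search or matching) and not to the stored text. The choice of U and UE as the representative of each pair is arbitrary, and the names FOLD, fold and same_when_folded are made up for this illustration.

```python
# Assumed folding table: each letter of a confusable pair maps to one
# representative. Which letter is kept is an arbitrary choice for this sketch.
FOLD = str.maketrans({
    "\u1823": "\u1824",  # MONGOLIAN LETTER O  -> MONGOLIAN LETTER U
    "\u1825": "\u1826",  # MONGOLIAN LETTER OE -> MONGOLIAN LETTER UE
})


def fold(word: str) -> str:
    """Return a comparison key that does not distinguish O/U or OE/UE."""
    return word.translate(FOLD)


def same_when_folded(a: str, b: str) -> bool:
    """True if two encodings differ only within the folded pairs."""
    return fold(a) == fold(b)


abu_with_u = "\u1820\u182A\u1824"  # A + BA + U
abu_with_o = "\u1820\u182A\u1823"  # A + BA + O

print(abu_with_u == abu_with_o)                  # False
print(same_when_folded(abu_with_u, abu_with_o))  # True
```

Note that this table folds only within each pair; an encoding that ends in OE or UE instead of U would still compare as different unless the table were widened.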

(U1820+U182A+1823) - wrong

But I suggest you search for it on the internet. There are many hits, which means the writers did not think it was the wrong word.
ᠠᠪᠣ ᠲᠡᠢ ᠪᠠᠨ ᠠᠳ᠋ᠡᠯᠢ᠍ᠬᠠᠨ ᠬᠦᠷᠬᠡᠨ ᠳ᠋ᠥ ...<http://www.burgud.com/blog/MyView.aspx?burgudID=10011&ArticleID=DCFC40FE3CBC627D8FF18B6ED4965EA1>
www.burgud.com/blog/MyView.aspx?burgudID<http://www.burgud.com/blog/MyView.aspx?burgudID>...
ᠠᠪᠣ ᠳᠠᠭᠠᠨ ᠪᠢ ᠬᠠᠢ᠌ᠷᠡᠳ᠋ᠡᠢ ᠃ ᠮᠢ᠍ᠨᠤ ᠠᠪᠣ ᠣᠯᠡᠨ ᠰᠠᠢ᠌ᠬᠠᠨ ᠦᠯᠢ᠍ᠬᠡᠷ ᠮᠡᠳ᠋ᠡᠳ᠋ᠡᠭ᠌ ᠃ ᠪᠠᠭ᠎ᠠ ᠳᠠᠭᠠᠨ ᠠᠪᠣ ᠶᠢᠨ ᠶᠢᠡᠨ ᠡᠪᠣᠷ ᠲᠣ ...
ᠠᠪᠣ ᠬᠦᠮᠦᠨ ᠢ ᠦᠷᠦ ᠡᠪᠡᠳᠴᠦ ᠶᠠᠪᠣᠭᠠᠷᠠᠢ<http://www.burgud.com/blog/MyView.aspx?burgudID=10011&ArticleID=FC894945CD1DA27985F37764762E9D18>
www.burgud.com/blog/MyView.aspx?burgudID<http://www.burgud.com/blog/MyView.aspx?burgudID>...
2015/01/28 - ᠪᠦᠷᠭᠡ ᠶᠢᠨ ᠡᠵᠡᠨ᠄ ᠣᠷᠠᠨᠰᠠᠨᠠᠭ᠎ᠠ ᠳᠡᠰ᠄ 27 ᠬᠤᠪᠢ᠄ 17875 ᠪᠦᠷᠭᠡ ᠶᠢᠨ ᠲᠡᠮᠳᠡᠭᠯᠡᠯ᠄ 395 ᠨᠤᠲᠤᠭ᠄ ᠦᠪᠦᠷ ᠮᠤᠩᠭ᠋ᠤᠯ ...
ᠨᠣᠳ᠋ᠣᠭ ᠠᠮᠢ᠍ᠳ᠋ᠠᠢ ᠠᠪᠣ<http://burgud.com/blog/MyView.aspx?burgudID=10194&ArticleID=888B8F57E410C402CC42D81F6610A25D>
burgud.com/blog/MyView.aspx?burgudID=10194...
ᠨᠣᠳ᠋ᠣᠭ ᠠᠮᠢ᠍ᠳ᠋ᠠᠢ ᠠᠪᠣ. ᠲᠣᠷᠵᠢ᠍ᠫᠠᠯᠮ᠎ᠠ ᠶᠢᠨ ᠰᠣᠮᠢ᠍ᠶ᠎ᠠ. ᠲᠠᠯ᠎ᠠ ᠶᠢᠨ ᠭᠣᠷᠪᠠᠨ ᠣᠯᠠᠭᠠᠨ ᠮᠢ᠍ᠨᠢ. ᠲᠠᠷᠠᠵᠦ ᠨᠢ᠍ᠭᠡ ᠂ ᠬᠣᠷᠠᠵᠦ ᠨᠢ᠍ᠭᠡ ...
2-1 Аав ээжийн минь сургаал: ᠠᠪᠣ ᠡᠵᠢ ᠵᠢᠨ ᠮᠢᠨᠢ ...<http://www.cjvlang.com/mongol/2-01.html>
www.cjvlang.com/mongol/2-01.html<http://www.cjvlang.com/mongol/2-01.html>
ᠠᠪᠣ ᠡᠵᠢ ᠵᠢᠨ ᠮᠢᠨᠢ ᠰᠣᠷᠭᠠᠯ. ᠣᠨᠠᠭᠰᠠᠨ ᠰᠢᠷᠣᠢ ᠪᠣᠯ ᠠᠯᠲᠠ ᠶᠣᠮ ᠭᠡᠵᠦ ᠠᠪᠣ ᠮᠢᠨᠢ ᠰᠣᠷᠭᠠᠳᠠᠭ ᠰᠠᠨ᠃ ᠣᠣᠭᠣᠭᠰᠠᠨ ᠣᠰᠣ ᠪᠣᠯ ᠷᠠᠰᠢᠶᠠᠨ ᠶᠣᠮ ...
ᠠᠪᠣ ᠠᠭᠣᠯᠠ ᠮᠢᠨᠢ<http://mo.ttcy.com/song_detailed.aspx?ID=22795>
mo.ttcy.com/song_detailed.aspx?ID=22795
ᠲᠠᠭᠣᠣ ᠶᠢᠨ ᠲᠠᠨᠢᠯᠴᠣᠭᠣᠯᠭ᠎ᠠ. ᠨᠡᠷ᠎ᠡ ᠄ᠠᠪᠣ ᠠᠭᠣᠯᠠ ᠮᠢᠨᠢ; ᠣᠷᠠᠯᠢᠭᠴᠢᠨ ᠄ᠪᠤᠷᠮ᠎ᠠ; ᠢᠷᠠᠭᠤ ᠨᠠᠶᠢᠷᠠᠭᠴᠢ ᠄ᠦᠭᠡᠢ; ᠬᠦᠭᠵᠢᠮ ‍ᠣᠨ ...
ᠠᠴᠢᠯᠠᠯ ᠤᠨ ᠲᠡᠭᠷᠢ ᠠᠪᠣ ᠮᠢᠨᠢ<http://mo.ttcy.com/song_detailed.aspx?ID=21555>
mo.ttcy.com/song_detailed.aspx?ID=21555
ᠠᠴᠢᠯᠠᠯ ᠤᠨ ᠲᠡᠭᠷᠢ ᠠᠪᠣ ᠮᠢᠨᠢ. ᠲᠠᠭᠣᠣ ᠶᠢᠨ ᠲᠠᠨᠢᠯᠴᠣᠭᠣᠯᠭ᠎ᠠ. ᠨᠡᠷ ᠡ ᠄ᠠᠴᠢᠯᠠᠯ ᠤᠨ ᠲᠡᠭᠷᠢ ᠠᠪᠣ ᠮᠢᠨᠢ; ᠣᠷᠠᠯᠢᠭᠴᠢᠨ ᠄ ᠤᠶᠤᠨᠰᠠᠩ; ᠢᠷᠠᠭᠤ ...
ᠡᠵᠢ ᠠᠪᠣ ᠪᠠᠷ ᠢᠶᠡᠨ ᠥᠨᠥᠰᠥᠬᠥᠯᠥᠨ ᠡ ᠳ᠋ᠧ<http://mo.ttcy.com/song_detailed.aspx?ID=21357>
mo.ttcy.com/song_detailed.aspx?ID=21357
ᠲᠠᠭᠣᠣ ᠶᠢᠨ ᠲᠠᠨᠢᠯᠴᠣᠭᠣᠯᠭ᠎ᠠ. ᠨᠡᠷ᠎ᠡ ᠄ᠡᠵᠢ ᠠᠪᠣ ᠪᠠᠷ ᠢᠶᠡᠨ ᠥᠨᠥᠰᠥᠬᠥᠯᠥᠨ ᠡ ᠳ᠋ᠧ; ᠣᠷᠠᠯᠢᠭᠴᠢᠨ ᠄ ᠠᠯᠢᠮ ᠠ; ᠢᠷᠠᠭᠤ ᠨᠠᠶᠢᠷᠠᠭᠴᠢ ᠄ᠦᠭᠡᠢ ...
comments on khorchin mongolian folk song jinliang - YouTube<https://www.youtube.com/all_comments?v=DfXKgmhv9CM>
https://www.youtube.com/all_comments?v...

ᠠᠯᠲᠠᠨ ᠪᠠᠭᠣᠣ ᠭᠠᠷ ᠳ᠋ᠦ ᠴᠢᠨᠢ ᠠᠪᠣ ᠵᠢᠨ ᠴᠢᠨᠢ ᠰᠣᠷᠭᠠᠯ ᠴᠡᠭᠡᠵᠢᠨ ᠳ᠋ᠦ ᠴᠢᠨᠢ ᠠᠰᠬᠠᠷᠠᠭᠠᠭᠣᠯᠣᠭᠠᠳ ᠣᠬᠢᠯᠠᠭᠠᠳ ᠶᠠᠭᠣ ᠪᠡᠨ ᠬᠢᠨ᠎ᠡ ᠳ᠋ᠡ ᠠ ...
ᠠᠪᠣ ᠢᠢ ᠦᠰᠭᠡᠭᠰᠡᠨ ᠡᠮᠡᠭᠡ ᠪᠠᠨ ᠪᠢ. ᠠᠪᠣᠷᠠᠯ ᠤᠨ ᠲᠡᠭᠷᠢ ᠭᠡᠵᠦ ᠢᠲᠡᠭᠡᠳᠡᠭ ᠳ᠋ᠠ. ᠡᠵᠢ ᠢᠢ ᠲᠤᠷᠨᠢᠭᠤᠯᠤᠭᠰᠠᠨ ᠡᠮᠡᠭᠡ ᠪᠠᠨ ᠪᠢ. ᠡᠨᠡᠷᠢᠯ ᠤᠨ ...
整合的学校,堪忧的母语<http://www.im-pg.com/jAlmas/public/content/eph/eph-mn/zhxxkymy.html>
www.im-pg.com/jAlmas/public/.../zhxxkymy.html<http://www.im-pg.com/jAlmas/public/.../zhxxkymy.html>
... ᠣ᠋ᠨ ᠬᠥᠴᠢᠷ ᠢ ᠮᠠᠳᠠᠬᠥ ᠣᠢᠬᠡᠢ᠂ ᠠᠪᠣ ᠡᠵᠢ ᠡᠴᠨ ᠪᠠᠨ ᠰᠠᠯᠤᠬᠰᠠᠨ ᠣᠴᠢᠷ ... ᠣᠢᠵᠠᠯ ᠰᠠᠨᠠᠭ᠎ᠠ᠂ ᠶᠤᠰᠤ ᠮᠤᠷᠠᠯ ᠣ᠋ᠨ ᠬᠢᠴᠢᠶᠠᠯ ᠢ ᠠᠪᠣ ᠡᠵᠢ᠂ ...

> But let’s come back to our initial objective.
> We have a difficult script.
> We have an encoding that has problems, but is workable. I would say admirable given the state of the digital world that it came from.
> Many great people have worked on it … including you!
> We have some differences in how we as font developers have implemented things.
> Some decisions have been made to optimize for efficiency. Some for linguistic soundness.
> Our task is not to re-design the whole – but to fix what is broken. And get back to using it.
> Our task at hand, our first task, is to come up, as a unified front, with a set of
> Mongolian code points mapped to positions (isolate, initial, medial, final), which are in turn mapped to variation selectors (none, first, second and third, and hopefully not fourth).
I agree with you.


>  Can we deal with the medial GA position?
> Do we have overload there?
> Can we handle all of the variants with only 4 slots? With the final dotted form still at the medial slot instead of the final?
> I count five variants with the final dotted form at the medial. We have room for four by my count.

Can we discuss the Mongolian GA and QA in a separate thread?
Our team's intention is not to increase the variant count, but to decrease it.
That will help reduce the chance of the problematic ambiguous encodings that we discussed at the beginning of this mail.

I will come back to GA and QA in another thread.

Thanks and Best Regards,


Jirimutu
==========================================================
Almas Inc.
101-0021 601 Nitto-Bldg, 6-15-11, Soto-Kanda, Chiyoda-ku, Tokyo
E-Mail: jrmt@almas.co.jp   Mobile : 090-6174-6115
Phone : 03-5688-2081,   Fax : 03-5688-2082
http://www.almas.co.jp/   http://www.compiere-japan.com/

==========================================================



From: Greg Eck [mailto:greck@postone.net]
Sent: Friday, August 7, 2015 3:26 AM
To: jrmt@almas.co.jp; public-i18n-mongolian@w3.org
Subject: FVS Assignment MisMatch Winding Down

Jirimutu,

I am responding below in yellow.
Thank you for not giving me 100 words as I asked for ☺.
Sorry, I am not so sympathetic about the spelling problems.
Every school lad in every country has to learn to spell correctly.
While it is true that it takes some work to check a Mongolian document created in digital form, the first step is to fix what is going through the pupil’s mind.
I had to learn the rules of when to place the I before the E and vice versa.
Every German boy has to learn his declensions.
The laws of vowel harmony are very strict and very consistent – there is breakage, but rarely.
Mongol boys and girls must learn the rules of vowel harmony – there is no way of getting around it.
Kids are smart and they will learn these things. How Chinese kids can learn a 10,000 character set by age 15 is harder for me to fathom!
I don’t find it hard to see which words are correct and which are incorrect – but I do find it hard to tell which letter is actually lurking behind the black ink.
Smart keyboards and other utilities will help that over time.



But let’s come back to our initial objective.
We have a difficult script.
We have an encoding that has problems, but is workable. I would say admirable given the state of the digital world that it came from.
Many great people have worked on it … including you!
We have some differences in how we as font developers have implemented things.
Some decisions have been made to optimize for efficiency. Some for linguistic soundness.
Our task is not to re-design the whole – but to fix what is broken. And get back to using it.
Our task at hand, our first task, is to come up, as a unified front, with a set of Mongolian code points mapped to positions (isolate, initial, medial, final), which are in turn mapped to variation selectors (none, first, second and third, and hopefully not fourth).

Can we deal with the medial GA position?
Do we have overload there?
Can we handle all of the variants with only 4 slots? With the final dotted form still at the medial slot instead of the final?
I count five variants with the final dotted form at the medial. We have room for four by my count.
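
The slot arithmetic behind that count can be sketched as follows. This is only an illustration: it assumes each (letter, position) pair has four slots (no selector, FVS1, FVS2, FVS3), and the five variant labels are placeholders, not names of the actual medial GA forms.

```python
# Slots available to one letter in one position:
# no selector, FVS1 (U+180B), FVS2 (U+180C), FVS3 (U+180D).
FVS_SLOTS = (None, "\u180B", "\u180C", "\u180D")


def slots_short(variant_labels):
    """Return how many variants cannot be given a slot of their own."""
    return max(0, len(variant_labels) - len(FVS_SLOTS))


def assign(variant_labels):
    """Pair variants with slots in order; refuse if there are too many."""
    short = slots_short(variant_labels)
    if short:
        raise ValueError(f"{short} variant(s) have no slot")
    return dict(zip(variant_labels, FVS_SLOTS))


# Hypothetical placeholder labels standing in for the five medial GA forms.
medial_ga = ["variant 1", "variant 2", "variant 3", "variant 4", "variant 5"]

print(slots_short(medial_ga))  # 1
```

With five variants and four slots, one form has to move to another position or be dropped, which is the overload in question.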

Greg


-----Original Message-----
From: jrmt@almas.co.jp [mailto:jrmt@almas.co.jp]
Sent: Thursday, August 6, 2015 6:49 PM
To: Greg Eck <greck@postone.net>; 'Richard Wordingham' <richard.wordingham@ntlworld.com>; public-i18n-mongolian@w3.org
Subject: RE: New Thread - FVS Assignment MisMatch

Hi Greg,

> This is a very interesting discussion.
> I am behind in my reading of the discussion, so forgive me if you have already dealt with this.
> I am concerned with such a high figure of 80% having multiple spelling possibilities.
> My dictionary+grammar does not show this.
> I wonder if you could pull 100 words from your dictionary and mark the ones with the possibility of multiple spellings.
> It would be good if you could include the text also, not just images, so that we can see the actual code-points behind the displayed forms.
> That would help clarify the exact issue.
> Are you talking about stems only OR inflected forms OR the both of them?

I am talking about the possible mappings from a printed word to code points.
That is, a word displayed on the screen or printed on paper may be stored as several different code point sequences in Unicode (I am calling this spelling).
Maybe you already know what I am talking about. Let me list 10 everyday words here with their possibilities.
If you really need 100 words, I will prepare them for you from a particular page of the dictionary.

ᠠᠪᠤ ( father) - (U1820+U182A+U1824), correct
(U1820+U182A+U1823) - wrong
and possibly also (U1820+U182A+U1825), - wrong
(U1820+U182A+U1826) - wrong

ᠡᠵᠢ ( mother ) - (U1821+U1835+1822), - correct
(U1821+U1835+1836) - wrong

ᠠᠬ ᠠ ( brother) - (U1820+182C+180E+1820), - correct
(U1820+182C+180D+180E+1820),
Even (U1820+182C+180E+1821),
(U1820+182C+180D+180E+1821)

ᠳᠡᠭᠦᠦ (sister) - (U1833+U1821+U182D+U1826+U1826), - correct
(U1833+U1821+U182D+U1826+U1825), - wrong
(U1833+U1821+U182D+U1825+U1825), - wrong
(U1833+U1821+U182D+U1825+U1826), - wrong
(U1832+U1821+U182D+U1826+U1826), - wrong
(U1832+U1821+U182D+U1826+U1825), - wrong
(U1832+U1821+U182D+U1825+U1825), - wrong
(U1832+U1821+U182D+U1825+U1826). - wrong
I have not included the final U1823 and U1824 possibilities or the U1820 possibility for this word.
Etc. etc.

ᠬᠦᠦ (son) - (U182C+U1826+U1826),
(U182C+U1825+U1826),
(U182C+U1826+U1825),
(U182C+U1825+U1825),
(U182D+U1826+U1826),
(U182D+U1825+U1826),
(U182D+U1826+U1825),
(U182D+U1825+U1825).

ᠦᠬᠢᠨ (daughter) - (U1826+U182C+U1822+U1828),
(U1825+U182C+U1822+U1828),
(U1826+U182D+U1822+U1828),
(U1825+U182D+U1822+U1828).
I have not included the U1820 and U1821 possibilities for the final N.

ᠮᠢᠨᠤ ( my ) - (U182E+U1822+U1828+U1824),
(U182E+U1822+U1828+U1823),
(U182E+U1822+U1828+U1825),
(U182E+U1822+U1828+U1826)

ᠲᠠᠨᠠᠢ ( his ) - (U1832+U1820+U1828+U1820+U1822),
(U1832+U1821+U1828+U1821+U1822),
(U1833+U1820+U1828+U1820+U1822),
(U1833+U1821+U1828+U1821+U1822).
I have not included the wrongly spelled possibilities. The first two are both correct spellings, and have different meanings.

ᠭᠡᠷ ( home ) - (U182D+U1821+U1837), (U182C+U1821+U1837)
ᠨᠤᠳᠤᠭ ( hometown ) - (U1828+U1824+U1833+U1824+U182D),
(U1828+U1824+U1833+U1823+U182D),
(U1828+U1823+U1833+U1824+U182D),
(U1828+U1823+U1833+U1823+U182D),
(U1828+U1824+U1832+U1824+U182D),
(U1828+U1824+U1832+U1823+U182D),
(U1828+U1823+U1832+U1824+U182D),
(U1828+U1823+U1832+U1823+U182D)
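
The lists above can be generated mechanically. A small sketch, assuming each letter position of a printed word is described by the set of code points that look alike there; the groupings below are taken from the ᠨᠤᠳᠤᠭ list itself, and the function name is made up for this illustration.

```python
from itertools import product


def candidate_encodings(slots):
    """Return every code point sequence that picks one option per slot."""
    return ["".join(choice) for choice in product(*slots)]


# Groups of letters treated as indistinguishable in print for this word.
U_OR_O = ("\u1824", "\u1823")    # U, O
DA_OR_TA = ("\u1833", "\u1832")  # DA, TA

# NA + (U|O) + (DA|TA) + (U|O) + GA
nutug_slots = [("\u1828",), U_OR_O, DA_OR_TA, U_OR_O, ("\u182D",)]
candidates = candidate_encodings(nutug_slots)

print(len(candidates))  # 8
for word in candidates:
    print("+".join(f"U{ord(ch):04X}" for ch in word))
```

The count is the product of the group sizes (1 x 2 x 2 x 2 x 1), which is why even a short word can reach eight or more stored forms.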

I think the members who cannot read Mongolian have all been puzzled by this.
Even Mongolian people will be unsure of the correct spelling or encoding of the following words in the list.
ᠳᠡᠭᠦᠦ
ᠦᠬᠢᠨ
ᠮᠢᠨᠤ
ᠲᠠᠨᠠᠢ
ᠨᠤᠳᠤᠭ

We need a dictionary to confirm whether a letter is OE or UE, U or UE, etc., when we need to encode or spell a word correctly.

Regards,

Jirimutu
==========================================================
Almas Inc.
101-0021 601 Nitto-Bldg, 6-15-11, Soto-Kanda, Chiyoda-ku, Tokyo
E-Mail: jrmt@almas.co.jp   Mobile : 090-6174-6115
Phone : 03-5688-2081,   Fax : 03-5688-2082
http://www.almas.co.jp/   http://www.compiere-japan.com/

==========================================================

Received on Friday, 7 August 2015 13:33:03 UTC