
FVS Assignment MisMatch Winding Down

From: Greg Eck <greck@postone.net>
Date: Thu, 6 Aug 2015 18:26:21 +0000
To: "jrmt@almas.co.jp" <jrmt@almas.co.jp>, "public-i18n-mongolian@w3.org" <public-i18n-mongolian@w3.org>
Message-ID: <BN3PR10MB03211E101454114A9A66B62BAF740@BN3PR10MB0321.namprd10.prod.outlook.com>
Jirimutu,

I am responding below in yellow.
Thank you for not giving me 100 words as I asked for ☺.
Sorry, I am not so sympathetic about the spelling problems.
Every school lad in every country has to learn to spell correctly.
While it is true that it takes some work to check a Mongolian document created in digital form, the first step is to fix what is going through the pupil’s mind.
I had to learn the rules of when to place the I before the E and vice versa.
Every German boy has to learn his declensions.
The laws of vowel harmony are very strict and very consistent – there is breakage, but rarely.
Mongol boys and girls must learn the rules of vowel harmony – there is no way of getting around it.
Kids are smart and they will learn these things. How Chinese kids can learn a 10,000 character set by age 15 is harder for me to fathom!
I don’t find it hard to see which words are correct and which are incorrect – but I do find it hard to tell which letter is actually lurking behind the black ink.
Smart keyboards and other utilities will help that over time.



But let’s come back to our initial objective.
We have a difficult script.
We have an encoding that has problems but is workable. I would say it is admirable, given the state of the digital world that it came from.
Many great people have worked on it … including you!
We have some differences in how we as font developers have implemented things.
Some decisions have been made to optimize for efficiency. Some for linguistic soundness.
Our task is not to re-design the whole – but to fix what is broken. And get back to using it.
Our first task at hand is to come up, as a unified front, with a set of Mongolian code-points mapped to the positions isolate, initial, medial and final, each of which is in turn mapped to the variation selectors none, first, second and third, and hopefully not a fourth.

Can we deal with the medial GA position?
Do we have overload there?
Can we handle all of the variants with only 4 slots? With the final dotted form still at the medial slot instead of the final?
I count five variants with the final dotted form at the medial. We have room for four by my count.
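
To make the slot count concrete, here is a minimal sketch in Python. It assumes each code-point and position pair has exactly four selector slots (no selector, FVS1 U+180B, FVS2 U+180C, FVS3 U+180D). The variant names are hypothetical placeholders, not actual glyph names from any font, and `assign_slots` is an illustrative helper only.

```python
# Selector slots for one (code point, position) pair: no selector, then
# FVS1 (U+180B), FVS2 (U+180C), FVS3 (U+180D).
SELECTOR_SLOTS = (None, "\u180B", "\u180C", "\u180D")


def assign_slots(variants):
    """Map glyph variants to selector slots in order; fail if there are too many."""
    if len(variants) > len(SELECTOR_SLOTS):
        raise ValueError(
            f"{len(variants)} variants but only {len(SELECTOR_SLOTS)} slots"
        )
    return dict(zip(SELECTOR_SLOTS, variants))


# Hypothetical placeholder names for the five medial GA (U+182D) variants.
medial_ga = [f"ga_medial_variant_{i}" for i in range(1, 6)]

try:
    assign_slots(medial_ga)
except ValueError as err:
    print("overloaded:", err)
```

With five variants the call fails, which is the overload in question. Moving one variant to another position would bring the medial list down to four.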

Greg


-----Original Message-----
From: jrmt@almas.co.jp [mailto:jrmt@almas.co.jp]
Sent: Thursday, August 6, 2015 6:49 PM
To: Greg Eck <greck@postone.net>; 'Richard Wordingham' <richard.wordingham@ntlworld.com>; public-i18n-mongolian@w3.org
Subject: RE: New Thread - FVS Assignment MisMatch

Hi Greg,

> This is a very interesting discussion.
> I am behind in my reading of the discussion, so forgive me if you have already dealt with this.
> I am concerned with such a high figure of 80% having multiple spelling possibilities.
> My dictionary+grammar does not show this.
> I wonder if you could pull 100 words from your dictionary and mark the ones with the possibility of multiple spellings.
> It would be good if you could include the text also, not just images, so that we can see the actual code-points behind the displayed forms.
> That would help clarify the exact issue.
> Are you talking about stems only OR inflected forms OR the both of them?

I am talking about the possible mappings from the printed word to code points.
That is, a word displayed on the screen or printed on paper may be stored as different code point sequences in Unicode (I am calling this spelling).
Maybe you already know what I am talking about. Let me list 10 everyday words here with their possible encodings.
If you really need 100 words, I will prepare them for you from a particular page of the dictionary.

ᠠᠪᠤ (father) - (U1820+U182A+U1824) - correct
(U1820+U182A+U1823) - wrong
Maybe there are also (U1820+U182A+U1825) - wrong
(U1820+U182A+U1826) - wrong

ᠡᠵᠢ (mother) - (U1821+U1835+U1822) - correct
(U1821+U1835+U1836) - wrong

ᠠᠬ᠎ᠠ (brother) - (U1820+U182C+U180E+U1820) - correct
(U1820+U182C+U180D+U180E+U1820),
even (U1820+U182C+U180E+U1821),
(U1820+U182C+U180D+U180E+U1821)

ᠳᠡᠭᠦᠦ (sister) - (U1833+U1821+U182D+U1826+U1826), - correct
(U1833+U1821+U182D+U1826+U1825), - wrong
(U1833+U1821+U182D+U1825+U1825), - wrong
(U1833+U1821+U182D+U1825+U1826), - wrong
(U1832+U1821+U182D+U1826+U1826), - wrong
(U1832+U1821+U182D+U1826+U1825), - wrong
(U1832+U1821+U182D+U1825+U1825), - wrong
(U1832+U1821+U182D+U1825+U1826). - wrong
I have not included the final U1823 and U1824 possibilities or the U1820 possibility for this word.
Etc. etc.

ᠬᠦᠦ (son) - (U182C+U1826+U1826),
(U182C+U1825+U1826),
(U182C+U1826+U1825),
(U182C+U1825+U1825),
(U182D+U1826+U1826),
(U182D+U1825+U1826),
(U182D+U1826+U1825),
(U182D+U1825+U1825).

ᠦᠬᠢᠨ (daughter) - (U1826+U182C+U1822+U1828),
(U1825+U182C+U1822+U1828),
(U1826+U182D+U1822+U1828),
(U1825+U182D+U1822+U1828).
I have not included the possibility of U1820 and U1821 for the final N.

ᠮᠢᠨᠤ ( my ) - (U182E+U1822+U1828+U1824),
(U182E+U1822+U1828+U1823),
(U182E+U1822+U1828+U1825),
(U182E+U1822+U1828+U1826)

ᠲᠠᠨᠠᠢ ( his ) - (U1832+U1820+U1828+U1820+U1822),
(U1832+U1821+U1828+U1821+U1822),
(U1833+U1820+U1828+U1820+U1822),
(U1833+U1821+U1828+U1821+U1822).
I have not included the wrongly spelled possibilities. The first two are both correct spellings and have different meanings.

ᠭᠡᠷ ( home ) - (U182D+U1821+U1837), (U182C+U1821+U1837)
ᠨᠤᠳᠤᠭ ( hometown ) - (U1828+U1824+U1833+U1824+U182D),
(U1828+U1824+U1833+U1823+U182D),
(U1828+U1823+U1833+U1824+U182D),
(U1828+U1823+U1833+U1823+U182D),
(U1828+U1824+U1832+U1824+U182D),
(U1828+U1824+U1832+U1823+U182D),
(U1828+U1823+U1832+U1824+U182D),
(U1828+U1823+U1832+U1823+U182D)
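
The enumeration above can be sketched in Python as a Cartesian product over look-alike letters. This is only an illustration: the confusable groups below (U1823/U1824 and U1832/U1833) are assumptions read off this one example, and the function names are hypothetical. Which letters actually look alike depends on position and on the font.

```python
from itertools import product


def candidate_encodings(word, confusable_groups):
    """Return every code-point sequence that swaps letters within a confusable group."""
    options = []
    for ch in word:
        # Find the group this letter belongs to, if any (assumed groups, see above).
        group = next((g for g in confusable_groups if ch in g), None)
        options.append(sorted(group) if group else [ch])
    return ["".join(combo) for combo in product(*options)]


def fmt(seq):
    """Render a string as U+XXXX code points joined by '+'."""
    return "+".join(f"U+{ord(c):04X}" for c in seq)


# Assumed look-alike groups for this example only.
GROUPS = [
    {"\u1823", "\u1824"},  # O / U
    {"\u1832", "\u1833"},  # TA / DA
]

nutug = "\u1828\u1824\u1833\u1824\u182D"  # the "hometown" word as listed first above
variants = candidate_encodings(nutug, GROUPS)
for v in variants:
    print(fmt(v))
```

For this word the product has 1 x 2 x 2 x 2 x 1 = 8 members, matching the eight sequences listed above.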

I think the members who cannot read Mongolian have all been puzzled by this.
Even Mongolian people will be confused about the correct spelling or encoding of the following words in the list:
ᠳᠡᠭᠦᠦ
ᠦᠬᠢᠨ
ᠮᠢᠨᠤ
ᠲᠠᠨᠠᠢ
ᠨᠤᠳᠤᠭ

We need a dictionary to confirm whether a letter is OE or UE, U or UE, etc., when we need to encode or spell a word correctly.

Regards,

Jirimutu
==========================================================
Almas Inc.
101-0021 601 Nitto-Bldg, 6-15-11, Soto-Kanda, Chiyoda-ku, Tokyo
E-Mail: jrmt@almas.co.jp   Mobile : 090-6174-6115
Phone : 03-5688-2081,   Fax : 03-5688-2082
http://www.almas.co.jp/   http://www.compiere-japan.com/

==========================================================




Received on Thursday, 6 August 2015 18:26:53 UTC
