Re: Issues with DA,NA,GA default medial variants

Hi Andrew,
I suggested the swap not only because of frequency but because the
before-vowel forms are correct and logical. If you disagree, I will argue
the point in more depth. The current implementation simply requires too
many FVSs in written text. If you do not accept that typing so many FVSs
is annoying (now and in the future), then what do you think about the
waste of storage? You could argue that storage is not expensive, and
indeed it is not. But suppose you are writing spell-checker software for
Mongolian. As you know, Mongolian is a highly agglutinative language.
From a database of 50 000 stem entries, more than 45 million word forms
are generated, not counting common proper names. How long would you wait
to check a hundred-page document? What about memory usage? We have
experienced this ourselves and learned that every bit is very important!
We rewrote our spell checker three times, from Java to C via C++, to make
it usable for end users at a commercial level of quality.
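
To make the storage point concrete, here is a minimal Python sketch (not
our spell checker's code) that counts the free variation selectors in a
string and the bytes they add. The function names, the sample sequence and
the ratio of one FVS per generated word are assumptions for illustration
only; real ratios would have to be measured on actual text.

```python
# U+180B..U+180D are the three Mongolian free variation selectors
# (FVS1..FVS3) in the Unicode Mongolian block.
FVS_CODE_POINTS = {0x180B, 0x180C, 0x180D}


def count_fvs(text: str) -> int:
    """Return how many free variation selectors occur in `text`."""
    return sum(1 for ch in text if ord(ch) in FVS_CODE_POINTS)


def fvs_overhead_bytes(text: str, encoding: str = "utf-8") -> int:
    """Return the number of bytes the FVS characters alone add to `text`."""
    stripped = "".join(ch for ch in text if ord(ch) not in FVS_CODE_POINTS)
    return len(text.encode(encoding)) - len(stripped.encode(encoding))


def lexicon_overhead_bytes(word_count: int, fvs_per_word: float,
                           bytes_per_fvs: int = 3) -> int:
    """Rough estimate for a fully expanded word list.

    `fvs_per_word` is an assumed average, not a measured value.
    """
    return round(word_count * fvs_per_word * bytes_per_fvs)


# Illustrative sequence only (A + NA + FVS1 + A); not a claim about spelling.
sample = "\u1820\u1828\u180B\u1820"

print(count_fvs(sample))                        # 1
print(fvs_overhead_bytes(sample))               # 3 (UTF-8)
print(fvs_overhead_bytes(sample, "utf-16-le"))  # 2 (UTF-16)
print(lexicon_overhead_bytes(45_000_000, 1.0))  # 135000000
```

Under that assumed ratio, a fully expanded list of 45 million words would
carry about 135 MB of selectors in UTF-8. A real spell checker would
probably store its lexicon more compactly, so treat this as a rough
illustration of the scale, not a measurement.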

Badral

On 26.10.2015 07:58, Andrew West wrote:
> Hi Michel,
>
> In the cases discussed by Badral in this thread nothing is broken; he
> would just prefer to swap the default and FVS1 glyphs because he says
> the FVS1 forms are more frequent than the default forms.  I would
> strongly prefer not to change FVS assignments if there is not actually
> a problem that needs fixing, as the costs would outweigh the benefits,
> but if this group feels otherwise it can always propose such FVS
> changes and see what the committees think.
>
> Andrew
>
>
>
> On 26 October 2015 at 00:43, Michel Suignard <michel@suignard.com> wrote:
>> Andrew and Badral
>>
>> I think a large majority believes that the best thing we can do is to
>> correct all the imprecision in the Unicode block description and the FVS
>> list as soon as possible. I doubt there is much existing data using the
>> current FVS because of its inherent instability across the various
>> implementations. Over many UTC meetings there have been various
>> communications about the defects of the current FVS sequences and the fact
>> that a new version was forthcoming. I even had an action item on that
>> specific issue for many UTCs that I left undone for lack of time and,
>> frankly, expertise (but I knew that the FVS described in Unicode and 10646
>> were lacking). In fact, I have been working on a new version of the code
>> chart for a long time, and with this ongoing discussion it is my hope that
>> we will have a consensus in time for Unicode 9.0. I have done some work
>> using the various versions of the DS01 document created by Greg Eck and am
>> planning to release a new version of the Mongolian code charts as soon as a
>> reasonable consensus is achieved on this list. Hopefully sometime in
>> November.
>>
>>
>>
>> In other words, I am convinced that UTC is very receptive to fixing the FVS
>> situation, not the other way around. The sooner we fix it, the less damage.
>> WG2 is another matter, although again I think more people will be in favor
>> of fixing broken things there as well (traditionally ISO is far less
>> concerned about stability than Unicode is).
>>
>>
>>
>> Michel
>>
>> From: Badral S. [mailto:badral@bolorsoft.com]
>> Sent: Sunday, October 25, 2015 1:02 PM
>> To: public-i18n-mongolian@w3.org
>> Subject: Re: Issues with DA,NA,GA default medial variants
>>
>>
>>
>> Hi Andrew,
>> I didn't say that Unicode-encoded Mongolian data or websites do not exist.
>> Certainly, a significant amount of such data exists. Bolorsoft also
>> creates Mongolian data and websites in Unicode. I only mentioned that these
>> are already unstable and not large in comparison with custom-encoded
>> Mongolian data.
>> My question was worded incorrectly due to my poor English. I should have
>> written "Why should we not vote for the correct variants of Da, Na, Ga as
>> the default?" because we never defined the current default variants. Every
>> font developer has implemented his fonts from his own perspective.
>> For instance, the Mongolianscript (since 2000) and Noto Sans fonts have
>> always implemented the before-vowel variants of Na, Ga, Da as the default
>> medial form. However, the Mongolian Baiti and Mongolian White fonts have
>> the before-consonant forms as default. If I understand correctly, we now
>> want to harmonize such diverse variants? If yes, why should we select the
>> incorrect variants?
>> Which is more significant: the destabilization of already unstable data, or
>> a future-oriented, correct and effective variant?
>> If possible, could we just ask the committees?
>>
>> PS: I want to note why I speak so much about correctness and a
>> future-oriented solution: the Mongolian Language law
>> (http://www.parliament.mn/laws?key=%D0%BC#2543) has been adopted by the
>> Mongolian parliament. The usage of Mongolian script in Mongolia is now
>> increasing. By 2025, all state and governmental organizations are to
>> conduct their correspondence and public affairs in both Mongolian script
>> and Cyrillic.
>>
>> Badral
>>
>> On 25.10.2015 19:44, Andrew West wrote:
>>
>> Hi Badral,
>>
>>
>>
>> There are still a significant number of websites using Unicode-encoded
>> Mongolian, and an unknown amount of Unicode Mongolian data that is not
>> online, and changing the meaning of any FVS will have a negative
>> impact on and a cost to people maintaining Unicode Mongolian data and
>> websites.  I do not speak for the UTC or WG2, but I think it is
>> highly unlikely that these committees would agree to switch any FVS
>> definition without a very compelling reason.
>>
>>
>>
>> Andrew
>>
>>
>>
>>
>>
>>
>>
>> On 25 October 2015 at 16:47, Badral S. <badral@bolorsoft.com> wrote:
>>
>> Hi Andrew & Greg,
>>
>> I think the impact is slight because:
>>
>> 1. Most existing Mongolian data still uses its own (non-Unicode) encoding.
>> In Mongolia, the fonts CM Urga, Ulaanbaatar, etc. are mostly used. For
>> instance: http://www.president.mn/mng, http://khumuunbichig.montsame.mn ...
>> In Inner Mongolia, Menksoft's solution is mostly used. Menksoft's
>> representatives, please comment.
>>
>> 2. Most Mongolian Unicode data was created using the Mongolian script font,
>> which has had the correct default variants for 15 years. In Inner Mongolia,
>> Mongolian Baiti is probably used. Mongolian Baiti was, and is, itself very
>> unstable. For instance, as far as I know, in 2011 it encoded "Bichig" as
>> "Bichig+fvs1", didn't it? This means the existing Mongolian Unicode data is
>> itself not really stable. If we change to the correct variants, we would
>> implement a normalisation tool for Unicode Mongolian data and distribute it
>> freely.
>>
>> 3. I tend to think that the current default forms are not standardized
>> globally. If I am wrong, can you point me to some references?
>>
>>
>>
>> Badral
>>
>>
>>
>>
>>
>> On 25.10.2015 13:48, Andrew West wrote:
>>
>>
>>
>> On 25 October 2015 at 03:11, Badral S. <badral@bolorsoft.com> wrote:
>>
>>
>>
>> 1. Why should we not switch the current U+1828 medial and U+1828 medial +
>> FVS1?
>> 2. Why should we not switch the current U+1833 medial and U+1833 medial +
>> FVS1?
>> 3. Why should we not switch the current U+182D medial and U+182D medial +
>> FVS1?
>>
>>
>>
>> Because it would destabilize existing Mongolian data.  In my opinion,
>> we should not switch existing FVS's, even when the alternative would
>> have made more sense for the reasons you mention.
>>
>>
>>
>> Andrew
>>
>>
>>
>>
>>
>>
>>
>> --
>>
>> Badral Sanlig, Software architect
>>
>> www.bolorsoft.com | www.badral.net
>>
>> Bolorsoft LLC, Selbe Khotkhon 40/4 D2, District 11, Ulaanbaatar


-- 
Badral Sanlig, Software architect
www.bolorsoft.com | www.badral.net
Bolorsoft LLC, Selbe Khotkhon 40/4 D2, District 11, Ulaanbaatar

Received on Monday, 26 October 2015 10:45:55 UTC