Re: Issues with DA,NA,GA default medial variants

From: Badral S. <badral@bolorsoft.com>
Date: Thu, 12 Nov 2015 21:21:36 +0100
To: public-i18n-mongolian@w3.org
Message-ID: <5644F4D0.8010805@bolorsoft.com>
Hi Greg,
These words are not stems: dugaar, dugeer, daa, dee, and dahi are suffixes,
so their frequency is high. dun and dugnelt also occur frequently.

Badral

On 26.10.2015 14:47, Greg Eck wrote:
> I meant to add the images of the 27 items using FVS1 …
> Greg
> Here is my count of FVSx in my stem database:
> Total items – 17687
> Total foreign – 4561
> Total non-foreign – 13126
> Total FVS1 count – 27
> Total FVS2 count – 0
> Total FVS3 count – 1
> This assumes the use of Mongolian Baiti as per our current test 
> version (not shipped yet)
> Greg
> Lists below come from the BabelPad reporting tools – Thanks Andrew ….
> *Character usage count*
> Code point      Character Character Name  Count
> 0000A0          NO-BREAK SPACE  76
> *00180B ᠋ MONGOLIAN FREE VARIATION SELECTOR ONE   27*
> *00180D ᠍ MONGOLIAN FREE VARIATION SELECTOR THREE 1*
> 00180E ᠎ MONGOLIAN VOWEL SEPARATOR 981
> 001820 ᠠ MONGOLIAN LETTER A      12,915
> 001821 ᠡ MONGOLIAN LETTER E      7,707
> 001822 ᠢ MONGOLIAN LETTER I      8,277
> 001823 ᠣ MONGOLIAN LETTER O      1,956
> 001824 ᠤ MONGOLIAN LETTER U      6,155
> 001825 ᠥ MONGOLIAN LETTER OE     1,017
> 001826 ᠦ MONGOLIAN LETTER UE     3,512
> 001827 ᠧ MONGOLIAN LETTER EE     11
> 001828 ᠨ MONGOLIAN LETTER NA     3,020
> 001829 ᠩ MONGOLIAN LETTER ANG    1,138
> 00182A ᠪ MONGOLIAN LETTER BA     3,089
> 00182B ᠫ MONGOLIAN LETTER PA     94
> 00182C ᠬ MONGOLIAN LETTER QA     4,033
> 00182D ᠭ MONGOLIAN LETTER GA     8,341
> 00182E ᠮ MONGOLIAN LETTER MA     2,157
> 00182F ᠯ MONGOLIAN LETTER LA     5,752
> 001830 ᠰ MONGOLIAN LETTER SA     3,226
> 001831 ᠱ MONGOLIAN LETTER SHA    299
> 001832 ᠲ MONGOLIAN LETTER TA     2,982
> 001833 ᠳ MONGOLIAN LETTER DA     3,853
> 001834 ᠴ MONGOLIAN LETTER CHA    2,227
> 001835 ᠵ MONGOLIAN LETTER JA     2,340
> 001836 ᠶ MONGOLIAN LETTER YA     1,691
> 001837 ᠷ MONGOLIAN LETTER RA     5,844
> 001838 ᠸ MONGOLIAN LETTER WA     29
> 00183A ᠺ MONGOLIAN LETTER KA     11
> 00183C ᠼ MONGOLIAN LETTER TSA    1
> 00183D ᠽ MONGOLIAN LETTER ZA     1
> 001840 ᡀ MONGOLIAN LETTER LHA    1
> 00202F          NARROW NO-BREAK SPACE   281
> *Stem list using FVS1*
> ᠠ᠋
> ᠠ᠋
> ᠣᠳᠠᠭᠠᠨᠲ᠋ᠡᠩᠷᠢ
> ᠫᠢᠨᠲ᠋ᠦᠦ
> ᠫᠦᠨᠲ᠋ᠢᠦᠵᠡ
> ᠬᠦᠴᠦᠯᠲᠦ᠋ᠷᠦᠭᠴᠢ
> ᠳ᠋ᠠ
> ᠳ᠋ᠠ
> ᠳ᠋ᠠᠬᠢ
> ᠳ᠋ᠡ
> ᠳ᠋ᠡ
> ᠳ᠋ᠡᠩᠳ᠋ᠤᠩ
> ᠳ᠋ᠡᠩᠲᠡᠢᠳ᠋ᠤᠩᠲᠠᠢ
> ᠳ᠋ᠣᠩᠭ᠎ᠠ
> ᠳ᠋ᠤᠭᠠᠷ
> ᠳ᠋ᠤᠭᠠᠷᠯᠠᠯ
> ᠳ᠋ᠦᠨᠵᠡ
> ᠳ᠋ᠦᠩ
> ᠳ᠋ᠦᠩᠨᠡᠯᠲᠡ
> ᠳ᠋ᠦᠩᠰᠢᠭᠦᠷ
> ᠳ᠋ᠦᠭᠡᠷ
> ᠳ᠋ᠧᠩᠯᠦ
> ᠵᠠᠰᠲ᠋ᠠᠸ
> ᠶᠠᠪᠤᠭᠠᠨᠳ᠋ᠠᠭᠠᠨ
> ᠶᠡᠷᠦᠳ᠋ᠡᠭᠡᠨ
> Conversion of FVS list above to code-point
> U+1820 U+180B
> U+1820 U+180B
> U+1823 U+1833 U+1820 U+182D U+1820 U+1828  U+1832 U+180B U+1821 U+1829 
> U+1837 U+1822
> U+182B U+1822 U+1828 U+1832 U+180B U+1826 U+1826
> U+182B U+1826 U+1828 U+1832 U+180B U+1822 U+1826 U+1835 U+1821
> U+182C U+1826 U+1834 U+1826 U+182F U+1832 U+1826 U+180B U+1837 U+1826 
> U+182D U+1834 U+1822
> U+1833 U+180B U+1820
> U+1833 U+180B U+1820
> U+1833 U+180B U+1820 U+182C U+1822
> U+1833 U+180B U+1821
> U+1833 U+180B U+1821
> U+1833 U+180B U+1821 U+1829  U+1833 U+180B U+1824 U+1829
> U+1833 U+180B U+1821 U+1829 U+1832 U+1821 U+1822  U+1833 U+180B U+1824 
> U+1829 U+1832 U+1820 U+1822
> U+1833 U+180B U+1823 U+1829 U+182D U+180E U+1820
> U+1833 U+180B U+1824 U+182D U+1820 U+1837
> U+1833 U+180B U+1824 U+182D U+1820 U+1837 U+182F U+1820 U+182F
> U+1833 U+180B U+1826 U+1828 U+1835 U+1821
> U+1833 U+180B U+1826 U+1829
> U+1833 U+180B U+1826 U+1829 U+1828 U+1821 U+182F U+1832 U+1821
> U+1833 U+180B U+1826 U+1829 U+1830 U+1822 U+182D U+1826 U+1837
> U+1833 U+180B U+1826 U+182D U+1821 U+1837
> U+1833 U+180B U+1827 U+1829 U+182F U+1826
> U+1835 U+1820 U+1830 U+1832 U+180B U+1820 U+1838
> U+1836 U+1820 U+182A U+1824 U+182D U+1820 U+1828  U+1833 U+180B U+1820 
> U+182D U+1820 U+1828
> U+1836 U+1821 U+1837 U+1826  U+1833 U+180B U+1821 U+182D U+1821 U+1828
>
> *>>>>>*
> *Sent:* Monday, October 26, 2015 8:51 PM
> *Subject:* RE: Issues with DA,NA,GA default medial variants
> I am still reading through the emails of the day, so will take a bit 
> to respond.
> One thing that is a bit alarming however is the concern about FVS 
> usage. I consider the amount of FVS usage in daily contemporary 
> language to be pretty low. Can we do this so that we have some 
> quantifiable data to compare against? *Let's each of us take our 
> lexical stem database and count the FVS1/2/3 usage. I will start and 
> be back with you shortly. If we could sort out foreign words that 
> would be even better.* How difficult would this be for either of you? 
> Others of course are welcome to join in.
> Greg
> >>>>>
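
For anyone who wants to repeat the count on their own data, here is a minimal
sketch in Python of the kind of tally described above. It is not the BabelPad
report and not Greg's actual tooling. The function names (count_fvs,
stems_using, to_code_points) are made up for illustration, and it assumes the
stem database can be read as a plain list of Unicode strings. The three sample
entries are taken from the code-point list quoted above. Note that it counts
selector characters, not entries, so an entry with two FVS1 counts twice.

```python
from collections import Counter

# Mongolian free variation selectors, keyed by code point.
FVS = {"\u180B": "FVS1", "\u180C": "FVS2", "\u180D": "FVS3"}


def count_fvs(stems):
    """Count FVS1/2/3 character occurrences across an iterable of strings."""
    counts = Counter({name: 0 for name in FVS.values()})
    for stem in stems:
        for ch in stem:
            if ch in FVS:
                counts[FVS[ch]] += 1
    return counts


def stems_using(stems, selector="\u180B"):
    """Return the entries that contain the given selector (FVS1 by default)."""
    return [s for s in stems if selector in s]


def to_code_points(text):
    """Render a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Three entries copied from the quoted code-point list (illustrative subset,
# not the full stem database).
sample = [
    "\u1833\u180B\u1820",
    "\u1833\u180B\u1824\u182D\u1820\u1837",
    "\u1833\u180B\u1823\u1829\u182D\u180E\u1820",
]

if __name__ == "__main__":
    print(dict(count_fvs(sample)))
    for entry in stems_using(sample):
        print(to_code_points(entry))
```

Separating foreign from non-foreign entries, or stems from suffixes, would
need a tag in the database itself; the sketch does not attempt that.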


-- 
Badral Sanlig, Software architect
www.bolorsoft.com | www.badral.net
Bolorsoft LLC, Selbe Khotkhon 40/4 D2, District 11, Ulaanbaatar
Received on Thursday, 12 November 2015 20:22:08 UTC
