Re: Issues with DA,NA,GA default medial variants

Hi Greg,
These words are not stems. dugaar, dugeer, daa, dee and dahi are suffixes,
so their frequency is high. dun and dugnelt also occur frequently.

Badral

On 26.10.2015 14:47, Greg Eck wrote:
> I meant to add the images of the 27 items using FVS1 …
> Greg
> Here is my count of FVSx in my stem database:
> Total items – 17687
> Total foreign – 4561
> Total non-foreign – 13126
> Total FVS1 count – 27
> Total FVS2 count – 0
> Total FVS3 count – 1
> This assumes the use of Mongolian Baiti as per our current test 
> version (not shipped yet)
> Greg
> Lists below come from the BabelPad reporting tools – Thanks Andrew ….
> *Character usage count*
> Code point      Character Character Name  Count
> 0000A0          NO-BREAK SPACE  76
> *00180B **᠋ **MONGOLIAN FREE VARIATION SELECTOR ONE   27*
> *00180D **᠍ **MONGOLIAN FREE VARIATION SELECTOR THREE 1*
> 00180E   MONGOLIAN VOWEL SEPARATOR 981
> 001820 ᠠ MONGOLIAN LETTER A      12,915
> 001821 ᠡ MONGOLIAN LETTER E      7,707
> 001822 ᠢ MONGOLIAN LETTER I      8,277
> 001823 ᠣ MONGOLIAN LETTER O      1,956
> 001824 ᠤ MONGOLIAN LETTER U      6,155
> 001825 ᠥ MONGOLIAN LETTER OE     1,017
> 001826 ᠦ MONGOLIAN LETTER UE     3,512
> 001827 ᠧ MONGOLIAN LETTER EE     11
> 001828 ᠨ MONGOLIAN LETTER NA     3,020
> 001829 ᠩ MONGOLIAN LETTER ANG    1,138
> 00182A ᠪ MONGOLIAN LETTER BA     3,089
> 00182B ᠫ MONGOLIAN LETTER PA     94
> 00182C ᠬ MONGOLIAN LETTER QA     4,033
> 00182D ᠭ MONGOLIAN LETTER GA     8,341
> 00182E ᠮ MONGOLIAN LETTER MA     2,157
> 00182F ᠯ MONGOLIAN LETTER LA     5,752
> 001830 ᠰ MONGOLIAN LETTER SA     3,226
> 001831 ᠱ MONGOLIAN LETTER SHA    299
> 001832 ᠲ MONGOLIAN LETTER TA     2,982
> 001833 ᠳ MONGOLIAN LETTER DA     3,853
> 001834 ᠴ MONGOLIAN LETTER CHA    2,227
> 001835 ᠵ MONGOLIAN LETTER JA     2,340
> 001836 ᠶ MONGOLIAN LETTER YA     1,691
> 001837 ᠷ MONGOLIAN LETTER RA     5,844
> 001838 ᠸ MONGOLIAN LETTER WA     29
> 00183A ᠺ MONGOLIAN LETTER KA     11
> 00183C ᠼ MONGOLIAN LETTER TSA    1
> 00183D ᠽ MONGOLIAN LETTER ZA     1
> 001840 ᡀ MONGOLIAN LETTER LHA    1
> 00202F          NARROW NO-BREAK SPACE   281
> *Stem list using FVS1*
> ᠠ᠋
> ᠠ᠋
> ᠣᠳᠠᠭᠠᠨᠲ᠋ᠡᠩᠷᠢ
> ᠫᠢᠨᠲ᠋ᠦᠦ
> ᠫᠦᠨᠲ᠋ᠢᠦᠵᠡ
> ᠬᠦᠴᠦᠯᠲᠦ᠋ᠷᠦᠭᠴᠢ
> ᠳ᠋ᠠ
> ᠳ᠋ᠠ
> ᠳ᠋ᠠᠬᠢ
> ᠳ᠋ᠡ
> ᠳ᠋ᠡ
> ᠳ᠋ᠡᠩᠳ᠋ᠤᠩ
> ᠳ᠋ᠡᠩᠲᠡᠢᠳ᠋ᠤᠩᠲᠠᠢ
> ᠳ᠋ᠣᠩᠭ ᠠ
> ᠳ᠋ᠤᠭᠠᠷ
> ᠳ᠋ᠤᠭᠠᠷᠯᠠᠯ
> ᠳ᠋ᠦᠨᠵᠡ
> ᠳ᠋ᠦᠩ
> ᠳ᠋ᠦᠩᠨᠡᠯᠲᠡ
> ᠳ᠋ᠦᠩᠰᠢᠭᠦᠷ
> ᠳ᠋ᠦᠭᠡᠷ
> ᠳ᠋ᠧᠩᠯᠦ
> ᠵᠠᠰᠲ᠋ᠠᠸ
> ᠶᠠᠪᠤᠭᠠᠨᠳ᠋ᠠᠭᠠᠨ
> ᠶᠡᠷᠦᠳ᠋ᠡᠭᠡᠨ
> Conversion of FVS list above to code-point
> U+1820 U+180B
> U+1820 U+180B
> U+1823 U+1833 U+1820 U+182D U+1820 U+1828  U+1832 U+180B U+1821 U+1829 
> U+1837 U+1822
> U+182B U+1822 U+1828 U+1832 U+180B U+1826 U+1826
> U+182B U+1826 U+1828 U+1832 U+180B U+1822 U+1826 U+1835 U+1821
> U+182C U+1826 U+1834 U+1826 U+182F U+1832 U+1826 U+180B U+1837 U+1826 
> U+182D U+1834 U+1822
> U+1833 U+180B U+1820
> U+1833 U+180B U+1820
> U+1833 U+180B U+1820 U+182C U+1822
> U+1833 U+180B U+1821
> U+1833 U+180B U+1821
> U+1833 U+180B U+1821 U+1829  U+1833 U+180B U+1824 U+1829
> U+1833 U+180B U+1821 U+1829 U+1832 U+1821 U+1822  U+1833 U+180B U+1824 
> U+1829 U+1832 U+1820 U+1822
> U+1833 U+180B U+1823 U+1829 U+182D U+180E U+1820
> U+1833 U+180B U+1824 U+182D U+1820 U+1837
> U+1833 U+180B U+1824 U+182D U+1820 U+1837 U+182F U+1820 U+182F
> U+1833 U+180B U+1826 U+1828 U+1835 U+1821
> U+1833 U+180B U+1826 U+1829
> U+1833 U+180B U+1826 U+1829 U+1828 U+1821 U+182F U+1832 U+1821
> U+1833 U+180B U+1826 U+1829 U+1830 U+1822 U+182D U+1826 U+1837
> U+1833 U+180B U+1826 U+182D U+1821 U+1837
> U+1833 U+180B U+1827 U+1829 U+182F U+1826
> U+1835 U+1820 U+1830 U+1832 U+180B U+1820 U+1838
> U+1836 U+1820 U+182A U+1824 U+182D U+1820 U+1828  U+1833 U+180B U+1820 
> U+182D U+1820 U+1828
> U+1836 U+1821 U+1837 U+1826  U+1833 U+180B U+1821 U+182D U+1821 U+1828
>
> *>>>>>*
> *Sent:* Monday, October 26, 2015 8:51 PM
> *Subject:* RE: Issues with DA,NA,GA default medial variants
> I am still reading through the emails of the day, so will take a bit 
> to respond.
> One thing that is a bit alarming however is the concern about FVS 
> usage. I consider the amount of FVS usage in daily contemporary 
> language to be pretty low. Can we do this so that we have some 
> quantifiable data to compare against? *Let's each of us take our 
> lexical stem database and count the FVS1/2/3 usage. I will start and 
> be back with you shortly. If we could sort out foreign words that 
> would be even better.* How difficult would this be for either of you? 
> Others of course are welcome to join in.
> Greg
> >>>>>
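
For anyone who wants to repeat the count on their own data, here is a minimal sketch of the kind of tally described in the quoted thread. It is not the BabelPad report and not anyone's actual tooling. The function names (count_fvs, to_code_points) and the sample list are made up for illustration, and it assumes the database can be read as a plain list of Unicode strings. It counts selector occurrences rather than entries, so an entry containing FVS1 twice contributes two.

```python
from collections import Counter

# Mongolian Free Variation Selectors: U+180B, U+180C, U+180D.
FVS = {"\u180B": "FVS1", "\u180C": "FVS2", "\u180D": "FVS3"}


def count_fvs(entries):
    """Count FVS1/2/3 occurrences and collect the entries that contain any."""
    counts = Counter({name: 0 for name in FVS.values()})
    hits = []
    for entry in entries:
        found = [FVS[ch] for ch in entry if ch in FVS]
        if found:
            counts.update(found)
            hits.append(entry)
    return counts, hits


def to_code_points(text):
    """Render a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Hypothetical sample: three sequences taken from the code-point list in the
# quoted message, plus one made-up entry without any selector.
sample = [
    "\u1833\u180B\u1820",        # U+1833 U+180B U+1820
    "\u1833\u180B\u1826\u1829",  # U+1833 U+180B U+1826 U+1829
    "\u1820\u180B",              # U+1820 U+180B
    "\u1820\u1820",              # no selector (illustrative only)
]

counts, hits = count_fvs(sample)
for name in ("FVS1", "FVS2", "FVS3"):
    print(name, counts[name])
for entry in hits:
    print(to_code_points(entry))
```

Whether suffixes and foreign words are counted separately depends on how the database marks them, which this sketch does not attempt to guess.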


-- 
Badral Sanlig, Software architect
www.bolorsoft.com | www.badral.net
Bolorsoft LLC, Selbe Khotkhon 40/4 D2, District 11, Ulaanbaatar

Received on Thursday, 12 November 2015 20:22:08 UTC