- From: Badral S. <badral@bolorsoft.com>
- Date: Thu, 12 Nov 2015 21:21:36 +0100
- To: public-i18n-mongolian@w3.org
- Message-ID: <5644F4D0.8010805@bolorsoft.com>
Hi Greg,

These words are not stems. dugaar, dugeer, daa, dee, and dahi are suffixes, which is why their frequency is high. dun and dugnelt also occur frequently.

Badral

On 26.10.2015 14:47, Greg Eck wrote:
> I meant to add the images of the 27 items using FVS1 …
> Greg
>
> Here is my count of FVSx in my stem database:
> Total items – 17687
> Total foreign – 4561
> Total non-foreign – 13126
> Total FVS1 count – 27
> Total FVS2 count – 0
> Total FVS3 count – 1
> This assumes the use of Mongolian Baiti as per our current test version (not shipped yet)
> Greg
>
> Lists below come from the BabelPad reporting tools – Thanks Andrew ….
>
> *Character usage count*
> Code point  Character  Character Name  Count
> 0000A0    NO-BREAK SPACE  76
> *00180B  ᠋  MONGOLIAN FREE VARIATION SELECTOR ONE  27*
> *00180D  ᠍  MONGOLIAN FREE VARIATION SELECTOR THREE  1*
> 00180E    MONGOLIAN VOWEL SEPARATOR  981
> 001820  ᠠ  MONGOLIAN LETTER A  12,915
> 001821  ᠡ  MONGOLIAN LETTER E  7,707
> 001822  ᠢ  MONGOLIAN LETTER I  8,277
> 001823  ᠣ  MONGOLIAN LETTER O  1,956
> 001824  ᠤ  MONGOLIAN LETTER U  6,155
> 001825  ᠥ  MONGOLIAN LETTER OE  1,017
> 001826  ᠦ  MONGOLIAN LETTER UE  3,512
> 001827  ᠧ  MONGOLIAN LETTER EE  11
> 001828  ᠨ  MONGOLIAN LETTER NA  3,020
> 001829  ᠩ  MONGOLIAN LETTER ANG  1,138
> 00182A  ᠪ  MONGOLIAN LETTER BA  3,089
> 00182B  ᠫ  MONGOLIAN LETTER PA  94
> 00182C  ᠬ  MONGOLIAN LETTER QA  4,033
> 00182D  ᠭ  MONGOLIAN LETTER GA  8,341
> 00182E  ᠮ  MONGOLIAN LETTER MA  2,157
> 00182F  ᠯ  MONGOLIAN LETTER LA  5,752
> 001830  ᠰ  MONGOLIAN LETTER SA  3,226
> 001831  ᠱ  MONGOLIAN LETTER SHA  299
> 001832  ᠲ  MONGOLIAN LETTER TA  2,982
> 001833  ᠳ  MONGOLIAN LETTER DA  3,853
> 001834  ᠴ  MONGOLIAN LETTER CHA  2,227
> 001835  ᠵ  MONGOLIAN LETTER JA  2,340
> 001836  ᠶ  MONGOLIAN LETTER YA  1,691
> 001837  ᠷ  MONGOLIAN LETTER RA  5,844
> 001838  ᠸ  MONGOLIAN LETTER WA  29
> 00183A  ᠺ  MONGOLIAN LETTER KA  11
> 00183C  ᠼ  MONGOLIAN LETTER TSA  1
> 00183D  ᠽ  MONGOLIAN LETTER ZA  1
> 001840  ᡀ  MONGOLIAN LETTER LHA  1
> 00202F    NARROW NO-BREAK SPACE  281
>
> *Stem list using FVS1*
> ᠠ᠋
> ᠠ᠋
> ᠣᠳᠠᠭᠠᠨᠲ᠋ᠡᠩᠷᠢ
> ᠫᠢᠨᠲ᠋ᠦᠦ
> ᠫᠦᠨᠲ᠋ᠢᠦᠵᠡ
> ᠬᠦᠴᠦᠯᠲᠦ᠋ᠷᠦᠭᠴᠢ
> ᠳ᠋ᠠ
> ᠳ᠋ᠠ
> ᠳ᠋ᠠᠬᠢ
> ᠳ᠋ᠡ
> ᠳ᠋ᠡ
> ᠳ᠋ᠡᠩᠳ᠋ᠤᠩ
> ᠳ᠋ᠡᠩᠲᠡᠢᠳ᠋ᠤᠩᠲᠠᠢ
> ᠳ᠋ᠣᠩᠭ ᠠ
> ᠳ᠋ᠤᠭᠠᠷ
> ᠳ᠋ᠤᠭᠠᠷᠯᠠᠯ
> ᠳ᠋ᠦᠨᠵᠡ
> ᠳ᠋ᠦᠩ
> ᠳ᠋ᠦᠩᠨᠡᠯᠲᠡ
> ᠳ᠋ᠦᠩᠰᠢᠭᠦᠷ
> ᠳ᠋ᠦᠭᠡᠷ
> ᠳ᠋ᠧᠩᠯᠦ
> ᠵᠠᠰᠲ᠋ᠠᠸ
> ᠶᠠᠪᠤᠭᠠᠨᠳ᠋ᠠᠭᠠᠨ
> ᠶᠡᠷᠦᠳ᠋ᠡᠭᠡᠨ
>
> Conversion of FVS list above to code-point
> U+1820 U+180B
> U+1820 U+180B
> U+1823 U+1833 U+1820 U+182D U+1820 U+1828 U+1832 U+180B U+1821 U+1829 U+1837 U+1822
> U+182B U+1822 U+1828 U+1832 U+180B U+1826 U+1826
> U+182B U+1826 U+1828 U+1832 U+180B U+1822 U+1826 U+1835 U+1821
> U+182C U+1826 U+1834 U+1826 U+182F U+1832 U+1826 U+180B U+1837 U+1826 U+182D U+1834 U+1822
> U+1833 U+180B U+1820
> U+1833 U+180B U+1820
> U+1833 U+180B U+1820 U+182C U+1822
> U+1833 U+180B U+1821
> U+1833 U+180B U+1821
> U+1833 U+180B U+1821 U+1829 U+1833 U+180B U+1824 U+1829
> U+1833 U+180B U+1821 U+1829 U+1832 U+1821 U+1822 U+1833 U+180B U+1824 U+1829 U+1832 U+1820 U+1822
> U+1833 U+180B U+1823 U+1829 U+182D U+180E U+1820
> U+1833 U+180B U+1824 U+182D U+1820 U+1837
> U+1833 U+180B U+1824 U+182D U+1820 U+1837 U+182F U+1820 U+182F
> U+1833 U+180B U+1826 U+1828 U+1835 U+1821
> U+1833 U+180B U+1826 U+1829
> U+1833 U+180B U+1826 U+1829 U+1828 U+1821 U+182F U+1832 U+1821
> U+1833 U+180B U+1826 U+1829 U+1830 U+1822 U+182D U+1826 U+1837
> U+1833 U+180B U+1826 U+182D U+1821 U+1837
> U+1833 U+180B U+1827 U+1829 U+182F U+1826
> U+1835 U+1820 U+1830 U+1832 U+180B U+1820 U+1838
> U+1836 U+1820 U+182A U+1824 U+182D U+1820 U+1828 U+1833 U+180B U+1820 U+182D U+1820 U+1828
> U+1836 U+1821 U+1837 U+1826 U+1833 U+180B U+1821 U+182D U+1821 U+1828
>
> *>>>>>*
> *Sent:* Monday, October 26, 2015 8:51 PM
> *Subject:* RE: Issues with DA,NA,GA default medial variants
> I am still reading through the emails of the day, so will take a bit to respond.
> One thing that is a bit alarming however is the concern about FVS usage. I consider the amount of FVS usage in daily contemporary language to be pretty low.
> Can we do this so that we have some quantifiable data to compare against. *Let's each of us take our lexical stem database and count the FVS1/2/3 usage. I will start and be back with you shortly. If we could sort out foreign words that would be even better.* How difficult would this be for either of you?
> Others of course are welcome to join in.
> Greg
> >>>>>

--
Badral Sanlig, Software architect
www.bolorsoft.com | www.badral.net
Bolorsoft LLC, Selbe Khotkhon 40/4 D2, District 11, Ulaanbaatar
Received on Thursday, 12 November 2015 20:22:08 UTC
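
The FVS1/2/3 count requested in the quoted message can be reproduced on any stem list with a few lines of Python. This is only a minimal sketch, not the BabelPad procedure used in the thread: the names `count_fvs`, `to_code_points`, and `sample` are hypothetical, and the three sample stems are copied from the code-point list above, written as escapes so the invisible selectors stay visible.

```python
from collections import Counter

# Mongolian free variation selectors (U+180B, U+180C, U+180D).
FVS = {"\u180b": "FVS1", "\u180c": "FVS2", "\u180d": "FVS3"}


def count_fvs(stems):
    """Count each free variation selector across an iterable of stems."""
    counts = Counter({name: 0 for name in FVS.values()})
    for stem in stems:
        for ch in stem:
            if ch in FVS:
                counts[FVS[ch]] += 1
    return counts


def to_code_points(stem):
    """Render a stem as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in stem)


# Three stems taken from the FVS1 list in the message (illustrative subset).
sample = [
    "\u1833\u180b\u1820",                                # DA + FVS1 + A
    "\u1833\u180b\u1821\u1829\u1833\u180b\u1824\u1829",  # two FVS1 in one stem
    "\u1835\u1820\u1830\u1832\u180b\u1820\u1838",        # FVS1 after TA
]

counts = count_fvs(sample)
print(counts["FVS1"], counts["FVS2"], counts["FVS3"])  # 4 0 0
print(to_code_points(sample[0]))                       # U+1833 U+180B U+1820
```

Counting selectors per character rather than per stem matters here: the list above has 25 stems but 27 occurrences of FVS1, because two stems carry the selector twice.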