
Searching (was: FVS Assignment MisMatch)

From: Richard Wordingham <richard.wordingham@ntlworld.com>
Date: Mon, 3 Aug 2015 19:06:48 +0100
Cc: <public-i18n-mongolian@w3.org>
Message-ID: <20150803190648.6d64d2f9@JRWUBU2>

On Mon, 3 Aug 2015 08:16:26 +0900
<jrmt@almas.co.jp> wrote:

> Dear Mr. Richard
> > Is that true?  There may be more than two spellings that look the
> > same, but do they *sound* the same? As I understand it, the
> > Mongolian encoding represents sounds as well as appearance.  Are
> > Mongolian dictionaries sorted according to sound or according to
> > visual form?
> Yes, you are right. They sound different, and dictionaries list the
> words by their *sound*. But most Mongolian people cannot tell exactly
> which word is which. Even linguistic experts make mistakes without a
> dictionary. And dictionaries sometimes list them in different
> positions, according to the author's point of view. For this reason,
> publicly available text contains many misspelled words. This is no
> problem when people read them, but when we search on Google, we have
> to search for each possible spelling. For example, we have to search
> for the word "Mongolian", ᠮᠣᠩᠭᠤᠯ, at least four times.

What is needed is some sort of folding, in the same way that Google
ignores the difference between upper and lower case and often ignores
diacritics.  As a first approximation one should ignore the differences
A v. E, O v. U, and OE v. UE.  Possibly O and OE should also be folded;
that is where it becomes complicated.  Several consonant pairs should
also be folded, though designing that properly may be difficult.
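
To make the idea concrete, here is a rough sketch of such a folding as a
search-key function.  It is only an illustration under assumptions, not a
worked-out design: the names (make_folder, fold_o_oe, extra_pairs) are
made up for this example, the choice of which letter in each pair serves
as the representative is arbitrary, and no consonant pairs are filled in
because none are specified above.  The code points are those of the
Unicode Mongolian block.

```python
# Mongolian vowel letters (Unicode Mongolian block).
A, E = "\u1820", "\u1821"
O, U = "\u1823", "\u1824"
OE, UE = "\u1825", "\u1826"

# First-approximation folds: A v. E, O v. U, OE v. UE.
# Each pair is mapped to its first member; that choice is arbitrary.
BASE_FOLDS = {E: A, U: O, UE: OE}


def make_folder(fold_o_oe=False, extra_pairs=None):
    """Return a function mapping text to a folded search key.

    fold_o_oe:   hypothetical switch; if True, O, U, OE and UE all
                 collapse to O (the "possibly" case).
    extra_pairs: hypothetical hook for consonant folds, given as a
                 {letter: representative} dict; left empty here.
    """
    table = dict(BASE_FOLDS)
    if extra_pairs:
        table.update(extra_pairs)
    if fold_o_oe:
        # Map both rounded front vowels straight to O so that a single
        # translation pass is enough.
        table[OE] = O
        table[UE] = O
    trans = str.maketrans(table)
    return lambda text: text.translate(trans)


# Two spellings that differ only in O v. U: MA O ANG GA U LA and
# MA U ANG GA O LA.
spelling_1 = "\u182E\u1823\u1829\u182D\u1824\u182F"
spelling_2 = "\u182E\u1824\u1829\u182D\u1823\u182F"

fold = make_folder()
same_key = fold(spelling_1) == fold(spelling_2)
```

A search engine would index the folded key alongside the original text
and fold the query the same way, so one query matches every spelling in
the same class.  Whether O and OE belong in the same class is exactly
the open question, which is why it is a switch here and not the default.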

Received on Monday, 3 August 2015 18:07:19 UTC
