- From: Andrew West <andrewcwest@gmail.com>
- Date: Tue, 4 Aug 2015 09:15:26 +0100
- To: Richard Wordingham <richard.wordingham@ntlworld.com>
- Cc: public-i18n-mongolian@w3.org
On 3 August 2015 at 23:31, Richard Wordingham <richard.wordingham@ntlworld.com> wrote:
>
> Does the Mongolian alphabet as claimed by the Unicode
> codepoints work away from computers? What I get from that list is
> the idea that something much closer to the Semitic original, largely
> based on shape, should have been encoded.

Well, yes, that would have been the expected encoding model based on Unicode encoding principles, and it is clear now (and has been for many years) that the current encoding model is deeply flawed and problematic. I hasten to add that this encoding model was not foisted on an unwilling user community by the Unicode Consortium, but was pushed for by experts from China and Mongolia (with the support of the Chinese and Mongolian national bodies). With hindsight the UTC and other ISO national bodies should have rejected this encoding model, but perhaps the implications were not fully understood at the time.

I agree with Jirimutu that the Mongolian encoding model is the worst encoding model in Unicode, but I also agree that we are stuck with it, and that it is not possible to radically revise it at this stage.

I think that the best we can do is mitigate the problems of multiple representation by defining fuzzy matching rules for Mongolian along the lines of Jirimutu's folding list, for use by search engines and text processing applications. This could be informally written up as a Unicode Technical Note, or formally defined somewhere in the Unicode character database.

Andrew

Received on Tuesday, 4 August 2015 08:15:57 UTC