- From: Greg Eck <greck@postone.net>
- Date: Sat, 29 Aug 2015 08:57:48 +0000
- To: Richard Wordingham <richard.wordingham@ntlworld.com>, "public-i18n-mongolian@w3.org" <public-i18n-mongolian@w3.org>
- Message-ID: <BN3PR10MB03211C9F9DC516E981D33163AF6D0@BN3PR10MB0321.namprd10.prod.outlook.com>
Hi Richard,

1.) Final GA Examples

Yes, I think the SIG example is the best-known example also. I have attached a snip from the Chinese Standard. Is this the line item that you were referring to? These two forms certainly carry the same meaning ("like or similar"). They certainly agree with the preceding word in gender and form. We have discussed whether or not they should be connected by NNBSP, and while it would be convenient, the word is not considered to be a suffix, and so currently the consensus is that, no, they are not connected by NNBSP. And we certainly do need a variation selector to differentiate the one from the other, as there is no masculine vowel to distinguish the "masculine" form. My best attempt to explain the phenomenon would be to say that the U+1822 I, usually considered to be neutral in gender and therefore favoring the feminine, is in the case of [cid:image001.png@01D0E26C.0C9E2320] favoring the masculine, and hence the final masculine sweep to the right. Personally, I consider them to be the masculine/feminine forms of the same word. Maybe the lemma is the feminine form and the variant is the masculine form.

2.) Final GA Specification & Toggle

Can we have a bit of discussion about your statement below ...

SNIP>>>>>
There is therefore no need to add a variation selection for unexpected final masculine, such as Professor Quejingzhabu's 2000 document's <U+182D, U+180B>. The toggle behaviour is consistent with the graphics in the Unicode charts - Ken Whistler has already indicated that the UCD and code charts do not need to show that a variation selector acts as a toggle.
SNIP>>>>>

Does that mean that the following description

SNIP>>>>>
[cid:image003.jpg@01D0E27A.2AA9C140]
SNIP>>>>>

should actually look more like this ...

SNIP>>>>>
[cid:image005.jpg@01D0E27A.2AA9C140]
SNIP>>>>>

... with the toggle taken out? If so, then we are not communicating to font developers or end-users that there is actually a U+182D+FVS1 specification.
Font developers will need to implement it. Typists will want to use it. Maybe I am misunderstanding your statement?

3.) Input Methods

I took another look at the following TR170 statement ...

SNIP>>>>>
"The mechanism of inputting characters is not specified by the standard, so any keyboard driver capable of generating the appropriate 16-bit character encodings can be used. However, the input mechanism should ideally generate the correct positional forms, variants and ligatures on input by analysis of the context of each letter, at least where possible."
SNIP>>>>>

My read is that the emitted text string is the same, and that what may differ is the input method. For example, we are experimenting with a keyboard that has an alternative mapping at SHFT-A for emitting the sequence <U+1820><U+180E>. Likewise for SHFT-E. This eliminates one keystroke, and it avoids dealing with the MVS directly. This experimental keyboard will emit text identical to that of a person who manually types the two keystrokes <U+1820><U+180E>. While I like the idea of a standard keyboard, I know that it is not possible until an army imposes it on the masses! If we have the time at the end of our discussion, we might talk about keyboards.

Greg

-----Original Message-----
From: Richard Wordingham [mailto:richard.wordingham@ntlworld.com]
Sent: Wednesday, August 26, 2015 3:11 PM
To: public-i18n-mongolian@w3.org
Subject: Re: FVS Assignment Mismatch WrapUp - GA

On Sun, 23 Aug 2015 15:19:51 +0000 Greg Eck <greck@postone.net> wrote:

> I am ready to wrap up the discussion on FVS Assignment Mismatch.
>
> However I am still lacking good examples on two of the over-rides
> discussed ...
>
> * 182D Medial - given the case where the contextual rules for
> the dual dots must be over-ridden. In other words, the context
> dictates that the medial GA is dotted, however, the actual shaping of
> the word is desired without the dots.
> I have not had the time to track
> down examples for this.
>
> * 182D Final - given the case where the feminine final GA
> does not follow the common pattern of sweeping to the left, but
> however sweeps to the right. In other words, the word is composed of
> feminine vowels, but carries a masculine right-ward swept tail. From
> discussions with Professor Quejingzhabu, I understand that there are
> just a small subset of words (5-6 in quantity) that follow this
> pattern.

You have now found examples of 182D final while I was waiting for my copy of the Chinese standard (GB/T 26226-2010). I think the best example to quote is the one in Row 15 (sig) of Table 9 of GB/T 26226-2010 = Row 17 of the table straddling pp5-6 of TR 170. (As a matter of curiosity, for I suspect it is irrelevant for rendering suffixes, are the words with the masculine final actually feminine?)

Both documents show that final <U+182D, U+180B> is context sensitive - it has the opposite gender to final U+182D. There is therefore no need to add a variation selection for unexpected final masculine, such as Professor Quejingzhabu's 2000 document's <U+182D, U+180B>. The toggle behaviour is consistent with the graphics in the Unicode charts - Ken Whistler has already indicated that the UCD and code charts do not need to show that a variation selector acts as a toggle.

On Mon, 24 Aug 2015 13:58:54 +0000 Greg Eck <greck@postone.net> wrote:

> I am still not sure that we have a case for the Medial GA undotted
> over-ride FVS. More thoughts here ... ?

Initial and medial <U+182D, U+180B> are also context-sensitive, having the opposite dot setting to U+182D.

I have probably oversimplified the rules expressed in the attachments to http://www.unicode.org/~asmus/mongolian/MD016-alldraft-01.html , worked on by Martin Heijdra and Timothy Partridge. Unfortunately, that work seems to use slightly different variation sequences to the ISO/Unicode and Chinese standards.
As I said before, work on variant forms for connected text cannot be done properly without identification of the contextual shaping rules.

It is also a bad idea to try to include grammar rules in rendering rules. For example, some dialects have the rule that /e/ in non-initial syllables does not overrule an earlier masculine vowel when it comes to the gender for the suffixes, and it seems that that also applies to final consonants. It seems much simpler, and more flexible, to ignore that rule.

There was some horrifying text in TR170:

"The mechanism of inputting characters is not specified by the standard, so any keyboard driver capable of generating the appropriate 16-bit character encodings can be used. However, the input mechanism should ideally generate the correct positional forms, variants and ligatures on input by analysis of the context of each letter, at least where possible."

This suggests that the contextual rules could vary from input system to input system, with decisions on rendering being stored by the use of the PUA. This is obviously not consistent with the idea of storing Mongolian text just using assigned Unicode codepoints.

Richard.
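
Two points from this exchange can be illustrated with a minimal Python sketch. It is not taken from either standard. The first part treats word-final <U+182D, U+180B> as a toggle that flips whatever gender the context gives final U+182D, as Richard describes. The second part shows Greg's point that a shortcut key and manual typing can emit the same stored text. The vowel table, the rule "any masculine vowel makes the word masculine", the function names, and the key names are all assumptions made for illustration; real shaping rules are more involved.

```python
# Code points named in the thread.
A = "\u1820"     # MONGOLIAN LETTER A
GA = "\u182D"    # MONGOLIAN LETTER GA
FVS1 = "\u180B"  # MONGOLIAN FREE VARIATION SELECTOR ONE
MVS = "\u180E"   # MONGOLIAN VOWEL SEPARATOR

# Assumption: a simplified vowel table. I (U+1822) is neutral and is
# therefore absent from it; the feminine vowels are E, OE, UE.
MASCULINE_VOWELS = {"\u1820", "\u1823", "\u1824"}  # A, O, U


def context_gender(word: str) -> str:
    """Gender inferred from vowels alone (hypothetical, simplified rule)."""
    if any(ch in MASCULINE_VOWELS for ch in word):
        return "masculine"
    # Feminine vowels, or only neutral I, which the thread says favours feminine.
    return "feminine"


def final_ga_tail(word: str) -> str:
    """Tail gender of a word-final GA, treating FVS1 as a toggle."""
    toggled = word.endswith(GA + FVS1)
    if not toggled and not word.endswith(GA):
        raise ValueError("word does not end in GA or GA+FVS1")
    gender = context_gender(word)
    if toggled:
        # FVS1 selects the opposite of the contextual default.
        gender = "feminine" if gender == "masculine" else "masculine"
    return gender


# Hypothetical key names; the Shift-A mapping is the one Greg describes.
BASE_LAYER = {"a": A, "mvs": MVS}
EXPERIMENTAL_LAYER = {**BASE_LAYER, "Shift-A": A + MVS}


def type_keys(keys, layer):
    """Concatenate the text each key emits under the given layout."""
    return "".join(layer[key] for key in keys)


SIG = "\u1830\u1822\u182D"  # SA, I, GA: the "sig" example from the thread
```

With these definitions, `final_ga_tail(SIG)` gives the feminine tail and `final_ga_tail(SIG + FVS1)` gives the masculine one, while `type_keys(["a", "mvs"], BASE_LAYER)` and `type_keys(["Shift-A"], EXPERIMENTAL_LAYER)` produce the same two code points.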
Attachments
- image/png attachment: image001.png
- image/jpeg attachment: image003.jpg
- image/jpeg attachment: image005.jpg
- image/jpeg attachment: Capture.JPG
Received on Saturday, 29 August 2015 08:58:25 UTC