RE: U+1885 / U+1886 changed from Letter to Mark

Hi Andrew,

Thanks for that affirmation and also the specifics on what would have to change if the proposal is accepted.

Greg


From: Andrew West [mailto:andrewcwest@gmail.com]
Sent: Monday, November 16, 2015 8:19 PM
To: Greg Eck <greck@postone.net>
Cc: public-i18n-mongolian@w3.org
Subject: Re: U+1885 / U+1886 changed from Letter to Mark

Dear Greg,

I think that you are correct that the current properties for baludas (1885/6) do not allow them to be correctly rendered on the right side of a word as shown in the examples I posted previously (http://www.babelstone.co.uk/Mongolian/TWYT_130.jpg).  Logically, the baluda is an "other letter", but the layout model for Mongolian means that it has to be treated as a non-spacing mark to be positioned correctly.

I therefore tentatively agree that we should propose changing the general category of U+1885 and U+1886 from Lo (other letter) to Mn (non-spacing mark), with a canonical combining class of 226 (positioned on the right), a bidi class of NSM (non-spacing mark), and a line break property of CM (attached characters and combining marks).
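
As a side note on what the canonical combining class does in practice: it governs how adjacent combining marks are reordered during normalization. The following is only a minimal sketch using the JDK's `java.text.Normalizer`. The class name `CombiningClassSketch` and its helpers are hypothetical, and pairing a Mongolian letter with U+0301 COMBINING ACUTE ACCENT (ccc 230) is an artificial choice made purely to show the reordering against U+18A9 (ccc 228).

```java
import java.text.Normalizer;

final class CombiningClassSketch {
    // Hypothetical helper: formats a string as space-separated U+XXXX code points.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Applies NFD, which includes canonical reordering: runs of marks with
    // non-zero combining classes are sorted by ascending class value.
    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        // U+1820 MONGOLIAN LETTER A (ccc 0), then U+0301 (ccc 230), then U+18A9 (ccc 228).
        String input = "\u1820\u0301\u18A9";
        System.out.println(codePoints(input) + " -> " + codePoints(nfd(input)));
    }
}
```

Because 228 sorts before 230, the Dagalga moves ahead of the acute accent, so this should print `U+1820 U+0301 U+18A9 -> U+1820 U+18A9 U+0301`. A character with combining class 0 is never moved this way, which is one practical difference between the current and the proposed values.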

Incidentally, the corresponding Tibetan character, U+0F85 TIBETAN MARK PALUTA, has a general category of Po (other punctuation), which is surely incorrect, and something I may separately raise with the UTC.

Double incidentally, there may be a need to propose encoding a single and triple circular paluta mark for use with Han characters.

Andrew



On 14 November 2015 at 02:41, Greg Eck <greck@postone.net> wrote:
Andrew, Richard W,
Do you have time to comment on the situation with regard to the U+1885/U+1886?
The problem is that I don’t know how we can implement these two further as diacritics unless we modify their feature set.
In the BAITI implementation, the U+18A9 Dagalga shapes correctly (it needs a small bit of refinement, but overall it is correctly spaced on the left side of the preceding character).
I think we are in agreement, from the earlier posted images of the Baludas, that the Baluda should be placed on the right side of the preceding character.
It would seem that changing the category to match that of U+18A9, “Mark, Non-spacing”, would be appropriate.
I am not sure what the COMBINE does.
Does the BIDI parameter only affect sort/search?
Thanks,
Greg

>>>>>
Sent: Tuesday, November 10, 2015 1:10 AM
Subject: U+1885 / U+1886 changed from Letter to Mark

I had said earlier that the two Baludas (U+1885/1886) would probably be better processed as marks rather than letters.
I find the following differences between the two Baludas and the one unquestionable mark in the Mongolian block, U+18A9 Dagalga:



Property                              U+18A9                              U+1885/1886
CATEGORY                              Mark, Nonspacing (Mn)               Letter, Other
COMBINE                               228                                 0 (what does this do?)
BIDI                                  Non-Spacing Mark                    Left-to-Right
Character.getDirectionality()         DIRECTIONALITY_NONSPACING_MARK (8)  DIRECTIONALITY_LEFT_TO_RIGHT (0)
Character.getType()                   6                                   5
Character.isJavaIdentifierStart()     No                                  Yes
Character.isLetter()                  No                                  Yes
Character.isLetterOrDigit()           No                                  Yes
Character.isUnicodeIdentifierStart()  No                                  Yes
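
The Java rows above can be reproduced with a short program along these lines. It is a sketch only: the class name `MongolianMarkProps` and the `describe` helper are hypothetical, and it uses only methods of `java.lang.Character`. The values reported for U+1885/U+1886 depend on the Unicode version bundled with the running JDK, so they may not match the table.

```java
final class MongolianMarkProps {
    // Builds one row of the comparison for a code point.
    // getType() returns the general category as an int (6 = NON_SPACING_MARK,
    // 5 = OTHER_LETTER); getDirectionality() returns the bidi class as a byte.
    static String describe(int cp) {
        return String.format(
            "U+%04X type=%d directionality=%d letter=%b letterOrDigit=%b"
                + " javaIdStart=%b unicodeIdStart=%b",
            cp,
            Character.getType(cp),
            (int) Character.getDirectionality(cp),
            Character.isLetter(cp),
            Character.isLetterOrDigit(cp),
            Character.isJavaIdentifierStart(cp),
            Character.isUnicodeIdentifierStart(cp));
    }

    public static void main(String[] args) {
        // Dagalga first, then the two Baludas.
        for (int cp : new int[] {0x18A9, 0x1885, 0x1886}) {
            System.out.println(describe(cp));
        }
    }
}
```

The canonical combining class (the COMBINE row) and the line break property are not exposed by `java.lang.Character`, so they are left out here.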


Given that the Baluda sits on the right side of an existing vertical letter, in the same way that the Dagalga sits on the left side of the letter, I would recommend redefining the features associated with the two Baludas to match those of the Dagalga, and then testing to verify that the shaping behavior is as expected.

If we made the above changes to the feature set of U+1885/1886, would this allow us to shape the Baludas as we do the Dagalga?

ArabicShaping.txt does not seem to make any distinction between the mark U+18A9 and the two Baludas.

Greg
>>>>>

Received on Tuesday, 17 November 2015 02:10:52 UTC