Re: U+1885 / U+1886 changed from Letter to Mark from Andrew West on 2015-11-16 (public-i18n-mongolian@w3.org from October to December 2015)

From: Andrew West <andrewcwest@gmail.com>
Date: Mon, 16 Nov 2015 12:19:27 +0000
To: Greg Eck <greck@postone.net>
Cc: "public-i18n-mongolian@w3.org" <public-i18n-mongolian@w3.org>
Message-ID: <CALgEMhyBDtNV_HcHGVj5Qt5-teosVRCdEPGwPmkxxQ-MKDh2yA@mail.gmail.com>

Dear Greg,

I think that you are correct that the current properties for baludas
(1885/6) do not allow them to be correctly rendered on the right side of a
word as shown in the examples I posted previously (
http://www.babelstone.co.uk/Mongolian/TWYT_130.jpg).  Logically, the baluda
is an "other letter", but the layout model for Mongolian means that it has
to be treated as a non-spacing mark to be positioned correctly.

I therefore tentatively agree that we should propose changing the general
category of U+1885 and U+1886 from Lo (other letter) to Mn (non-spacing
mark), with a canonical combining class of 226 (positioned on the right), a
bidi class of NSM (non-spacing mark), and a line break property of CM
(attached characters and combining marks).
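
As a rough way to check what a given runtime currently reports for these characters, the sketch below prints the same java.lang.Character values that Greg tabulates in the quoted message further down. The class name BaludaProperties and its helper methods are hypothetical, introduced only for illustration. The values printed for U+1885/U+1886 depend on the Unicode data version bundled with the JDK, so no expected output is given for them. Note that the standard Character API does not expose the canonical combining class or the line break property; those would need a separate library.

```java
public class BaludaProperties {

    /** True if this runtime's Unicode data assigns the code point general category Mn. */
    static boolean isNonSpacingMark(int codePoint) {
        return Character.getType(codePoint) == Character.NON_SPACING_MARK;
    }

    /** One-line summary of the Character properties compared in the quoted table. */
    static String describe(int codePoint) {
        return String.format(
            "U+%04X type=%d directionality=%d isLetter=%b isUnicodeIdentifierStart=%b",
            codePoint,
            Character.getType(codePoint),
            (int) Character.getDirectionality(codePoint),
            Character.isLetter(codePoint),
            Character.isUnicodeIdentifierStart(codePoint));
    }

    public static void main(String[] args) {
        // The two baludas and the dagalga; results reflect the JDK's bundled Unicode data.
        int[] codePoints = {0x1885, 0x1886, 0x18A9};
        for (int cp : codePoints) {
            System.out.println(describe(cp));
        }
    }
}
```

If the proposed change is adopted and picked up by a runtime, isNonSpacingMark(0x1885) and isNonSpacingMark(0x1886) would be expected to return true, as isNonSpacingMark(0x18A9) already does.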

Incidentally, the corresponding Tibetan character, U+0F85 TIBETAN MARK
PALUTA, has a general category of Po (other punctuation), which is surely
incorrect, and something I may separately raise with the UTC.

Double incidentally, there may be a need to propose encoding a single and
triple circular paluta mark for use with Han characters.

Andrew

On 14 November 2015 at 02:41, Greg Eck <greck@postone.net> wrote:

> Andrew, Richard W,
>
> Do you have time to comment on the situation with regard to the
> U+1885/U+1886?
>
> Problem is that I don’t know how we can implement these two further as a
> diacritic unless we modify their feature set.
>
> In the BAITI implementation, the U+18A9 Dagalga shapes correctly (needs a
> small bit of refinement – but overall is correctly spaced on the left side
> of the preceding character).
>
> I think we are in agreement from the earlier posted images of the Baluda/s
> that the Baluda should be placed to the right side of the preceding
> character.
>
> It would seem that the category change to match the U+18A9 of “Mark,
> Non-spacing” would be appropriate.
>
> I am not sure what the COMBINE does.
>
> Does the BIDI parameter only affect sort/search?
>
> Thanks,
>
> Greg
>
>
>
> >>>>>
>
> *Sent:* Tuesday, November 10, 2015 1:10 AM
> *Subject:* U+1885 / U+1886 changed from Letter to Mark
>
>
>
> I had said earlier that the two Baludas (U+1885/1886) would probably be
> better processed as marks rather than letters.
>
> I find the following differences between the two Baludas and the one
> unquestionable mark in the Mongolian block – U+18A9 Dagalga …
>
>
> Property                              U+18A9                             U+1885/1886
> ------------------------------------  ---------------------------------  ----------------------
> CATEGORY                              Mark, Nonspacing (MN)              Letter, Other
> COMBINE                               228                                0 (what does this do?)
> BIDI                                  Non-Spacing Mark                   Left-to-Right
> Character.getDirectionality()         Directionality_Nonspacing_Mark[8]  Left_to_Right 0
> Character.getType()                   6                                  5
> Character.isJavaIdentifierStart()     No                                 Yes
> Character.isLetter()                  No                                 Yes
> Character.isLetterOrDigit()           No                                 Yes
> Character.isUnicodeIdentifierStart()  No                                 Yes
>
>
> Given that the Baluda stations itself to the right of an existent vertical
> letter in similar fashion to the Dagalga stationing itself on the left side
> of the given vertical letter, I would say that we recommend redefining the
> features associated with the two Baludas to match the Dagalga. Then test it
> to verify that shaping behavior is as expected.
>
>
>
> If we made the above changes to the feature set of the U+1885/1886 would
> this allow us to shape the Baludas like we do the Dagalga?
>
>
>
> ArabicShaping.txt does not seem to make any distinction between the mark
> U+18A9 and the two Baludas.
>
>
>
> Greg
>
> >>>>>
>

Received on Monday, 16 November 2015 12:20:16 UTC