[css-text] Halfwidth Katakana and symbols in UAX#14

UAX#14, the Unicode Line Breaking Algorithm [1], currently assigns the
following line breaking classes:

1. Halfwidth Katakana letters (U+FF66, U+FF71-FF9E): AL
2. Halfwidth symbols with General_Category So/Sm (U+FFE8-FFEE): AL

Makoto Kato asked Unicode whether it is appropriate to change #1 to ID. I
support his proposal, and I also think the characters in #2 should have the
same classes as their non-compatibility counterparts.
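
To make the two proposed changes concrete, here is a rough sketch in Python.
It is only an illustration, not how any engine implements tailoring. The
ranges are copied from this message and not re-checked against
LineBreak.txt. `uax14_class` is a hypothetical caller-supplied lookup
(code point -> UAX#14 class), since the Python standard library does not
expose Line_Break. The counterpart is taken from the <narrow> compatibility
decomposition, which is my assumption about what "counterpart" should mean.

```python
import unicodedata

# Ranges as quoted in this message (not re-checked against LineBreak.txt).
HALFWIDTH_KATAKANA_LETTERS = [(0xFF66, 0xFF66), (0xFF71, 0xFF9E)]
HALFWIDTH_SYMBOLS = (0xFFE8, 0xFFEE)


def counterpart(cp):
    """Return the code point named by a <narrow> decomposition, or None."""
    parts = unicodedata.decomposition(chr(cp)).split()
    if len(parts) == 2 and parts[0] == "<narrow>":
        return int(parts[1], 16)
    return None


def proposed_class(cp, uax14_class):
    """Apply the two proposed changes on top of a UAX#14 lookup.

    uax14_class is a hypothetical function: code point -> class string.
    """
    # Proposal #1: halfwidth Katakana letters become ID.
    if any(lo <= cp <= hi for lo, hi in HALFWIDTH_KATAKANA_LETTERS):
        return "ID"
    # Proposal #2: halfwidth symbols take the class of their counterpart.
    if HALFWIDTH_SYMBOLS[0] <= cp <= HALFWIDTH_SYMBOLS[1]:
        base = counterpart(cp)
        if base is not None:
            return uax14_class(base)
    # Everything else is left as UAX#14 defines it.
    return uax14_class(cp)
```

For example, `counterpart(0xFFE8)` gives U+2502, so HALFWIDTH FORMS LIGHT
VERTICAL would get whatever class the lookup returns for BOX DRAWINGS LIGHT
VERTICAL.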

Since this has been defined this way for 15 years, Unicode is interested in
what the CSS WG and implementers think, including whether the change could
have adverse effects.

Thoughts? Can we resolve on a response to Unicode?

Some additional information follows:

1. For #1, IE and FF already tailor to ID. Chrome and Safari follow UAX#14
today.
2. MS Office, RichEdit, NotePad, etc. tailor to ID. OS X TextEdit follows
UAX#14.
3. Other Halfwidth characters are:
  a. Halfwidth Hangul letters (U+FFA0-FFDC) are AL, and I think should be
unchanged
  b. Halfwidth small Katakana are CJ, and should be unchanged
  c. Halfwidth punctuation (Po/Ps/Pe) are CL/OP/NS, and should be unchanged
  d. Halfwidth Lm are CJ/NS, and should be unchanged
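
If anyone wants to reproduce the grouping by General_Category used in the
list above, a small sketch with Python's unicodedata follows. Selecting
characters by a name starting with "HALFWIDTH" in U+FF61-FFEE is my own
shortcut for illustration. The result depends on the Unicode version bundled
with the Python build, and it only gives General_Category, not the UAX#14
class.

```python
import unicodedata
from collections import defaultdict


def halfwidth_by_category(first=0xFF61, last=0xFFEE):
    """Group HALFWIDTH-named code points in [first, last] by General_Category."""
    groups = defaultdict(list)
    for cp in range(first, last + 1):
        ch = chr(cp)
        # Unassigned code points have no name; fullwidth symbols are skipped.
        if unicodedata.name(ch, "").startswith("HALFWIDTH"):
            groups[unicodedata.category(ch)].append(cp)
    return dict(groups)


if __name__ == "__main__":
    for category, cps in sorted(halfwidth_by_category().items()):
        print(category, " ".join(f"U+{cp:04X}" for cp in cps))
```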

[1] http://unicode.org/reports/tr14/

/koji

Received on Sunday, 3 May 2015 17:22:09 UTC