Fwd: UAX#50 conformance: Is it possible to update existing fonts without causing damage to existing non-CSS applications? from MURATA Makoto on 2019-12-17 (www-archive@w3.org from December 2019)

From: MURATA Makoto <eb2m-mrt@asahi-net.or.jp>
Date: Tue, 17 Dec 2019 12:16:54 +0900
To: www-archive@w3.org
Message-ID: <CALvn5ECZitrE6fSnG5+QMTaX0RzNODb18Azbhx+EOj=6XqO4Yw@mail.gmail.com>

---------- Forwarded message ---------
From: MURATA Makoto <eb2m-mrt@asahi-net.or.jp>
Date: Tue, 17 Dec 2019 10:55
Subject: Re: UAX#50 conformance: Is it possible to update existing fonts
without causing damage to existing non-CSS applications?
To: Taro Yamamoto <tyamamot@adobe.com>
Cc: fantasai <fantasai@inkedblade.net>, Florian Rivoal <florian@rivoal.net>,
Nat McCully <nmccully@adobe.com>,
MURATA Makoto (FAMILY Given) <eb2m-mrt@asahi-net.or.jp>


I have been trying to compare two things: Adobe data and UAX#50.  I am not
done, but I am already embarrassed.

1. Adobe data

1) The cmap-resources, cid2code.txt, available at
https://github.com/adobe-type-tools/cmap-resources/tree/master/Adobe-Japan1-7
2) Vert features as specified in the GSUB template, aj17-gsub.fea,
available at
https://github.com/adobe-type-tools/Adobe-Japan1/tree/master/GSUB

Here, 1) provides a mapping from Unicode code points to CIDs, and 2) shows
which CIDs have the vert feature.  By combining 1) and 2), my F# program
produces a list of code points to which the vert feature will apply.
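
The combination step can be sketched roughly as follows. This is a Python
illustration, not the F# program mentioned above. The shape of the vert block
(`substitute \CID by \CID;` inside `feature vert { ... } vert;`) is an
assumption about the .fea file, and the sample CIDs and the code point to CID
mapping are made up for the example.

```python
import re

# Assumed shape of a vert block in a .fea file; the CIDs here are made up.
SAMPLE_FEA = """\
feature vert {
  substitute \\100 by \\900;
  substitute \\101 by \\901;
} vert;
"""

# Hypothetical code point -> CID mapping, standing in for one cid2code.txt column.
SAMPLE_CMAP = {0x3001: 100, 0x3002: 101, 0x3042: 200}


def vert_cids(fea_text):
    """Return the set of source CIDs that are substituted inside the vert block."""
    block = re.search(r"feature\s+vert\s*\{(.*?)\}\s*vert\s*;", fea_text, re.S)
    if block is None:
        return set()
    sources = re.findall(r"substitute\s+\\(\d+)\s+by", block.group(1))
    return {int(cid) for cid in sources}


def code_points_with_vert(cmap, cids):
    """Return the sorted code points whose CID is a source of a vert substitution."""
    return sorted(cp for cp, cid in cmap.items() if cid in cids)


if __name__ == "__main__":
    for cp in code_points_with_vert(SAMPLE_CMAP, vert_cids(SAMPLE_FEA)):
        print(f"{cp:04X}")
```

With the real files, the mapping would come from one column of cid2code.txt
and the substitutions from aj17-gsub.fea.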

2. What is classified as Tr or Tu by UAX#50

I expected basically equivalent results.

But I am puzzled by the result.  There are many differences.

They are caused by cmap resources dedicated to vertical writing.  In other
words, for some characters, the vertical-writing cmap resources are used
rather than the vert feature.  Such characters include:

30A0           ; Tr # Pd         KATAKANA-HIRAGANA DOUBLE HYPHEN
30A1           ; Tu # Lo         KATAKANA LETTER SMALL A
30A3           ; Tu # Lo         KATAKANA LETTER SMALL I
30A5           ; Tu # Lo         KATAKANA LETTER SMALL U
30A7           ; Tu # Lo         KATAKANA LETTER SMALL E
30A9           ; Tu # Lo         KATAKANA LETTER SMALL O
30C3           ; Tu # Lo         KATAKANA LETTER SMALL TU
30E3           ; Tu # Lo         KATAKANA LETTER SMALL YA
30E5           ; Tu # Lo         KATAKANA LETTER SMALL YU
30E7           ; Tu # Lo         KATAKANA LETTER SMALL YO
30EE           ; Tu # Lo         KATAKANA LETTER SMALL WA
30F5..30F6     ; Tu # Lo     [2] KATAKANA LETTER SMALL KA..KATAKANA LETTER SMALL KE
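
For the UAX#50 side of the comparison, lines in the format shown above can be
read with a small parser.  The sketch below is in Python and only assumes the
`code point or range ; value # comment` layout visible in the list; the
function names and the example set of vert code points are hypothetical.

```python
SAMPLE_LINES = [
    "30A0           ; Tr # Pd         KATAKANA-HIRAGANA DOUBLE HYPHEN",
    "30A1           ; Tu # Lo         KATAKANA LETTER SMALL A",
    "30F5..30F6     ; Tu # Lo     [2] KATAKANA LETTER SMALL KA..KATAKANA LETTER SMALL KE",
    "# a comment line, which is skipped",
]


def parse_vertical_orientation(lines, wanted=("Tr", "Tu")):
    """Map each code point to its value, keeping only the wanted values."""
    result = {}
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if ";" not in data:
            continue  # blank, comment-only, or wrapped continuation line
        span, value = (part.strip() for part in data.split(";", 1))
        if value not in wanted:
            continue
        first, _, last = span.partition("..")
        for cp in range(int(first, 16), int(last or first, 16) + 1):
            result[cp] = value
    return result


def only_in_uax50(uax50, vert_code_points):
    """Return code points that are Tr or Tu in UAX#50 but have no vert substitution."""
    return sorted(set(uax50) - set(vert_code_points))


if __name__ == "__main__":
    uax50 = parse_vertical_orientation(SAMPLE_LINES)
    hypothetical_vert = {0x30A0}  # made-up result of the font-side analysis
    for cp in only_in_uax50(uax50, hypothetical_vert):
        print(f"{cp:04X} ; {uax50[cp]}")
```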

I used the following cmap columns in cid2code.txt.

# o Column 22: Character codes for the "UniJIS-UTF32-H" and
#   "UniJIS-UTF32-V" CMaps (Unicode 13.0 UTF-32 encoding, proportional
#   Latin characters).
#
# o Column 25: Character codes for the "UniJIS2004-UTF32-H" and
#   "UniJIS2004-UTF32-V" CMaps (Unicode 13.0 UTF-32 encoding, proportional
#   Latin characters, JIS X 0213:2004 prototypical glyphs as the default).
#
# o Column 26: Character codes for the "UniJISX0213-UTF32-H" and
#   "UniJISX0213-UTF32-V" CMaps (Unicode 13.0 UTF-32 encoding,
#   proportional Latin characters, some proportional JIS X 0208:1997
#   characters).
#
# o Column 27: Character codes for the "UniJISX02132004-UTF32-H" and
#   "UniJISX02132004-UTF32-V" CMaps (Unicode 13.0 UTF-32 encoding,
#   proportional Latin characters, some proportional JIS X 0208:1997
#   characters, JIS X 0213:2004 prototypical glyphs as the default).
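
Reading one of these columns might look like the Python sketch below.  It
rests on several assumptions about cid2code.txt that should be checked against
the file's own header: fields are tab-separated, the first field is the CID,
`*` means no mapping, several codes in one cell are comma-separated, and a
trailing `v` marks a code that belongs only to the vertical (-V) CMap.  The
sample rows and CIDs are made up.

```python
# Made-up rows with two columns: CID and one code column.
SAMPLE_ROWS = [
    "# header comment\n",
    "CID\tSomeColumn\n",
    "10\t3042\n",
    "11\t*\n",
    "12\t30a1\n",
    "13\t30a1v\n",
]


def parse_cell(cell):
    """Split one cell into (horizontal codes, vertical-only codes)."""
    horizontal, vertical = [], []
    cell = cell.strip()
    if cell == "*":  # assumed marker for "no mapping"
        return horizontal, vertical
    for token in cell.split(","):
        if token.endswith("v"):  # assumed marker for a vertical-only code
            vertical.append(int(token[:-1], 16))
        else:
            horizontal.append(int(token, 16))
    return horizontal, vertical


def read_column(lines, column):
    """Return {CID: (horizontal codes, vertical-only codes)} for a 1-based column."""
    table = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if not fields[0].isdigit() or len(fields) < column:
            continue  # comment, column-name row, or short row
        table[int(fields[0])] = parse_cell(fields[column - 1])
    return table


def vertical_only_code_points(table):
    """Code points that reach a CID through the vertical CMap rather than vert."""
    return sorted({cp for _, vertical in table.values() for cp in vertical})


if __name__ == "__main__":
    table = read_column(SAMPLE_ROWS, 2)
    print([f"{cp:04X}" for cp in vertical_only_code_points(table)])
```

Under these assumptions, code points that show up only through the vertical
entries would be handled by the -V CMap directly, which is consistent with
them not appearing in the vert-based list.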


Regards,
Makoto




Received on Tuesday, 17 December 2019 03:17:36 UTC