Re: [csswg-drafts] [css-text-3] Should enclosed ideographic blocks be space-discarding? (#4992)

For Japanese too, the general feeling is that the enclosed ideographic characters are not that important. To provide a bit more analysis, the characters in these blocks are:

① enclosed numbers or kana ① ㈠ ㊂ ㋑
② enclosed days of week or other kanji ㈪ ㊊ ㈱ ㋿
③ ARIB (Association of Radio Industries and Businesses) - 🉇 🉈 🈙

I see category ① more often than the others. They are used as list headings. As list headings they typically appear after explicit line breaks, and therefore the transformation rule is not that relevant. Of course they can also be used as headers of inline lists, or in other places in a line. For these cases they should be treated the same as enclosed Arabic numbers. Otherwise many ordinary people would be puzzled why a space is inserted around ① but not around ㊀.

Category ② consists of legacy combining characters. They typically come before or after a noun, as in “12日㈪” (Monday the 12th) and “㈱アップル” (Apple Incorporated). People would probably not insert a line break in between. I saw them more often in the past, but I feel their use is decreasing in favour of spelling them out fully, like 月曜/月曜日 (Monday) or 株式会社 (corporation). Please refer to the usage counts below.

Category ③ consists of special-purpose characters used by TV and not in general use. I googled these characters, but most of the pages found were about the Unicode characters themselves.

Here are non-scientific use counts obtained by Google-searching each character within parentheses. The second circled number on each line denotes the category I used above.
① ① 446M
㈱ ② 31M vs “株式会社” 1.3G
⑴ ① 8.3M
㈠ ① 6.0M
⒈ ① 3.4M
㋐ ① 2.3M
㍿ ② 1.3M
㈪ ② 0.85M vs “日月曜日” 22M, “月曜” 74M, or “月曜日” 225M
㊊ ② 0.34M
🉇 ③ 0.19M
🈙 ③ 0.19M
㊀ ① 0.15M
㍼ ② 0.15M
㋿ ② 0.054M

In sum, they are not frequent characters, and when they are used it is often in a context where the transformation rule is not that important.

I believe it is more important that all enclosed numbers are treated the same way, regardless of whether the number is in Arabic style or in ideographic style. Actually, all enclosed letters and numbers should probably be treated in a consistent manner.
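
To illustrate the consistency concern, here is a minimal sketch that reports which Unicode block each of the characters counted above belongs to. The block ranges are hard-coded from the Unicode code charts, and the helper name `block_of` is hypothetical, introduced only for this illustration. It is not a statement of what the spec does; it only shows that a rule keyed purely on blocks would put ① and ㊀ in different groups.

```python
# Block ranges hard-coded from the Unicode code charts (illustration only).
BLOCKS = [
    (0x2460, 0x24FF, "Enclosed Alphanumerics"),
    (0x3200, 0x32FF, "Enclosed CJK Letters and Months"),
    (0x3300, 0x33FF, "CJK Compatibility"),
    (0x1F200, 0x1F2FF, "Enclosed Ideographic Supplement"),
]


def block_of(ch):
    """Return the name of the block containing ch, or "other" (hypothetical helper)."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "other"


# A few of the characters from the use-count list above.
for ch in "①⑴⒈㈠㊀㋐㈱㍿🉇":
    print(f"U+{ord(ch):04X} {ch} {block_of(ch)}")
```

Under these assumptions the enclosed Arabic numbers (①, ⑴, ⒈) fall in one block while their ideographic counterparts (㈠, ㊀) fall in another, which is the split ordinary users would find puzzling.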

-- 
GitHub Notification of comment by kidayasuo
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/4992#issuecomment-633851562 using your GitHub account

Received on Tuesday, 26 May 2020 07:13:07 UTC