Feedback to 4 June 2009 ED from Japan

3.2 Algorithmic

cjk-ideographic
The cjk-ideographic algorithm is used by several numbering systems, each with its own set of digits. These systems are defined for numbers greater than or equal to 0 and less than 10^16. Numbers less than zero, or greater than or equal to 10^16, should use the decimal system. The core algorithm is as follows:
  1. Split the decimal number into groups of four digits, starting with the least significant digit.
  2. Ignoring groups that have the value zero, append the second group marker to the second group, the third group marker to the third group, and the fourth group marker to the fourth group. These markers are defined in the tables for the specific numbering systems. The first group has no marker.
  3. For each group, ignoring digits that have the value zero, append the second digit marker to the second digit, the third digit marker to the third digit, and the fourth digit marker to the fourth digit. These markers are defined in the tables for the specific numbering systems. The first digit has no marker.
  4. For any group with a value less than 20, remove the second digit (the 1 in the tens column). Leave any associated markers.
  5. Concatenate the groups back into a single string, least significant group last.
  6. Collapse any consecutive runs of 0 digits to a single 0.
  7. Replace each digit with the relevant character selected from the numbering system's table.
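
As an illustration, the seven steps can be sketched in Python as follows. This is a minimal sketch, not the normative algorithm. The function name `cjk_ideographic` and the placeholder table (ASCII digits, `t`/`h`/`k` as digit markers, `W`/`Y`/`Z` as group markers) are hypothetical and chosen only so that the output is easy to read. Step 4 is read here as applying to groups whose value is 10 to 19, since only those have a 1 in the tens column.

```python
def cjk_ideographic(n, digits, digit_markers, group_markers):
    """Sketch of the core algorithm; the table arguments are supplied by the caller."""
    if not 0 <= n < 10**16:
        return str(n)  # out of range: use the decimal system

    # Step 1: split into groups of four digits, least significant group first.
    text = str(n)
    groups = []
    while text:
        groups.append(text[-4:])
        text = text[:-4]

    # Steps 2-5: build one sequence, most significant group first.
    # Integers stand for digits, strings stand for markers.
    items = []
    for gi in range(len(groups) - 1, -1, -1):
        group = groups[gi]
        value = int(group)
        for pos, ch in enumerate(group):
            place = len(group) - 1 - pos  # 0 = ones, 1 = tens, ...
            d = int(ch)
            # Step 4: drop the tens digit of a group valued 10-19, keep its marker.
            if not (place == 1 and 10 <= value < 20):
                items.append(d)
            # Step 3: non-zero digits above the ones place get a digit marker.
            if d != 0 and place > 0:
                items.append(digit_markers[place - 1])
        # Step 2: non-zero groups above the first get a group marker.
        if value != 0 and gi > 0:
            items.append(group_markers[gi - 1])

    # Step 6: collapse consecutive runs of 0 digits to a single 0.
    collapsed = []
    for item in items:
        if item == 0 and collapsed and collapsed[-1] == 0:
            continue
        collapsed.append(item)

    # Step 7: replace each digit with its character from the table.
    return "".join(digits[i] if isinstance(i, int) else i for i in collapsed)


# Hypothetical placeholder table, for readability only.
PLACEHOLDER = ("0123456789", "thk", "WYZ")
print(cjk_ideographic(1010010, *PLACEHOLDER))  # 1h01W0t0
```

The trailing `0` in the output follows from the steps exactly as listed; the Japanese systems below remove all 0 digits in rule 6.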

The suffix for the cjk-ideographic numbering systems is a dot (. U+002E FULL STOP). Is there a better suffix to use?

If specified explicitly, the cjk-ideographic keyword should be handled like one of ''japanese-informal'', ''simp-chinese-informal'', or ''trad-chinese-informal''. The UA should use its own logic to determine which one to use.

japanese-formal
This uses the cjk-ideographic system with the following modifications to the rules and the following table.
  • Do not apply rule 4.
  • In rule 6, remove all 0 digits, not only consecutive runs of 0 digits. If nothing is left, i.e., if the original value before splitting into groups is zero, use a single digit 0.
Formal Japanese numbering system
Values               Codepoints
Second Group Marker  U+842C / U+4E07
Third Group Marker   U+5104
Fourth Group Marker  U+5146
Second Digit Marker  U+62FE
Third Digit Marker   U+767E / U+4F70
Fourth Digit Marker  U+4EDF
Digit 0              U+3007 / U+96F6
Digit 1              U+58F1 / U+58F9
Digit 2              U+5F10 / U+8CB3
Digit 3              U+53C2 / U+53C3
Digit 4              U+56DB / U+8086
Digit 5              U+4F0D
Digit 6              U+516D / U+9646
Digit 7              U+4E03 / U+67D2
Digit 8              U+516B / U+634C
Digit 9              U+4E5D / U+7396

These codepoints were chosen based on 1) bank rules for writing checks and 2) compatibility with ECMA-376 (OOXML). However, some argued that they should be modernized, for example 万 for the second group marker or 千 for the fourth digit marker. Ideally, UAs and users could create their own @counter-style by specifying the algorithm and the codepoints for markers, digits, and the minus sign.

The suffix is U+3001 (、). The minus sign is U+FF0D (－), or U+002D (-), or ''▲'', or ''マイナス'' (a string of four characters).

1010010 is 壱佰壱萬壱拾.
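
A minimal sketch of the formal rules in Python, assuming one character per table entry. Where a table cell carries two codepoints, the choice below is an assumption: 壱 (U+58F1), 佰 (U+4F70) and 萬 (U+842C) are taken from the example above, and the remaining digits use the first codepoint in each cell. The function name `japanese_formal` is hypothetical, and out-of-range values simply fall back to decimal as in the core algorithm.

```python
# Assumed table (see the note above about cells with two codepoints).
FORMAL_DIGITS = "〇壱弐参四伍六七八九"
FORMAL_DIGIT_MARKERS = "拾佰仟"   # second, third, fourth digit markers
FORMAL_GROUP_MARKERS = "萬億兆"   # second, third, fourth group markers


def japanese_formal(n):
    if not 0 <= n < 10**16:
        return str(n)            # out of range: decimal fallback
    if n == 0:
        return FORMAL_DIGITS[0]  # nothing would be left after removing zeros

    # Split into groups of four digits, least significant group first.
    text = str(n)
    groups = []
    while text:
        groups.append(text[-4:])
        text = text[:-4]

    out = []
    for gi in range(len(groups) - 1, -1, -1):
        group = groups[gi]
        if int(group) == 0:
            continue             # zero groups contribute nothing
        for pos, ch in enumerate(group):
            place = len(group) - 1 - pos
            d = int(ch)
            if d == 0:
                continue         # rule 6 (modified): remove all 0 digits
            out.append(FORMAL_DIGITS[d])  # rule 4 not applied: the 1 is kept
            if place > 0:
                out.append(FORMAL_DIGIT_MARKERS[place - 1])
        if gi > 0:
            out.append(FORMAL_GROUP_MARKERS[gi - 1])
    return "".join(out)


print(japanese_formal(1010010))  # 壱佰壱萬壱拾
```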

japanese-informal
This uses the cjk-ideographic system with the following modifications to the rules and the following table.
  • In rule 3, if the digit is 1, use the digit marker without the digit 1.
  • Rule 4 is not necessary because the above modification to rule 3 covers it.
  • In rule 6, remove all 0 digits, not only consecutive runs of 0 digits. If nothing is left, i.e., if the original value before splitting into groups is zero, use a single digit 0.
Informal Japanese numbering system
Values               Codepoints
Second Group Marker  U+4E07
Third Group Marker   U+5104
Fourth Group Marker  U+5146
Second Digit Marker  U+5341
Third Digit Marker   U+767E
Fourth Digit Marker  U+5343
Digit 0              U+3007
Digit 1              U+4E00
Digit 2              U+4E8C
Digit 3              U+4E09
Digit 4              U+56DB
Digit 5              U+4E94
Digit 6              U+516D
Digit 7              U+4E03
Digit 8              U+516B
Digit 9              U+4E5D

The suffix is U+3001 (、). The minus sign is U+FF0D (－), or U+002D (-), or ''▲'', or ''マイナス'' (a string of four characters).

1010010 is 百一万十.
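
A minimal sketch of the informal rules in Python. The table is assumed to read 〇 一 二 三 四 五 六 七 八 九 for the digits, 十 百 千 for the digit markers, and 万 億 兆 for the group markers; this reading reproduces the example above. The function name `japanese_informal` is hypothetical, and out-of-range values fall back to decimal as in the core algorithm.

```python
# Assumed reading of the informal table (it reproduces the 1010010 example).
INFORMAL_DIGITS = "〇一二三四五六七八九"
INFORMAL_DIGIT_MARKERS = "十百千"   # second, third, fourth digit markers
INFORMAL_GROUP_MARKERS = "万億兆"   # second, third, fourth group markers


def japanese_informal(n):
    if not 0 <= n < 10**16:
        return str(n)              # out of range: decimal fallback
    if n == 0:
        return INFORMAL_DIGITS[0]  # nothing would be left after removing zeros

    # Split into groups of four digits, least significant group first.
    text = str(n)
    groups = []
    while text:
        groups.append(text[-4:])
        text = text[:-4]

    out = []
    for gi in range(len(groups) - 1, -1, -1):
        group = groups[gi]
        if int(group) == 0:
            continue               # zero groups contribute nothing
        for pos, ch in enumerate(group):
            place = len(group) - 1 - pos
            d = int(ch)
            if d == 0:
                continue           # rule 6 (modified): remove all 0 digits
            # Rule 3 (modified): a 1 in front of a digit marker is omitted.
            if place == 0 or d != 1:
                out.append(INFORMAL_DIGITS[d])
            if place > 0:
                out.append(INFORMAL_DIGIT_MARKERS[place - 1])
        if gi > 0:
            out.append(INFORMAL_GROUP_MARKERS[gi - 1])
    return "".join(out)


print(japanese_informal(1010010))  # 百一万十
```

Note that the 1 in the ones place of a group is kept, so 10000 comes out as 一万 rather than a bare 万.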

3.3 Numeric

<numeric>
arabic-indic | binary | bengali | cambodian | decimal | decimal-leading-zero | devanagari | gujarati | gurmukhi | japanese | kannada | khmer | lao | lower-hexadecimal | malayalam | mongolian | myanmar | octal | oriya | persian | super-decimal | telugu | tibetan | thai | upper-hexadecimal | urdu
Numeric repeating systems
System      japanese
Characters  〇 一 二 三 四 五 六 七 八 九
Codepoints  U+3007, U+4E00, U+4E8C, U+4E09, U+56DB, U+4E94, U+516D, U+4E03, U+516B, U+4E5D
Base        10
Suffix      、 U+3001
Notes       Minus sign is － U+FF0D
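
For illustration, a numeric repeating system is ordinary positional notation over the listed characters. The sketch below assumes that reading; the function name `japanese_numeric` is hypothetical, and the minus sign is taken from the Notes entry.

```python
JAPANESE_DIGITS = "〇一二三四五六七八九"  # characters for 0-9, base 10
MINUS_SIGN = "\uFF0D"                     # fullwidth hyphen-minus, per the Notes entry


def japanese_numeric(n):
    """Positional base-10 rendering with the japanese digit set (sketch)."""
    base = len(JAPANESE_DIGITS)
    if n == 0:
        return JAPANESE_DIGITS[0]
    sign = MINUS_SIGN if n < 0 else ""
    n = abs(n)
    out = []
    while n:
        out.append(JAPANESE_DIGITS[n % base])  # least significant digit first
        n //= base
    return sign + "".join(reversed(out))


print(japanese_numeric(2009))  # 二〇〇九
```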

3.4 Alphabetic

Alphabetic repeating systems
System      cjk-earthly-branch
Characters  子 丑 寅 卯 辰 巳 午 未 申 酉 戌 亥
Codepoints  U+5B50, U+4E11, U+5BC5, U+536F, U+8FB0, U+5DF3, U+5348, U+672A, U+7533, U+9149, U+620C, U+4EA5
Base        12
Suffix      、 U+3001 / . U+002E (U+FF0C for Chinese?)
Notes       Minus sign is － U+FF0D

System      cjk-heavenly-stem
Characters  甲 乙 丙 丁 戊 己 庚 辛 壬 癸
Codepoints  U+7532, U+4E59, U+4E19, U+4E01, U+620A, U+5DF1, U+5E9A, U+8F9B, U+58EC, U+7678
Base        10
Suffix      、 U+3001 / . U+002E (U+FF0C for Chinese?)
Notes       Minus sign is － U+FF0D

System      hiragana-iroha
Characters  い ろ は に ほ へ と ち り ぬ る を わ か よ た れ そ つ ね な ら む う ゐ の お く や ま け ふ こ え て あ さ き ゆ め み し ゑ ひ も せ す
Codepoints  U+3044, U+308D, U+306F, U+306B, U+307B, U+3078, U+3068, U+3061, U+308A, U+306C, U+308B, U+3092, U+308F, U+304B, U+3088, U+305F, U+308C, U+305D, U+3064, U+306D, U+306A, U+3089, U+3080, U+3046, U+3090, U+306E, U+304A, U+304F, U+3084, U+307E, U+3051, U+3075, U+3053, U+3048, U+3066, U+3042, U+3055, U+304D, U+3086, U+3081, U+307F, U+3057, U+3091, U+3072, U+3082, U+305B, U+3059, U+3093
Base        48 / 47
Suffix      、 U+3001 / . U+002E
Notes       Minus sign is － U+FF0D

System      hiragana
Characters  あ い う え お か き く け こ さ し す せ そ た ち つ て と な に ぬ ね の は ひ ふ へ ほ ま み む め も や ゆ よ ら り る れ ろ わ ゐ ゑ を ん
Codepoints  U+3042, U+3044, U+3046, U+3048, U+304A, U+304B, U+304D, U+304F, U+3051, U+3053, U+3055, U+3057, U+3059, U+305B, U+305D, U+305F, U+3061, U+3064, U+3066, U+3068, U+306A, U+306B, U+306C, U+306D, U+306E, U+306F, U+3072, U+3075, U+3078, U+307B, U+307E, U+307F, U+3080, U+3081, U+3082, U+3084, U+3086, U+3088, U+3089, U+308A, U+308B, U+308C, U+308D, U+308F, U+3090, U+3091, U+3092, U+3093
Base        46 / 48
Suffix      、 U+3001 / . U+002E
Notes       Minus sign is － U+FF0D

System      katakana-iroha
Characters  イ ロ ハ ニ ホ ヘ ト チ リ ヌ ル ヲ ワ カ ヨ タ レ ソ ツ ネ ナ ラ ム ウ ヰ ノ オ ク ヤ マ ケ フ コ エ テ ア サ キ ユ メ ミ シ ヱ ヒ モ セ ス
Codepoints  U+30A4, U+30ED, U+30CF, U+30CB, U+30DB, U+30D8, U+30C8, U+30C1, U+30EA, U+30CC, U+30EB, U+30F2, U+30EF, U+30AB, U+30E8, U+30BF, U+30EC, U+30BD, U+30C4, U+30CD, U+30CA, U+30E9, U+30E0, U+30A6, U+30F0, U+30CE, U+30AA, U+30AF, U+30E4, U+30DE, U+30B1, U+30D5, U+30B3, U+30A8, U+30C6, U+30A2, U+30B5, U+30AD, U+30E6, U+30E1, U+30DF, U+30B7, U+30F1, U+30D2, U+30E2, U+30BB, U+30B9, U+30F3
Base        48 / 47
Suffix      、 U+3001 / . U+002E
Notes       Minus sign is － U+FF0D

System      katakana
Characters  ア イ ウ エ オ カ キ ク ケ コ サ シ ス セ ソ タ チ ツ テ ト ナ ニ ヌ ネ ノ ハ ヒ フ ヘ ホ マ ミ ム メ モ ヤ ユ ヨ ラ リ ル レ ロ ワ ヰ ヱ ヲ ン
Codepoints  U+30A2, U+30A4, U+30A6, U+30A8, U+30AA, U+30AB, U+30AD, U+30AF, U+30B1, U+30B3, U+30B5, U+30B7, U+30B9, U+30BB, U+30BD, U+30BF, U+30C1, U+30C4, U+30C6, U+30C8, U+30CA, U+30CB, U+30CC, U+30CD, U+30CE, U+30CF, U+30D2, U+30D5, U+30D8, U+30DB, U+30DE, U+30DF, U+30E0, U+30E1, U+30E2, U+30E4, U+30E6, U+30E8, U+30E9, U+30EA, U+30EB, U+30EC, U+30ED, U+30EF, U+30F0, U+30F1, U+30F2, U+30F3
Base        46 / 48
Suffix      、 U+3001 / . U+002E
Notes       Minus sign is － U+FF0D
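
The table does not spell out how values beyond the last character are written. The sketch below assumes the usual alphabetic scheme, in which the characters repeat the way a, b, ..., z, aa, ab do (bijective base-N, with no zero). It uses the cjk-heavenly-stem characters as an example; the function name `alphabetic` and the decimal fallback for values below 1 are assumptions.

```python
HEAVENLY_STEMS = "甲乙丙丁戊己庚辛壬癸"  # cjk-heavenly-stem, 10 characters


def alphabetic(n, symbols):
    """Bijective base-N rendering: 1 maps to the first symbol (sketch)."""
    if n < 1:
        return str(n)  # assumption: values below 1 fall back to decimal
    base = len(symbols)
    out = []
    while n > 0:
        n -= 1                        # shift to 0-based before taking the remainder
        out.append(symbols[n % base])
        n //= base
    return "".join(reversed(out))


print(alphabetic(1, HEAVENLY_STEMS))   # 甲
print(alphabetic(10, HEAVENLY_STEMS))  # 癸
print(alphabetic(11, HEAVENLY_STEMS))  # 甲甲
```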

3.6 Non-repeating

Non-repeating systems
System Range Codepoint Mapping
upper-roman-symbols 1-12
Value 1 2 3 ... 11 12
Character ...
Codepoint U+2160 U+2161 U+2162 ... U+216A U+216B
lower-roman-symbols 1-12
Value 1 2 3 ... 11 12
Character ...
Codepoint U+2170 U+2171 U+2172 ... U+217A U+217B
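
Because both mappings cover a contiguous codepoint range, they can be sketched as a simple offset. The function name `roman_symbol` is hypothetical, and the decimal fallback for values outside 1-12 is an assumption, not something the table states.

```python
UPPER_ROMAN_BASE = 0x2160  # value 1 of upper-roman-symbols
LOWER_ROMAN_BASE = 0x2170  # value 1 of lower-roman-symbols


def roman_symbol(n, base=UPPER_ROMAN_BASE):
    """Map 1-12 to the single-codepoint Roman numeral symbols (sketch)."""
    if 1 <= n <= 12:
        return chr(base + n - 1)
    return str(n)  # assumption: outside the range, fall back to decimal


print(roman_symbol(12))                   # character at U+216B
print(roman_symbol(3, LOWER_ROMAN_BASE))  # character at U+2172
```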