- From: Eric Muller <emuller@amazon.com>
- Date: Sun, 18 Oct 2020 16:20:42 -0700
- To: <public-jlreq-admin@w3.org>
- CC: Stanton Marcum <stantonm@amazon.com>
On 10/14/20 1:32 AM, 木田泰夫 wrote:
> sorry English speakers but I would appreciate it if you could send
> your comments on the email list
Here is our perspective as implementers. It is a bit raw (sorry, we
noticed the announcement a bit late), don't hesitate to reach out for
clarification.
Eric.
---
Character classes serve two purposes: linebreak opportunities and
spacing around characters.
Linebreak opportunities are adequately handled by Unicode currently, at
most needing some adjustment in UAX14 or in the CLDR language
tailorings. Therefore that use is not discussed here.
---
A possible spacing model is that there is glue (variable space) on each
side of each grapheme cluster occurrence. This glue is characterized by
its natural width (JLREQ appendix B) and can be deformed (either
compressed - JLREQ appendix D - or expanded - JLREQ appendix E) to
achieve justification.
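A minimal sketch of such a glue in Python, to make the model concrete. The name Glue, its fields, and the sample values are assumptions made for illustration; they do not come from JLREQ or from any implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glue:
    """Variable space on one side of a grapheme cluster, in em units."""
    natural: float    # natural width (the role of JLREQ appendix B)
    min_width: float  # lower bound when compressing (appendix D)
    max_width: float  # upper bound when expanding (appendix E)

    def clamp(self, requested: float) -> float:
        """Return the requested width, limited to what this glue allows."""
        return max(self.min_width, min(self.max_width, requested))


# Illustrative values only: a half-em glue that may shrink to zero
# and stretch without limit.
half_em = Glue(natural=0.5, min_width=0.0, max_width=float("inf"))
```

A justification pass would ask each glue for a width and get back the nearest width the glue can actually take.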
While each glue occurrence could be specified explicitly via markup, it
can be determined most of the time from its context, using classes: for
a left glue, by the class of what's on the left of the grapheme cluster
occurrence and by the class of the grapheme cluster occurrence itself;
and similarly for a right glue, by the class of the grapheme cluster
occurrence and by the class of what's on the right of the grapheme
cluster occurrence.
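The pair-based lookup could be sketched as follows. The two tables, the context class names lineStart and lineEnd, and the default of 0 are hypothetical; the widths echo the bracket behavior discussed later in this message and are only illustrative.

```python
# Hypothetical natural widths in em.
# LEFT_GLUE is keyed by (class of what is on the left, class of the cluster).
LEFT_GLUE = {
    ("lineStart", "openingBracket_other"): 0.0,
    ("ideographic", "openingBracket_other"): 0.5,
}

# RIGHT_GLUE is keyed by (class of the cluster, class of what is on the right).
RIGHT_GLUE = {
    ("closingBracket_other", "ideographic"): 0.5,
    ("closingBracket_other", "lineEnd"): 0.5,
}


def left_glue(left_class: str, own_class: str) -> float:
    """Natural width of the glue on the left of a cluster occurrence."""
    return LEFT_GLUE.get((left_class, own_class), 0.0)


def right_glue(own_class: str, right_class: str) -> float:
    """Natural width of the glue on the right of a cluster occurrence."""
    return RIGHT_GLUE.get((own_class, right_class), 0.0)
```

Keeping the left and right tables separate avoids counting the same space twice when two clusters are adjacent.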
What's on the left (or right) of a grapheme cluster occurrence may be
another grapheme cluster occurrence, in which case the class of "what's
on the left" is the class of that other grapheme cluster occurrence. But
it can also be that there is no other grapheme cluster occurrence on the
left, or there is some intervening graphical element, thus leading to
classes:
- the beginning (or end) of a paragraph
- the beginning (or end) of a line
- a different bidi level (the purpose of this class is to avoid
involving the bidi reordering when measuring lines)
- the inside of a box with non-zero margin, border or padding
- the outside of such a box
- an inline object (e.g. image)
- a TCY element
- the outside or inside of a warichu element
The class of a grapheme cluster occurrence could also be specified
explicitly by markup, but it can often be determined from the characters
composing the grapheme cluster occurrence (at which point, it is the
same for all occurrences of a given grapheme cluster). That can in turn
be determined from classes assigned to the characters in the grapheme
cluster. Generally, the base character of a grapheme cluster determines
the class of the grapheme cluster, but there are cases where the other
characters "dominate" the determination: for example, <U+00A0 NO-BREAK
SPACE> may be in a class, and <U+00A0 U+0301 COMBINING ACUTE> may be in
a different class.
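A sketch of that determination, assuming a per-character table plus a small table of whole-cluster overrides. Both tables are hypothetical excerpts, and the class given to <U+00A0 U+0301> is an assumption chosen only to show the mechanism.

```python
# Hypothetical excerpt of a per-character class table.
CHAR_CLASS = {
    "\u00A0": "justifyingSpace",
    "\u3042": "hiragana",
}

# Hypothetical overrides for clusters where a non-base character dominates.
CLUSTER_OVERRIDE = {
    "\u00A0\u0301": "westernChar",  # assumed class, for illustration only
}


def cluster_class(cluster: str) -> str:
    """Class of a grapheme cluster, given as a string of its characters."""
    if cluster in CLUSTER_OVERRIDE:
        return CLUSTER_OVERRIDE[cluster]
    # Otherwise the base (first) character decides.
    return CHAR_CLASS.get(cluster[0], "unknown")
```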
Finally, we arrive at the classes of characters. Below is a proposed
assignment for the whole Unicode repertoire. This classification mostly
aligns with that of JLREQ, with a few differences:
- for unassigned code points (in the Unicode sense), the class is a
prediction based on the likely future allocation of those code points
- JLREQ simply ignores the existence of the full width characters at
U+FFxx. This leads to a number of "ambiguous" characters, such as U+0041
LATIN CAPITAL LETTER A, where JLREQ says both "an occurrence of U+0041
could be in the Western class" (A.27) and "an occurrence of U+0041 could
be in the Ideographic class" (A.19). In practice, authors routinely use
U+0041 and U+FF21 precisely to disambiguate the class to use.
- it distinguishes the class used in horizontal and in vertical texts
- it distinguishes the inseparables (see below)
- it uses the InDesign refinement of the opening and closing classes
(square, rounded, other)
The proposed assignment also mentions the UAX50 vertical orientation
property, as it is closely aligned and informs the spacing class assignment.
---
Ambiguous characters
While most characters are unambiguously in a class, regardless of their
context, a few characters common in Japanese typography are inherently
ambiguous:
U+2018 ‘ LEFT SINGLE QUOTATION MARK
U+201C “ LEFT DOUBLE QUOTATION MARK
U+00AB « LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
U+2019 ’ RIGHT SINGLE QUOTATION MARK
U+201D ” RIGHT DOUBLE QUOTATION MARK
U+00BB » RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
U+2010 ‐ HYPHEN
U+2013 – EN DASH
U+203C ‼ DOUBLE EXCLAMATION MARK
U+2047 ⁇ DOUBLE QUESTION MARK
U+2048 ⁈ QUESTION EXCLAMATION MARK
U+2049 ⁉ EXCLAMATION QUESTION MARK
U+00B7 · MIDDLE DOT
U+2022 • BULLET
U+2014 — EM DASH
U+2026 … HORIZONTAL ELLIPSIS
U+2025 ‥ TWO DOT LEADER
A possibility is to resolve those based on the locale, or their resolved
script (itself determined by looking at the script of the adjacent
character).
The locale method has the downside that authors are not always tagging
their text appropriately (either not at all, or not carefully on
punctuation).
The script method has the advantage of not requiring the author's help,
and that computation is already necessary in OpenType layout engines.
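The script method can be sketched as below, following the rule given for the A column of the table at the end of this message. The function name and the idea of passing the resolved script in as a string are assumptions; how the script is resolved from adjacent characters is left out.

```python
# Scripts for which an ambiguous character keeps its table class,
# per the A column rule in the table legend.
EAST_ASIAN_SCRIPTS = {"Hans", "Hant", "Jpan"}


def resolve_class(table_class: str, ambiguous: bool, resolved_script: str) -> str:
    """Pick the class of a character occurrence from its resolved script."""
    if ambiguous and resolved_script not in EAST_ASIAN_SCRIPTS:
        return "westernChar"
    return table_class
```

For example, U+2019 next to Latin text would resolve to westernChar, while the same character in Japanese text would keep closingBracket_other.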
---
Inseparables
Currently, all inseparables are lumped in a single class, and a footnote
explains that the behavior inseparable/inseparable applies only to two
occurrences of the same inseparable. It would be better to have separate
classes for inseparables. Not only does that avoid a footnote, but it
also means that one can specify different glues for e.g.
ideographic/inseparable_emDash and ideographic/inseparable_twoDotLeader,
or specify different glues for
inseparable_emDash/inseparable_twoDotLeader and
inseparable_emDash/inseparable_ellipsis.
---
Logical vs visual order:
It should be made clear that the practical definition of glues is in the
visual space: that's why we used the terms "left" and "right".
---
Classes as a Unicode property
From a practical point of view, I believe that the spacing class should
be part of the Unicode Character Database, as a property, just like the
vertical orientation property. The main reason is that this is the most
reliable way to get something well defined (in the sense of having a
definition, not necessarily in the sense of having correct values), and
in sync with the Unicode repertoire. It is a relatively easy task for
Unicode, as has been demonstrated with the vertical orientation
property. (In fact, the very first draft of what became UAX50 included
the spacing class.)
It is worth noting that such a Unicode property is only a starting
point. As noted earlier, markup should always be available to influence
the determination of the glue. Thus there is no need for such a Unicode
property to be perfect; it does however need to be easily accessible and
fairly stable.
========
Classes and glue settings
The classes are only one part of the final visual appearance: the glue
settings also come into play, so it is worth discussing those a bit, as
they may influence the design of the classes.
---
Glue settings and justification
When justifying text (a common case for body text), implementations may
have to expand a glue to an arbitrary width. Consider for example a
two-character paragraph with text-align-last: justify; the glue has to
be (linewidth - 2em). While large glues are sometimes the result of
pathological conditions, they can also be explicitly intended, such as
in jidori processing. Thus it is desirable to allow pretty much all
glues to grow indefinitely.
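The arithmetic can be written out as a small function. This is a simplified sketch: it assumes a single glue between each pair of adjacent clusters, no glue at the line edges, and extra space shared equally, none of which is prescribed here.

```python
from typing import Sequence


def justification_glue(line_width_em: float, advances_em: Sequence[float]) -> float:
    """Width each glue must take so the clusters exactly fill the line."""
    gaps = len(advances_em) - 1
    if gaps < 1:
        raise ValueError("need at least two clusters to justify")
    return (line_width_em - sum(advances_em)) / gaps
```

With two 1em characters on a 20em line, the single glue has to be 18em, which is (linewidth - 2em).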
---
Glue settings are mostly for body text
JLREQ currently describes three glue settings (default, JIS, and book,
in tables 3-5 of appendix D; they differ only on the behavior when
compressing lines, but in principle different settings could also differ
on natural width or when expanding lines). It seems that those settings
are mostly concerned with body text, and are not appropriate for, e.g.,
titles. For example, the default method specifies 0 glue between
paragraph (line) start and an opening bracket, and 0.5em between a
closing bracket and a paragraph (line) end; for a title starting and
ending with brackets, which happens to be set on two lines (centered and
not justified), this asymmetry can be jarring.
It would be worth stating that the settings apply to body text and
mentioning when they are not appropriate, or even better including
settings for other situations. The most important situations that come
to mind: titles, and ruby base/ruby text.
---
Interchange of glue settings
The discussion so far has been about determining the classes from
characters, leaving room for document styling systems (e.g. CSS) to let
authors explicitly specify classes of occurrences. The classification is
of course only one part of the final result, the other being the glues
that result from those classes (i.e. JLREQ appendices B, D, E). It would
be useful to encourage document styling systems to allow the
specification of the glues as well, in the documents, either in the form
of selecting from a predetermined set of settings, or by completely
specifying the settings (maybe as a delta on top of the predetermined
settings).
---
Spacing classes and the CSS text-indent property.
With the model presented above, the CSS text-indent property is
essentially an unconditional, invariable glue to the left of the
first grapheme in a paragraph. In practice, it is useful in Japanese
typography to make that glue at least conditional: e.g. 1em before an
ideograph, and 0.5em before an opening bracket. I think the best way
forward is to recommend that for paragraphs using the spacing model
discussed here, that glue be controlled by the spacing model (i.e. the
mojikumi tables) and that text-indent be set to 0.
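A sketch of a conditional paragraph-start glue, using the two values mentioned above. The function name is hypothetical, and applying 1em to every class other than opening brackets is an assumption made to keep the example short.

```python
def paragraph_start_glue(first_class: str) -> float:
    """Glue in em to the left of the first cluster of a paragraph."""
    if first_class.startswith("openingBracket"):
        return 0.5  # 0.5em before an opening bracket
    return 1.0      # 1em before an ideograph (assumed for other classes too)
```

With text-indent set to 0, this value alone would determine where the first line starts.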
========
Columns:
- code point
- UAX50 vertical orientation
- H: the class for horizontal text is in column 5
blank: the class for horizontal text is ideographic
- V: the class for vertical text is in column 5
blank: the class for vertical text is ideographic
- class
- A: if the resolved script is not Hans, Hant, Jpan -> westernChar
0x000000 | R | H | V | unknown
0x000009 | R | H | V | tab
0x00000A | R | H | V | lineEdge
0x00000B | R | H | V | unknown
0x00000D | R | H | V | lineEdge
0x00000E | R | H | V | unknown
0x000020 | R | H | justifyingSpace
0x000021 | R | H | westernChar
0x000080 | R | H | V | unknown
0x000085 | R | H | V | lineEdge
0x000086 | R | H | V | unknown
0x0000A0 | R | H | justifyingSpace
0x0000A1 | R | H | westernChar
0x0000A7 | U | H | westernChar
0x0000A8 | R | H | westernChar
0x0000A9 | U | H | westernChar
0x0000AA | R | H | westernChar
0x0000AB | R | H | V | openingBracket_other
0x0000AC | R | H | westernChar
0x0000AD | R | H | V | unknown
0x0000AE | U | H | westernChar
0x0000AF | R | H | westernChar
0x0000B0 | R | H | postfixedAbbrev
0x0000B1 | U | H | westernChar
0x0000B2 | R | H | westernChar
0x0000BB | R | H | V | closingBracket_other
0x0000BC | U | H | westernChar
0x0000BF | R | H | westernChar
0x0000D7 | U | H | westernChar
0x0000D8 | R | H | westernChar
0x0000F7 | U | H | westernChar
0x0000F8 | R | H | westernChar
0x0002EA | U | H | V | ideographic
0x0002EC | R | H | westernChar
0x001100 | U | H | V | ideographic
0x001200 | R | H | westernChar
0x001401 | U | H | westernChar
0x001680 | R | H | westernChar
0x0018B0 | U | H | westernChar
0x001900 | R | H | westernChar
0x00200B | R | H | V | transparent
0x00200D | R | H | V | unknown
0x002010 | R | H | V | hyphen_middlePunctuation
0x002014 | R | H | inseparable_emDash
0x002016 | U | H | westernChar
0x002017 | R | H | westernChar
0x002018 | R | H | V | openingBracket_other
0x002019 | R | H | V | closingBracket_other | A
0x00201A | R | H | westernChar
0x00201C | R | H | V | openingBracket_other
0x00201D | R | H | V | closingBracket_other
0x00201E | R | H | westernChar
0x002020 | U | H | westernChar
0x002022 | R | H | westernChar
0x002025 | R | H | inseparable_twoDotLeader
0x002026 | R | H | inseparable_ellipsis
0x002027 | R | H | westernChar
0x002028 | R | H | V | lineEdge
0x00202A | R | H | V | unknown
0x00202F | R | H | westernChar
0x002030 | U | H | V | postfixedAbbrev
0x002032 | R | H | V | postfixedAbbrev
0x002034 | R | H | westernChar
0x00203B | U | H | V | ideographic
0x00203C | U | H | V | dividingPunctuation
0x00203D | R | H | westernChar
0x002042 | U | H | westernChar
0x002043 | R | H | westernChar
0x002047 | U | H | V | dividingPunctuation
0x00204A | R | H | westernChar
0x002051 | U | H | westernChar
0x002052 | R | H | westernChar
0x00205F | R | H | westernChar
0x002060 | R | H | V | unknown
0x002065 | U | H | V | ideographic
0x002066 | R | H | V | unknown
0x002070 | R | H | westernChar
0x0020AC | R | H | V | prefixedAbbrev
0x0020AD | R | H | westernChar
0x0020DD | U | H | westernChar
0x0020E1 | R | H | westernChar
0x0020E2 | U | H | westernChar
0x0020E5 | R | H | westernChar
0x002100 | U | H | V | ideographic
0x002102 | R | H | westernChar
0x002103 | U | H | V | postfixedAbbrev
0x002104 | U | H | V | ideographic
0x002109 | U | H | V | postfixedAbbrev
0x00210A | R | H | westernChar
0x00210F | U | H | V | ideographic
0x002110 | R | H | westernChar
0x002113 | U | H | V | postfixedAbbrev
0x002114 | U | H | V | ideographic
0x002115 | R | H | westernChar
0x002116 | U | H | V | prefixedAbbrev
0x002117 | U | H | V | ideographic
0x002118 | R | H | westernChar
0x00211E | U | H | V | ideographic
0x002124 | R | H | westernChar
0x002125 | U | H | V | ideographic
0x002126 | R | H | westernChar
0x002127 | U | H | V | ideographic
0x002128 | R | H | westernChar
0x002129 | U | H | V | ideographic
0x00212A | R | H | westernChar
0x00212E | U | H | V | ideographic
0x00212F | R | H | westernChar
0x002135 | U | H | V | ideographic
0x002140 | R | H | westernChar
0x002145 | U | H | V | ideographic
0x00214B | R | H | westernChar
0x00214C | U | H | V | ideographic
0x00214E | R | H | westernChar
0x00214F | U | H | V | ideographic
0x00218A | R | H | westernChar
0x00218C | U | H | V | ideographic
0x002190 | R | H | V | ideographic
0x00221E | U | H | V | ideographic
0x00221F | R | H | V | ideographic
0x002234 | U | H | V | ideographic
0x002236 | R | H | V | ideographic
0x002300 | U | H | V | ideographic
0x002308 | R | H | V | ideographic
0x00230C | U | H | V | ideographic
0x002320 | R | H | V | ideographic
0x002324 | U | H | V | ideographic
0x002329 | Tr | H | V | openingBracket_other
0x00232A | Tr | H | V | closingBracket_other
0x00232B | U | H | V | ideographic
0x00232C | R | H | V | ideographic
0x00237D | U | H | V | ideographic
0x00239B | R | H | V | ideographic
0x0023BE | U | H | V | ideographic
0x0023CE | R | H | V | ideographic
0x0023CF | U | H | V | ideographic
0x0023D0 | R | H | V | ideographic
0x0023D1 | U | H | V | ideographic
0x0023DC | R | H | V | ideographic
0x0023E2 | U | H | V | ideographic
0x002423 | R | H | westernChar
0x002424 | U | H | V | ideographic
0x002500 | R | H | inseparable_emDash
0x002580 | R | H | westernChar
0x0025A0 | U | H | V | ideographic
0x00261A | R | H | V | ideographic
0x002620 | U | H | V | ideographic
0x002768 | R | H | westernChar
0x002776 | U | H | V | ideographic
0x002794 | R | H | V | ideographic
0x002800 | R | H | westernChar
0x002900 | R | H | V | ideographic
0x002B12 | U | H | V | ideographic
0x002B30 | R | H | V | ideographic
0x002B50 | U | H | V | ideographic
0x002B5A | R | H | V | ideographic
0x002BB8 | U | H | V | ideographic
0x002BD2 | R | H | V | ideographic
0x002BD3 | U | H | V | ideographic
0x002BEC | R | H | V | ideographic
0x002BF0 | U | H | V | ideographic
0x002C00 | R | H | westernChar
0x002E80 | U | H | V | ideographic
0x003000 | U | H | V | fullSpace
0x003001 | Tu | H | V | comma_ideo
0x003002 | Tu | H | V | fullStop_ideo
0x003003 | U | H | V | ideographic
0x003005 | U | H | V | iterationMark
0x003006 | U | H | V | ideographic
0x003008 | Tr | H | V | openingBracket_other
0x003009 | Tr | H | V | closingBracket_other
0x00300A | Tr | H | V | openingBracket_other
0x00300B | Tr | H | V | closingBracket_other
0x00300C | Tr | H | V | openingBracket_corner
0x00300D | Tr | H | V | closingBracket_corner
0x00300E | Tr | H | V | openingBracket_corner
0x00300F | Tr | H | V | closingBracket_corner
0x003010 | Tr | H | V | openingBracket_other
0x003011 | Tr | H | V | closingBracket_other
0x003012 | U | H | V | ideographic
0x003014 | Tr | H | V | openingBracket_other
0x003015 | Tr | H | V | closingBracket_other
0x003016 | Tr | H | V | openingBracket_other
0x003017 | Tr | H | V | closingBracket_other
0x003018 | Tr | H | V | openingBracket_other
0x003019 | Tr | H | V | closingBracket_other
0x00301A | Tr | H | V | openingBracket_corner
0x00301B | Tr | H | V | closingBracket_corner
0x00301C | Tr | H | V | hyphen_other
0x00301D | Tr | H | V | openingBracket_other
0x00301E | Tr | H | V | closingBracket_other
0x003020 | U | H | V | ideographic
0x003030 | Tr | H | V | ideographic
0x003031 | U | H | V | ideographic
0x003033 | U | H | V | inseparable_repeatUpper
0x003034 | U | H | V | inseparable_repeatVoiceUpper
0x003035 | U | H | V | inseparable_repeatLower
0x003036 | U | H | V | ideographic
0x00303B | U | H | V | iterationMark
0x00303C | U | H | V | ideographic
0x003040 | U | H | V | hiragana
0x003041 | Tu | H | V | smallKana
0x003042 | U | H | V | hiragana
0x003043 | Tu | H | V | smallKana
0x003044 | U | H | V | hiragana
0x003045 | Tu | H | V | smallKana
0x003046 | U | H | V | hiragana
0x003047 | Tu | H | V | smallKana
0x003048 | U | H | V | hiragana
0x003049 | Tu | H | V | smallKana
0x00304A | U | H | V | hiragana
0x003063 | Tu | H | V | smallKana
0x003064 | U | H | V | hiragana
0x003083 | Tu | H | V | smallKana
0x003084 | U | H | V | hiragana
0x003085 | Tu | H | V | smallKana
0x003086 | U | H | V | hiragana
0x003087 | Tu | H | V | smallKana
0x003088 | U | H | V | hiragana
0x00308E | Tu | H | V | smallKana
0x00308F | U | H | V | hiragana
0x003095 | Tu | H | V | smallKana
0x003097 | U | H | V | hiragana
0x00309B | Tu | H | V | hiragana
0x00309D | U | H | V | iterationMark
0x00309F | U | H | V | hiragana
0x0030A0 | Tr | H | V | hyphen_katakana
0x0030A1 | Tu | H | V | smallKana
0x0030A2 | U | H | V | katakana
0x0030A3 | Tu | H | V | smallKana
0x0030A4 | U | H | V | katakana
0x0030A5 | Tu | H | V | smallKana
0x0030A6 | U | H | V | katakana
0x0030A7 | Tu | H | V | smallKana
0x0030A8 | U | H | V | katakana
0x0030A9 | Tu | H | V | smallKana
0x0030AA | U | H | V | katakana
0x0030C3 | Tu | H | V | smallKana
0x0030C4 | U | H | V | katakana
0x0030E3 | Tu | H | V | smallKana
0x0030E4 | U | H | V | katakana
0x0030E5 | Tu | H | V | smallKana
0x0030E6 | U | H | V | katakana
0x0030E7 | Tu | H | V | smallKana
0x0030E8 | U | H | V | katakana
0x0030EE | Tu | H | V | smallKana
0x0030EF | U | H | V | katakana
0x0030F5 | Tu | H | V | smallKana
0x0030F7 | U | H | V | katakana
0x0030FB | U | H | V | middleDot_middlePunctuation
0x0030FC | Tr | H | V | prolongedSoundMark
0x0030FD | U | H | V | iterationMark
0x0030FF | U | H | V | katakana
0x003100 | U | H | V | ideographic
0x003127 | Tu | H | V | ideographic
0x003128 | U | H | V | ideographic
0x0031F0 | Tu | H | V | smallKana
0x003200 | U | H | V | ideographic
0x003300 | Tu | H | V | ideographic
0x003303 | Tu | H | V | postfixedAbbrev
0x003304 | Tu | H | V | ideographic
0x00330D | Tu | H | V | postfixedAbbrev
0x00330E | Tu | H | V | ideographic
0x003314 | Tu | H | V | postfixedAbbrev
0x003315 | Tu | H | V | ideographic
0x003318 | Tu | H | V | postfixedAbbrev
0x003319 | Tu | H | V | ideographic
0x003322 | Tu | H | V | postfixedAbbrev
0x003324 | Tu | H | V | ideographic
0x003326 | Tu | H | V | postfixedAbbrev
0x003328 | Tu | H | V | ideographic
0x00332B | Tu | H | V | postfixedAbbrev
0x00332C | Tu | H | V | ideographic
0x003336 | Tu | H | V | postfixedAbbrev
0x003337 | Tu | H | V | ideographic
0x00333B | Tu | H | V | postfixedAbbrev
0x00333C | Tu | H | V | ideographic
0x003349 | Tu | H | V | postfixedAbbrev
0x00334B | Tu | H | V | ideographic
0x00334D | Tu | H | V | postfixedAbbrev
0x00334E | Tu | H | V | ideographic
0x003351 | Tu | H | V | postfixedAbbrev
0x003352 | Tu | H | V | ideographic
0x003357 | Tu | H | V | postfixedAbbrev
0x003358 | U | H | V | ideographic
0x003371 | U | H | V | postfixedAbbrev
0x00337B | Tu | H | V | ideographic
0x003380 | U | H | V | postfixedAbbrev
0x0033E0 | U | H | V | ideographic
0x00A4D0 | R | H | westernChar
0x00A960 | U | H | V | ideographic
0x00A980 | R | H | westernChar
0x00AC00 | U | H | V | ideographic
0x00D800 | R | H | westernChar
0x00E000 | U | H | V | ideographic
0x00FB00 | R | H | westernChar
0x00FE10 | U | H | V | ideographic
0x00FE17 | U | H | V | openingBracket_other
0x00FE18 | U | H | V | closingBracket_other
0x00FE19 | U | H | V | ideographic
0x00FE20 | R | H | westernChar
0x00FE30 | U | H | V | inseparable_twoDotLeaderV
0x00FE31 | U | H | V | inseparable_emDashV
0x00FE32 | U | H | V | hyphen_middlePunctuation
0x00FE33 | U | H | V | ideographic
0x00FE35 | U | H | V | openingBracket_round
0x00FE36 | U | H | V | closingBracket_round
0x00FE37 | U | H | V | openingBracket_other
0x00FE38 | U | H | V | closingBracket_other
0x00FE39 | U | H | V | openingBracket_other
0x00FE3A | U | H | V | closingBracket_other
0x00FE3B | U | H | V | openingBracket_other
0x00FE3C | U | H | V | closingBracket_other
0x00FE3D | U | H | V | openingBracket_other
0x00FE3E | U | H | V | closingBracket_other
0x00FE3F | U | H | V | openingBracket_other
0x00FE40 | U | H | V | closingBracket_other
0x00FE41 | U | H | V | openingBracket_corner
0x00FE42 | U | H | V | closingBracket_corner
0x00FE43 | U | H | V | openingBracket_corner
0x00FE44 | U | H | V | closingBracket_corner
0x00FE45 | U | H | V | ideographic
0x00FE47 | U | H | V | openingBracket_other
0x00FE48 | U | H | V | closingBracket_other
0x00FE49 | R | H | westernChar
0x00FE50 | Tu | H | V | ideographic
0x00FE53 | U | H | V | ideographic
0x00FE58 | R | H | V | ideographic
0x00FE59 | Tr | H | V | ideographic
0x00FE5F | U | H | V | ideographic
0x00FE63 | R | H | V | ideographic
0x00FE67 | U | H | V | ideographic
0x00FE70 | R | H | westernChar
0x00FEFF | R | H | V | unknown
0x00FF00 | R | H | westernChar
0x00FF01 | Tu | H | V | dividingPunctuation
0x00FF02 | U | H | V | ideographic
0x00FF03 | U | H | V | prefixedAbbrev
0x00FF05 | U | H | V | postfixedAbbrev
0x00FF06 | U | H | V | ideographic
0x00FF08 | Tr | H | V | openingBracket_round
0x00FF09 | Tr | H | V | closingBracket_round
0x00FF0A | U | H | V | ideographic
0x00FF0C | Tu | H | V | comma_western
0x00FF0D | R | H | V | ideographic
0x00FF0E | Tu | H | V | fullStop_western
0x00FF0F | U | H | V | ideographic
0x00FF1A | Tr | H | V | middleDot_colon
0x00FF1C | R | H | V | ideographic
0x00FF1F | Tu | H | V | dividingPunctuation
0x00FF20 | U | H | V | ideographic
0x00FF3B | Tr | H | V | openingBracket_other
0x00FF3C | U | H | V | ideographic
0x00FF3D | Tr | H | V | closingBracket_other
0x00FF3E | U | H | V | ideographic
0x00FF3F | Tr | H | V | ideographic
0x00FF40 | U | H | V | ideographic
0x00FF5B | Tr | H | V | openingBracket_other
0x00FF5C | Tr | H | V | ideographic
0x00FF5D | Tr | H | V | closingBracket_other
0x00FF5E | Tr | H | V | ideographic
0x00FF5F | Tr | H | V | openingBracket_round
0x00FF60 | Tr | H | V | closingBracket_round
0x00FF61 | R | H | westernChar
0x00FFE0 | U | H | V | postfixedAbbrev
0x00FFE1 | U | H | V | prefixedAbbrev
0x00FFE2 | U | H | V | ideographic
0x00FFE3 | Tr | H | V | ideographic
0x00FFE4 | U | H | V | ideographic
0x00FFE5 | U | H | V | prefixedAbbrev
0x00FFE6 | U | H | V | ideographic
0x00FFE8 | R | H | westernChar
0x00FFF0 | U | H | V | ideographic
0x00FFF9 | R | H | V | transparent
0x00FFFC | U | H | V | inlineObject
0x00FFFD | U | H | V | ideographic
0x00FFFE | R | H | V | unknown
0x010000 | R | H | westernChar
0x010980 | U | H | westernChar
0x0109A0 | R | H | westernChar
0x011580 | U | H | westernChar
0x011600 | R | H | westernChar
0x011A00 | U | H | V | ideographic
0x011AB0 | R | H | westernChar
0x013000 | U | H | westernChar
0x013430 | R | H | westernChar
0x014400 | U | H | westernChar
0x014680 | R | H | westernChar
0x016FE0 | U | H | V | ideographic
0x018B00 | R | H | westernChar
0x01B000 | U | H | V | katakana
0x01B001 | U | H | V | hiragana
0x01B130 | R | H | westernChar
0x01B170 | U | H | V | ideographic
0x01B300 | R | H | westernChar
0x01D000 | U | H | westernChar
0x01D200 | R | H | westernChar
0x01D2E0 | U | H | V | ideographic
0x01D300 | U | H | westernChar
0x01D380 | R | H | westernChar
0x01D800 | U | H | westernChar
0x01DAB0 | R | H | westernChar
0x01F000 | U | H | V | ideographic
0x01F200 | Tu | H | V | ideographic
0x01F202 | U | H | V | ideographic
0x01F800 | R | H | V | ideographic
0x01F900 | U | H | V | ideographic
0x01FA70 | R | H | westernChar
0x020000 | U | H | V | ideographic
0x02FFFE | R | H | V | unknown
0x030000 | U | H | V | ideographic
0x03FFFE | R | H | V | unknown
0x040000 | R | H | westernChar
0x0F0000 | U | H | V | ideographic
0x0FFFFE | R | H | V | unknown
0x100000 | U | H | V | ideographic
0x10FFFE | R | H | V | unknown
0x110000
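A sketch of how the table could be read in Python. It assumes that each row marks the start of a range that runs up to the next row's code point, and that a row with only a code point (such as the final 0x110000) closes the last range; the legend does not state this explicitly. The names parse_row and spacing_class are hypothetical. Only three consecutive rows are loaded, so lookups are meaningful only from 0x20 up to 0x7F.

```python
from bisect import bisect_right

# Three consecutive rows copied from the table above.
EXCERPT = """\
0x000020 | R | H | justifyingSpace
0x000021 | R | H | westernChar
0x000080 | R | H | V | unknown
"""

MARKERS = ("H", "V", "A")


def parse_row(line):
    """Split one row into (start code point, details or None)."""
    fields = [f.strip() for f in line.split("|")]
    start = int(fields[0], 16)
    if len(fields) == 1:
        return start, None  # bare code point, e.g. the final 0x110000
    rest = fields[2:]
    spacing = next(f for f in rest if f not in MARKERS)
    markers = {f for f in rest if f in MARKERS}
    return start, (fields[1], spacing, markers)


TABLE = [parse_row(line) for line in EXCERPT.splitlines()]
STARTS = [start for start, _ in TABLE]


def spacing_class(code_point, vertical=False):
    """Class of a code point, reading each row as the start of a range."""
    i = bisect_right(STARTS, code_point) - 1
    if i < 0 or TABLE[i][1] is None:
        raise ValueError("code point not covered by the table")
    _, spacing, markers = TABLE[i][1]
    wanted = "V" if vertical else "H"
    # A blank H or V column means the class is ideographic in that direction.
    return spacing if wanted in markers else "ideographic"
```

For U+0041 this gives westernChar in horizontal text and ideographic in vertical text, as the legend describes for a row with H but no V.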
===========
Received on Sunday, 18 October 2020 23:23:06 UTC