Re: online meeting to re: Reorganizing JLReq character class

On 10/14/20 1:32 AM, 木田泰夫 wrote:
> sorry English speakers but I would appreciate it if you could send 
> your comments on the email list

Here is our perspective as implementers. It is a bit raw (sorry, we 
noticed the announcement a bit late), so don't hesitate to reach out for 
clarification.

Eric.

---

Character classes serve two purposes: linebreak opportunities and 
spacing around characters.

Linebreak opportunities are adequately handled by Unicode currently, at 
most needing some adjustment in UAX14 or in the CLDR language 
tailorings. Therefore that use is not discussed here.

---

A possible spacing model is that there is glue (variable space) on each 
side of each grapheme cluster occurrence. This glue is characterized by 
its natural width (JLREQ appendix B) and can be deformed (either 
compressed - JLREQ appendix D - or expanded - JLREQ appendix E) to 
achieve justification.

While each glue occurrence could be specified explicitly via markup, it 
can be determined most of the time from its context, using classes: for 
a left glue, by the class of what's on the left of the grapheme cluster 
occurrence and by the class of the grapheme cluster occurrence itself; 
and similarly for a right glue, by the class of the grapheme cluster 
occurrence and by the class of what's on the right of the grapheme 
cluster occurrence.
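
To make the lookup concrete, here is a minimal Python sketch of glue selection by class pair. The class names, the table layout and every numeric value are placeholders chosen for illustration; they are not the JLREQ tables, and the names `Glue`, `left_glue` and `right_glue` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Glue:
    natural: float  # natural width in em (the role of JLREQ appendix B)
    minimum: float  # width when fully compressed (the role of appendix D)
    maximum: float  # width when fully expanded (the role of appendix E)


# Fallback when a pair is not listed: no natural space, free to grow.
ZERO = Glue(0.0, 0.0, math.inf)

# Keyed by (class of what's on the left, class of the cluster itself).
# All values below are illustrative assumptions.
LEFT_GLUE = {
    ("ideographic", "openingBracket"): Glue(0.5, 0.0, math.inf),
    ("lineStart", "openingBracket"): Glue(0.0, 0.0, 0.0),
}

# Keyed by (class of the cluster itself, class of what's on the right).
RIGHT_GLUE = {
    ("closingBracket", "ideographic"): Glue(0.5, 0.0, math.inf),
    ("closingBracket", "lineEnd"): Glue(0.5, 0.0, 0.5),
}


def left_glue(left_class: str, cluster_class: str) -> Glue:
    """Glue on the left side of a cluster, chosen from its context."""
    return LEFT_GLUE.get((left_class, cluster_class), ZERO)


def right_glue(cluster_class: str, right_class: str) -> Glue:
    """Glue on the right side of a cluster, chosen from its context."""
    return RIGHT_GLUE.get((cluster_class, right_class), ZERO)
```

Explicit markup would simply bypass these functions and supply a `Glue` directly for a given occurrence.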

What's on the left (or right) of a grapheme cluster occurrence may be 
another grapheme cluster occurrence, in which case the class of "what's 
on the left" is the class of that other grapheme cluster occurrence. But 
it can also be that there is no other grapheme cluster occurrence on the 
left, or there is some intervening graphical element, thus leading to 
classes:

- the beginning (or end) of a paragraph
- the beginning (or end) of a line
- a different bidi level (the purpose of this class is to avoid 
involving the bidi reordering when measuring lines)
- the inside of a box with non-zero margin, border or padding
- the outside of such a box
- an inline object (e.g. image)
- a TCY element
- the outside or inside of a warichu element


The class of a grapheme cluster occurrence could also be specified 
explicitly by markup, but it can often be determined from the characters 
composing the grapheme cluster occurrence (at which point, it is the 
same for all occurrences of a given grapheme cluster). That can in turn 
be determined from classes assigned to the characters in the grapheme 
cluster. Generally, the base character of a grapheme cluster determines 
the class of the grapheme cluster, but there are cases where the other 
characters "dominate" the determination: for example, <U+00A0 NO-BREAK 
SPACE> may be in a class, and <U+00A0 U+0301 COMBINING ACUTE> may be in 
a different class.
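
As a rough Python sketch of that determination: the base character decides by default, and a small override table handles the clusters where another character dominates. The class given to <U+00A0 U+0301> below is a placeholder (the text above only says it may differ), and `CHAR_CLASS`, `CLUSTER_OVERRIDES` and `cluster_class` are hypothetical names.

```python
# Per-character classes; U+00A0 follows the table at the end of this message.
CHAR_CLASS = {
    "\u00A0": "justifyingSpace",
    "a": "westernChar",
}

# Whole clusters whose class is not that of their base character.
# The value here is an assumption made for illustration only.
CLUSTER_OVERRIDES = {
    "\u00A0\u0301": "westernChar",
}


def cluster_class(cluster: str) -> str:
    """Class of a grapheme cluster, given as a string of code points."""
    if cluster in CLUSTER_OVERRIDES:
        return CLUSTER_OVERRIDES[cluster]
    # Default rule: the base (first) character determines the class.
    return CHAR_CLASS.get(cluster[0], "unknown")
```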

Finally, we arrive at the classes of characters. Below is a proposed 
assignment for the whole Unicode repertoire. This classification mostly 
aligns with that of JLREQ, with a few differences:

- for unassigned code points (in the Unicode sense), the class is a 
prediction based on the likely future allocation of those code points

- JLREQ simply ignores the existence of the full width characters at 
U+FFxx. This leads to a number of "ambiguous" characters, such as U+0041 
LATIN CAPITAL LETTER A, where JLREQ says both "an occurrence of U+0041 
could be in the Western class" (A.27) and "an occurrence of U+0041 could 
be in the Ideographic class" (A.19). In practice, authors routinely use 
U+0041 and U+FF21 precisely to disambiguate the class to use.

- it distinguishes the class used in horizontal and in vertical texts

- it distinguishes the inseparables (see below)

- it uses the InDesign refinement of the opening and closing classes 
(square, rounded, other)

The proposed assignment also mentions the UAX50 vertical orientation 
property, as it is closely aligned with, and informs, the spacing class 
assignment.

---
Ambiguous characters

While most characters are unambiguously in a class, regardless of their 
context, a few characters common in Japanese typography are inherently 
ambiguous:


     U+2018 ‘ LEFT SINGLE QUOTATION MARK
     U+201C “ LEFT DOUBLE QUOTATION MARK
     U+00AB « LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
     U+2019 ’ RIGHT SINGLE QUOTATION MARK
     U+201D ” RIGHT DOUBLE QUOTATION MARK
     U+00BB » RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
     U+2010 ‐ HYPHEN
     U+2013 – EN DASH
     U+203C ‼ DOUBLE EXCLAMATION MARK
     U+2047 ⁇ DOUBLE QUESTION MARK
     U+2048 ⁈ QUESTION EXCLAMATION MARK
     U+2049 ⁉ EXCLAMATION QUESTION MARK
     U+00B7 · MIDDLE DOT
     U+2022 • BULLET
     U+2014 — EM DASH
     U+2026 … HORIZONTAL ELLIPSIS
     U+2025 ‥ TWO DOT LEADER

A possibility is to resolve those based on the locale, or their resolved 
script (itself determined by looking at the script of the adjacent 
character).

The locale method has the downside that authors are not always tagging 
their text appropriately (either not at all, or not carefully on 
punctuation).

The script method has the advantage of not requiring the author's help, 
and that computation is already necessary in OpenType layout engines.
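
A minimal Python sketch of the script method for the four curly quotation marks follows. It makes several assumptions: `looks_cjk` is a crude code point range test standing in for real script resolution, the neighbor search order (left, then right, skipping spaces and other ambiguous characters) is arbitrary, and the fallback to westernChar when no neighbor is usable is a guess. The CJK-side classes are those of the table at the end of this message.

```python
# Class used when the resolved script is Hans, Hant or Jpan.
AMBIGUOUS_CJK_CLASS = {
    "\u2018": "openingBracket_other",
    "\u2019": "closingBracket_other",
    "\u201C": "openingBracket_other",
    "\u201D": "closingBracket_other",
}


def looks_cjk(ch: str) -> bool:
    """Crude stand-in for script resolution: kana and CJK unified ideographs."""
    cp = ord(ch)
    return 0x3040 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF


def resolve_class(text: str, i: int) -> str:
    """Resolve the class of the ambiguous character at text[i]."""
    cjk_class = AMBIGUOUS_CJK_CLASS[text[i]]
    for j in (i - 1, i + 1):  # adjacent character on the left, then right
        if 0 <= j < len(text):
            neighbor = text[j]
            if neighbor in AMBIGUOUS_CJK_CLASS or neighbor.isspace():
                continue  # carries no script information
            return cjk_class if looks_cjk(neighbor) else "westernChar"
    return "westernChar"  # assumption: no usable neighbor
```

A layout engine that already resolves scripts for OpenType shaping would reuse that result instead of `looks_cjk`.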

---
Inseparables

Currently, all inseparables are lumped in a single class, and a footnote 
explains that the behavior inseparable/inseparable applies only to two 
occurrences of the same inseparable. It would be better to have separate 
classes for inseparables. Not only does that avoid a footnote, but it 
also means that one can specify different glues for e.g. 
ideographic/inseparable_emDash and ideographic/inseparable_twoDotLeader, 
or specify different glues for 
inseparable_emDash/inseparable_twoDotLeader and 
inseparable_emDash/inseparable_ellipsis.

---
Logical vs visual order:

It should be made clear that the practical definition of glues is in the 
visual space: that's why we used the terms "left" and "right".

---
Classes as a Unicode property

From a practical point of view, I believe that the spacing class should 
be part of the Unicode Character Database, as a property, just like the 
vertical orientation property. The main reason is that this is the most 
reliable way to get something well defined (in the sense of having a 
definition, not necessarily in the sense of having correct values), and 
in sync with the Unicode repertoire. It is a relatively easy task for 
Unicode, as has been demonstrated with the vertical orientation 
property. (In fact, the very first draft of what became UAX50 included 
the spacing class.)

It is worth noting that such a Unicode property is only a starting 
point. As noted earlier, markup should always be available to influence 
the determination of the glue. Thus there is no need for such a Unicode 
property to be perfect; it does however need to be easily accessible and 
fairly stable.

========
Classes and glue settings

The classes are only one part of the final visual appearance: the glue 
settings also come into play, so it is worth discussing those a bit, as 
they may influence the design of the classes.

---
Glue settings and justification

When justifying text (a common case for body text), an implementation 
may have to expand a glue to an arbitrary width. Consider for example a 
two-character paragraph with text-align-last: justify: the glue has to 
be (linewidth - 2em). While large glues are sometimes the result of 
pathological conditions, they can also be explicitly intended, such as 
in jidori processing. Thus it is desirable to allow pretty much all 
glues to grow indefinitely.
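
The arithmetic can be sketched in Python as follows. This is a simplification under stated assumptions: the right glue of one cluster and the left glue of the next are merged into a single glue between them, every glue may grow without limit, and the slack is shared equally. The function name `justified_glues` is hypothetical; all widths are in em.

```python
def justified_glues(line_width, cluster_widths, natural_glues):
    """Expanded widths of the glues between clusters on a justified line.

    natural_glues[k] is the natural width of the glue between
    cluster k and cluster k + 1.
    """
    if len(cluster_widths) < 2:
        raise ValueError("need at least two clusters to justify")
    if len(natural_glues) != len(cluster_widths) - 1:
        raise ValueError("expected one glue between each pair of clusters")
    # Space left over once clusters and natural glues are placed.
    slack = line_width - sum(cluster_widths) - sum(natural_glues)
    share = slack / len(natural_glues)
    return [glue + share for glue in natural_glues]


# Two full-width characters on a 20em line: the single glue becomes
# linewidth - 2em.
print(justified_glues(20.0, [1.0, 1.0], [0.0]))  # [18.0]
```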

---
Glue settings are mostly for body text

JLREQ currently describes three glue settings (default, JIS, and book, 
in tables 3-5 of appendix D; they differ only on the behavior when 
compressing lines, but in principle different settings could also differ 
on natural width or when expanding lines). It seems that those settings 
are mostly concerned with body text, and are not appropriate for, e.g., 
titles. For example, the default method specifies 0 glue between 
paragraph (line) start and an opening bracket, and 0.5em between a 
closing bracket and a paragraph (line) end; for a title starting and 
ending with brackets, which happens to be set on two lines (centered and 
not justified), this asymmetry can be jarring.

It would be worth stating explicitly that the settings apply to body 
text and mentioning when they are not appropriate, or even better, 
including settings for other situations. The most important situations 
that come to mind are titles and ruby base/ruby text.

---
Interchange of glue settings

The discussion so far has been about determining the classes from 
characters, leaving room for document styling systems (e.g. CSS) to let 
authors explicitly specify classes of occurrences. The classification is 
of course only one part of the final result, the other being the glues 
that result from those classes (i.e. JLREQ appendices B, D, E). It would 
be useful to encourage document styling systems to allow the 
specification of the glues as well, in the documents, either in the form 
of selecting from a predetermined set of settings, or by completely 
specifying the settings (maybe as a delta on top of the predetermined 
settings).

---
Spacing classes and the CSS text-indent property.

With the model presented above, the CSS text-indent property is 
essentially an unconditional, invariable glue to the left of the 
first grapheme in a paragraph. In practice, it is useful in Japanese 
typography to make that glue at least conditional: e.g. 1em before an 
ideograph, and 0.5em before an opening bracket. I think the best way 
forward is to recommend that for paragraphs using the spacing model 
discussed here, that glue be controlled by the spacing model (i.e. the 
mojikumi tables) and that text-indent be set to 0.
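
In code, the difference is between a constant and a lookup. The Python sketch below uses the two values given as examples above; the default for other classes and the names `PARAGRAPH_START_GLUE` and `paragraph_start_glue` are assumptions made for illustration.

```python
# Natural width, in em, of the glue between the paragraph start and the
# first grapheme cluster, keyed by the class of that cluster.
PARAGRAPH_START_GLUE = {
    "ideographic": 1.0,
    "openingBracket": 0.5,
}


def paragraph_start_glue(first_class: str, default: float = 1.0) -> float:
    """Conditional replacement for a fixed text-indent.

    The default for unlisted classes is an assumption, not a recommendation.
    """
    return PARAGRAPH_START_GLUE.get(first_class, default)
```

With text-indent set to 0, this glue is the only source of indentation, so the two mechanisms cannot add up.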

========

Columns:

   - code point
   - UAX50 vertical orientation
   - H:     the class for horizontal text is in column 5
     blank: the class for horizontal text is ideographic
   - V:     the class for vertical text is in column 5
     blank: the class for vertical text is ideographic
   - class
   - A:     if the resolved script is not Hans, Hant, Jpan -> westernChar


         0x000000 | R  | H | V | unknown
         0x000009 | R  | H | V | tab
         0x00000A | R  | H | V | lineEdge
         0x00000B | R  | H | V | unknown
         0x00000D | R  | H | V | lineEdge
         0x00000E | R  | H | V | unknown
         0x000020 | R  | H     | justifyingSpace
         0x000021 | R  | H     | westernChar
         0x000080 | R  | H | V | unknown
         0x000085 | R  | H | V | lineEdge
         0x000086 | R  | H | V | unknown
         0x0000A0 | R  | H     | justifyingSpace
         0x0000A1 | R  | H     | westernChar
         0x0000A7 | U  | H     | westernChar
         0x0000A8 | R  | H     | westernChar
         0x0000A9 | U  | H     | westernChar
         0x0000AA | R  | H     | westernChar
         0x0000AB | R  | H | V | openingBracket_other
         0x0000AC | R  | H     | westernChar
         0x0000AD | R  | H | V | unknown
         0x0000AE | U  | H     | westernChar
         0x0000AF | R  | H     | westernChar
         0x0000B0 | R  | H     | postfixedAbbrev
         0x0000B1 | U  | H     | westernChar
         0x0000B2 | R  | H     | westernChar
         0x0000BB | R  | H | V | closingBracket_other
         0x0000BC | U  | H     | westernChar
         0x0000BF | R  | H     | westernChar
         0x0000D7 | U  | H     | westernChar
         0x0000D8 | R  | H     | westernChar
         0x0000F7 | U  | H     | westernChar
         0x0000F8 | R  | H     | westernChar
         0x0002EA | U  | H | V | ideographic
         0x0002EC | R  | H     | westernChar
         0x001100 | U  | H | V | ideographic
         0x001200 | R  | H     | westernChar
         0x001401 | U  | H     | westernChar
         0x001680 | R  | H     | westernChar
         0x0018B0 | U  | H     | westernChar
         0x001900 | R  | H     | westernChar
         0x00200B | R  | H | V | transparent
         0x00200D | R  | H | V | unknown
         0x002010 | R  | H | V | hyphen_middlePunctuation
         0x002014 | R  | H     | inseparable_emDash
         0x002016 | U  | H     | westernChar
         0x002017 | R  | H     | westernChar
         0x002018 | R  | H | V | openingBracket_other
         0x002019 | R  | H | V | closingBracket_other          | A
         0x00201A | R  | H     | westernChar
         0x00201C | R  | H | V | openingBracket_other
         0x00201D | R  | H | V | closingBracket_other
         0x00201E | R  | H     | westernChar
         0x002020 | U  | H     | westernChar
         0x002022 | R  | H     | westernChar
         0x002025 | R  | H     | inseparable_twoDotLeader
         0x002026 | R  | H     | inseparable_ellipsis
         0x002027 | R  | H     | westernChar
         0x002028 | R  | H | V | lineEdge
         0x00202A | R  | H | V | unknown
         0x00202F | R  | H     | westernChar
         0x002030 | U  | H | V | postfixedAbbrev
         0x002032 | R  | H | V | postfixedAbbrev
         0x002034 | R  | H     | westernChar
         0x00203B | U  | H | V | ideographic
         0x00203C | U  | H | V | dividingPunctuation
         0x00203D | R  | H     | westernChar
         0x002042 | U  | H     | westernChar
         0x002043 | R  | H     | westernChar
         0x002047 | U  | H | V | dividingPunctuation
         0x00204A | R  | H     | westernChar
         0x002051 | U  | H     | westernChar
         0x002052 | R  | H     | westernChar
         0x00205F | R  | H     | westernChar
         0x002060 | R  | H | V | unknown
         0x002065 | U  | H | V | ideographic
         0x002066 | R  | H | V | unknown
         0x002070 | R  | H     | westernChar
         0x0020AC | R  | H | V | prefixedAbbrev
         0x0020AD | R  | H     | westernChar
         0x0020DD | U  | H     | westernChar
         0x0020E1 | R  | H     | westernChar
         0x0020E2 | U  | H     | westernChar
         0x0020E5 | R  | H     | westernChar
         0x002100 | U  | H | V | ideographic
         0x002102 | R  | H     | westernChar
         0x002103 | U  | H | V | postfixedAbbrev
         0x002104 | U  | H | V | ideographic
         0x002109 | U  | H | V | postfixedAbbrev
         0x00210A | R  | H     | westernChar
         0x00210F | U  | H | V | ideographic
         0x002110 | R  | H     | westernChar
         0x002113 | U  | H | V | postfixedAbbrev
         0x002114 | U  | H | V | ideographic
         0x002115 | R  | H     | westernChar
         0x002116 | U  | H | V | prefixedAbbrev
         0x002117 | U  | H | V | ideographic
         0x002118 | R  | H     | westernChar
         0x00211E | U  | H | V | ideographic
         0x002124 | R  | H     | westernChar
         0x002125 | U  | H | V | ideographic
         0x002126 | R  | H     | westernChar
         0x002127 | U  | H | V | ideographic
         0x002128 | R  | H     | westernChar
         0x002129 | U  | H | V | ideographic
         0x00212A | R  | H     | westernChar
         0x00212E | U  | H | V | ideographic
         0x00212F | R  | H     | westernChar
         0x002135 | U  | H | V | ideographic
         0x002140 | R  | H     | westernChar
         0x002145 | U  | H | V | ideographic
         0x00214B | R  | H     | westernChar
         0x00214C | U  | H | V | ideographic
         0x00214E | R  | H     | westernChar
         0x00214F | U  | H | V | ideographic
         0x00218A | R  | H     | westernChar
         0x00218C | U  | H | V | ideographic
         0x002190 | R  | H | V | ideographic
         0x00221E | U  | H | V | ideographic
         0x00221F | R  | H | V | ideographic
         0x002234 | U  | H | V | ideographic
         0x002236 | R  | H | V | ideographic
         0x002300 | U  | H | V | ideographic
         0x002308 | R  | H | V | ideographic
         0x00230C | U  | H | V | ideographic
         0x002320 | R  | H | V | ideographic
         0x002324 | U  | H | V | ideographic
         0x002329 | Tr | H | V | openingBracket_other
         0x00232A | Tr | H | V | closingBracket_other
         0x00232B | U  | H | V | ideographic
         0x00232C | R  | H | V | ideographic
         0x00237D | U  | H | V | ideographic
         0x00239B | R  | H | V | ideographic
         0x0023BE | U  | H | V | ideographic
         0x0023CE | R  | H | V | ideographic
         0x0023CF | U  | H | V | ideographic
         0x0023D0 | R  | H | V | ideographic
         0x0023D1 | U  | H | V | ideographic
         0x0023DC | R  | H | V | ideographic
         0x0023E2 | U  | H | V | ideographic
         0x002423 | R  | H     | westernChar
         0x002424 | U  | H | V | ideographic
         0x002500 | R  | H     | inseparable_emDash
         0x002580 | R  | H     | westernChar
         0x0025A0 | U  | H | V | ideographic
         0x00261A | R  | H | V | ideographic
         0x002620 | U  | H | V | ideographic
         0x002768 | R  | H     | westernChar
         0x002776 | U  | H | V | ideographic
         0x002794 | R  | H | V | ideographic
         0x002800 | R  | H     | westernChar
         0x002900 | R  | H | V | ideographic
         0x002B12 | U  | H | V | ideographic
         0x002B30 | R  | H | V | ideographic
         0x002B50 | U  | H | V | ideographic
         0x002B5A | R  | H | V | ideographic
         0x002BB8 | U  | H | V | ideographic
         0x002BD2 | R  | H | V | ideographic
         0x002BD3 | U  | H | V | ideographic
         0x002BEC | R  | H | V | ideographic
         0x002BF0 | U  | H | V | ideographic
         0x002C00 | R  | H     | westernChar
         0x002E80 | U  | H | V | ideographic
         0x003000 | U  | H | V | fullSpace
         0x003001 | Tu | H | V | comma_ideo
         0x003002 | Tu | H | V | fullStop_ideo
         0x003003 | U  | H | V | ideographic
         0x003005 | U  | H | V | iterationMark
         0x003006 | U  | H | V | ideographic
         0x003008 | Tr | H | V | openingBracket_other
         0x003009 | Tr | H | V | closingBracket_other
         0x00300A | Tr | H | V | openingBracket_other
         0x00300B | Tr | H | V | closingBracket_other
         0x00300C | Tr | H | V | openingBracket_corner
         0x00300D | Tr | H | V | closingBracket_corner
         0x00300E | Tr | H | V | openingBracket_corner
         0x00300F | Tr | H | V | closingBracket_corner
         0x003010 | Tr | H | V | openingBracket_other
         0x003011 | Tr | H | V | closingBracket_other
         0x003012 | U  | H | V | ideographic
         0x003014 | Tr | H | V | openingBracket_other
         0x003015 | Tr | H | V | closingBracket_other
         0x003016 | Tr | H | V | openingBracket_other
         0x003017 | Tr | H | V | closingBracket_other
         0x003018 | Tr | H | V | openingBracket_other
         0x003019 | Tr | H | V | closingBracket_other
         0x00301A | Tr | H | V | openingBracket_corner
         0x00301B | Tr | H | V | closingBracket_corner
         0x00301C | Tr | H | V | hyphen_other
         0x00301D | Tr | H | V | openingBracket_other
         0x00301E | Tr | H | V | closingBracket_other
         0x003020 | U  | H | V | ideographic
         0x003030 | Tr | H | V | ideographic
         0x003031 | U  | H | V | ideographic
         0x003033 | U  | H | V | inseparable_repeatUpper
         0x003034 | U  | H | V | inseparable_repeatVoiceUpper
         0x003035 | U  | H | V | inseparable_repeatLower
         0x003036 | U  | H | V | ideographic
         0x00303B | U  | H | V | iterationMark
         0x00303C | U  | H | V | ideographic
         0x003040 | U  | H | V | hiragana
         0x003041 | Tu | H | V | smallKana
         0x003042 | U  | H | V | hiragana
         0x003043 | Tu | H | V | smallKana
         0x003044 | U  | H | V | hiragana
         0x003045 | Tu | H | V | smallKana
         0x003046 | U  | H | V | hiragana
         0x003047 | Tu | H | V | smallKana
         0x003048 | U  | H | V | hiragana
         0x003049 | Tu | H | V | smallKana
         0x00304A | U  | H | V | hiragana
         0x003063 | Tu | H | V | smallKana
         0x003064 | U  | H | V | hiragana
         0x003083 | Tu | H | V | smallKana
         0x003084 | U  | H | V | hiragana
         0x003085 | Tu | H | V | smallKana
         0x003086 | U  | H | V | hiragana
         0x003087 | Tu | H | V | smallKana
         0x003088 | U  | H | V | hiragana
         0x00308E | Tu | H | V | smallKana
         0x00308F | U  | H | V | hiragana
         0x003095 | Tu | H | V | smallKana
         0x003097 | U  | H | V | hiragana
         0x00309B | Tu | H | V | hiragana
         0x00309D | U  | H | V | iterationMark
         0x00309F | U  | H | V | hiragana
         0x0030A0 | Tr | H | V | hyphen_katakana
         0x0030A1 | Tu | H | V | smallKana
         0x0030A2 | U  | H | V | katakana
         0x0030A3 | Tu | H | V | smallKana
         0x0030A4 | U  | H | V | katakana
         0x0030A5 | Tu | H | V | smallKana
         0x0030A6 | U  | H | V | katakana
         0x0030A7 | Tu | H | V | smallKana
         0x0030A8 | U  | H | V | katakana
         0x0030A9 | Tu | H | V | smallKana
         0x0030AA | U  | H | V | katakana
         0x0030C3 | Tu | H | V | smallKana
         0x0030C4 | U  | H | V | katakana
         0x0030E3 | Tu | H | V | smallKana
         0x0030E4 | U  | H | V | katakana
         0x0030E5 | Tu | H | V | smallKana
         0x0030E6 | U  | H | V | katakana
         0x0030E7 | Tu | H | V | smallKana
         0x0030E8 | U  | H | V | katakana
         0x0030EE | Tu | H | V | smallKana
         0x0030EF | U  | H | V | katakana
         0x0030F5 | Tu | H | V | smallKana
         0x0030F7 | U  | H | V | katakana
         0x0030FB | U  | H | V | middleDot_middlePunctuation
         0x0030FC | Tr | H | V | prolongedSoundMark
         0x0030FD | U  | H | V | iterationMark
         0x0030FF | U  | H | V | katakana
         0x003100 | U  | H | V | ideographic
         0x003127 | Tu | H | V | ideographic
         0x003128 | U  | H | V | ideographic
         0x0031F0 | Tu | H | V | smallKana
         0x003200 | U  | H | V | ideographic
         0x003300 | Tu | H | V | ideographic
         0x003303 | Tu | H | V | postfixedAbbrev
         0x003304 | Tu | H | V | ideographic
         0x00330D | Tu | H | V | postfixedAbbrev
         0x00330E | Tu | H | V | ideographic
         0x003314 | Tu | H | V | postfixedAbbrev
         0x003315 | Tu | H | V | ideographic
         0x003318 | Tu | H | V | postfixedAbbrev
         0x003319 | Tu | H | V | ideographic
         0x003322 | Tu | H | V | postfixedAbbrev
         0x003324 | Tu | H | V | ideographic
         0x003326 | Tu | H | V | postfixedAbbrev
         0x003328 | Tu | H | V | ideographic
         0x00332B | Tu | H | V | postfixedAbbrev
         0x00332C | Tu | H | V | ideographic
         0x003336 | Tu | H | V | postfixedAbbrev
         0x003337 | Tu | H | V | ideographic
         0x00333B | Tu | H | V | postfixedAbbrev
         0x00333C | Tu | H | V | ideographic
         0x003349 | Tu | H | V | postfixedAbbrev
         0x00334B | Tu | H | V | ideographic
         0x00334D | Tu | H | V | postfixedAbbrev
         0x00334E | Tu | H | V | ideographic
         0x003351 | Tu | H | V | postfixedAbbrev
         0x003352 | Tu | H | V | ideographic
         0x003357 | Tu | H | V | postfixedAbbrev
         0x003358 | U  | H | V | ideographic
         0x003371 | U  | H | V | postfixedAbbrev
         0x00337B | Tu | H | V | ideographic
         0x003380 | U  | H | V | postfixedAbbrev
         0x0033E0 | U  | H | V | ideographic
         0x00A4D0 | R  | H     | westernChar
         0x00A960 | U  | H | V | ideographic
         0x00A980 | R  | H     | westernChar
         0x00AC00 | U  | H | V | ideographic
         0x00D800 | R  | H     | westernChar
         0x00E000 | U  | H | V | ideographic
         0x00FB00 | R  | H     | westernChar
         0x00FE10 | U  | H | V | ideographic
         0x00FE17 | U  | H | V | openingBracket_other
         0x00FE18 | U  | H | V | closingBracket_other
         0x00FE19 | U  | H | V | ideographic
         0x00FE20 | R  | H     | westernChar
         0x00FE30 | U  | H | V | inseparable_twoDotLeaderV
         0x00FE31 | U  | H | V | inseparable_emDashV
         0x00FE32 | U  | H | V | hyphen_middlePunctuation
         0x00FE33 | U  | H | V | ideographic
         0x00FE35 | U  | H | V | openingBracket_round
         0x00FE36 | U  | H | V | closingBracket_round
         0x00FE37 | U  | H | V | openingBracket_other
         0x00FE38 | U  | H | V | closingBracket_other
         0x00FE39 | U  | H | V | openingBracket_other
         0x00FE3A | U  | H | V | closingBracket_other
         0x00FE3B | U  | H | V | openingBracket_other
         0x00FE3C | U  | H | V | closingBracket_other
         0x00FE3D | U  | H | V | openingBracket_other
         0x00FE3E | U  | H | V | closingBracket_other
         0x00FE3F | U  | H | V | openingBracket_other
         0x00FE40 | U  | H | V | closingBracket_other
         0x00FE41 | U  | H | V | openingBracket_corner
         0x00FE42 | U  | H | V | closingBracket_corner
         0x00FE43 | U  | H | V | openingBracket_corner
         0x00FE44 | U  | H | V | closingBracket_corner
         0x00FE45 | U  | H | V | ideographic
         0x00FE47 | U  | H | V | openingBracket_other
         0x00FE48 | U  | H | V | closingBracket_other
         0x00FE49 | R  | H     | westernChar
         0x00FE50 | Tu | H | V | ideographic
         0x00FE53 | U  | H | V | ideographic
         0x00FE58 | R  | H | V | ideographic
         0x00FE59 | Tr | H | V | ideographic
         0x00FE5F | U  | H | V | ideographic
         0x00FE63 | R  | H | V | ideographic
         0x00FE67 | U  | H | V | ideographic
         0x00FE70 | R  | H     | westernChar
         0x00FEFF | R  | H | V | unknown
         0x00FF00 | R  | H     | westernChar
         0x00FF01 | Tu | H | V | dividingPunctuation
         0x00FF02 | U  | H | V | ideographic
         0x00FF03 | U  | H | V | prefixedAbbrev
         0x00FF05 | U  | H | V | postfixedAbbrev
         0x00FF06 | U  | H | V | ideographic
         0x00FF08 | Tr | H | V | openingBracket_round
         0x00FF09 | Tr | H | V | closingBracket_round
         0x00FF0A | U  | H | V | ideographic
         0x00FF0C | Tu | H | V | comma_western
         0x00FF0D | R  | H | V | ideographic
         0x00FF0E | Tu | H | V | fullStop_western
         0x00FF0F | U  | H | V | ideographic
         0x00FF1A | Tr | H | V | middleDot_colon
         0x00FF1C | R  | H | V | ideographic
         0x00FF1F | Tu | H | V | dividingPunctuation
         0x00FF20 | U  | H | V | ideographic
         0x00FF3B | Tr | H | V | openingBracket_other
         0x00FF3C | U  | H | V | ideographic
         0x00FF3D | Tr | H | V | closingBracket_other
         0x00FF3E | U  | H | V | ideographic
         0x00FF3F | Tr | H | V | ideographic
         0x00FF40 | U  | H | V | ideographic
         0x00FF5B | Tr | H | V | openingBracket_other
         0x00FF5C | Tr | H | V | ideographic
         0x00FF5D | Tr | H | V | closingBracket_other
         0x00FF5E | Tr | H | V | ideographic
         0x00FF5F | Tr | H | V | openingBracket_round
         0x00FF60 | Tr | H | V | closingBracket_round
         0x00FF61 | R  | H     | westernChar
         0x00FFE0 | U  | H | V | postfixedAbbrev
         0x00FFE1 | U  | H | V | prefixedAbbrev
         0x00FFE2 | U  | H | V | ideographic
         0x00FFE3 | Tr | H | V | ideographic
         0x00FFE4 | U  | H | V | ideographic
         0x00FFE5 | U  | H | V | prefixedAbbrev
         0x00FFE6 | U  | H | V | ideographic
         0x00FFE8 | R  | H     | westernChar
         0x00FFF0 | U  | H | V | ideographic
         0x00FFF9 | R  | H | V | transparent
         0x00FFFC | U  | H | V | inlineObject
         0x00FFFD | U  | H | V | ideographic
         0x00FFFE | R  | H | V | unknown
         0x010000 | R  | H     | westernChar
         0x010980 | U  | H     | westernChar
         0x0109A0 | R  | H     | westernChar
         0x011580 | U  | H     | westernChar
         0x011600 | R  | H     | westernChar
         0x011A00 | U  | H | V | ideographic
         0x011AB0 | R  | H     | westernChar
         0x013000 | U  | H     | westernChar
         0x013430 | R  | H     | westernChar
         0x014400 | U  | H     | westernChar
         0x014680 | R  | H     | westernChar
         0x016FE0 | U  | H | V | ideographic
         0x018B00 | R  | H     | westernChar
         0x01B000 | U  | H | V | katakana
         0x01B001 | U  | H | V | hiragana
         0x01B130 | R  | H     | westernChar
         0x01B170 | U  | H | V | ideographic
         0x01B300 | R  | H     | westernChar
         0x01D000 | U  | H     | westernChar
         0x01D200 | R  | H     | westernChar
         0x01D2E0 | U  | H | V | ideographic
         0x01D300 | U  | H     | westernChar
         0x01D380 | R  | H     | westernChar
         0x01D800 | U  | H     | westernChar
         0x01DAB0 | R  | H     | westernChar
         0x01F000 | U  | H | V | ideographic
         0x01F200 | Tu | H | V | ideographic
         0x01F202 | U  | H | V | ideographic
         0x01F800 | R  | H | V | ideographic
         0x01F900 | U  | H | V | ideographic
         0x01FA70 | R  | H     | westernChar
         0x020000 | U  | H | V | ideographic
         0x02FFFE | R  | H | V | unknown
         0x030000 | U  | H | V | ideographic
         0x03FFFE | R  | H | V | unknown
         0x040000 | R  | H     | westernChar
         0x0F0000 | U  | H | V | ideographic
         0x0FFFFE | R  | H | V | unknown
         0x100000 | U  | H | V | ideographic
         0x10FFFE | R  | H | V | unknown
         0x110000
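
For implementers who want to consume this table, here is a minimal Python sketch of a parser and lookup. It rests on one reading of the data that is an assumption: each row starts a range that extends up to the code point of the next row, and a bare code point (like the final 0x110000) only closes the last range. The excerpt is copied from the table above; `parse` and `spacing_class` are hypothetical names.

```python
from bisect import bisect_right
from typing import NamedTuple

EXCERPT = """\
0x000020 | R  | H     | justifyingSpace
0x000021 | R  | H     | westernChar
0x000080 | R  | H | V | unknown
0x000085 | R  | H | V | lineEdge
0x000086 | R  | H | V | unknown
0x0000A0 | R  | H     | justifyingSpace
0x0000A1 | R  | H     | westernChar
0x0000A7
"""


class Row(NamedTuple):
    orientation: str   # UAX50 vertical orientation
    flags: frozenset   # subset of {"H", "V"}
    cls: str           # class named in column 5
    ambiguous: bool    # trailing "A" marker


def parse(text):
    starts, rows = [], []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if not fields[0]:
            continue
        starts.append(int(fields[0], 16))
        if len(fields) == 1:      # bare code point: end of previous range
            rows.append(None)
            continue
        rest, flags = fields[2:], set()
        while rest and rest[0] in ("H", "V", ""):
            if rest[0]:
                flags.add(rest[0])
            rest = rest[1:]
        rows.append(Row(fields[1], frozenset(flags), rest[0], "A" in rest[1:]))
    return starts, rows


STARTS, ROWS = parse(EXCERPT)


def spacing_class(cp, vertical=False):
    """Class of a code point, or None outside the parsed ranges."""
    i = bisect_right(STARTS, cp) - 1
    if i < 0 or ROWS[i] is None:
        return None
    row = ROWS[i]
    # A blank H (or V) column means "ideographic" in that direction.
    return row.cls if ("V" if vertical else "H") in row.flags else "ideographic"
```

For example, `spacing_class(0x41)` gives westernChar, while `spacing_class(0x41, vertical=True)` gives ideographic because the V column is blank for that range. The "A" marker is parsed but not applied here, since it needs the resolved script.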

===========

Received on Sunday, 18 October 2020 23:23:06 UTC