Re: Comments on "Requirements for Japanese Text Layout" (2)

From: KOBAYASHI Tatsuo(FAMILY Given) <tlk@kobysh.com>
Date: Thu, 23 Dec 2010 12:37:07 +0900
Message-ID: <AANLkTikkdBxJAFv2EfrqnapQ-kWHbmP+NYK-u7L2KPkM@mail.gmail.com>
To: Eric Muller <emuller@adobe.com>
Cc: public-i18n-cjk@w3.org, W3C_J_Layout <member-japanese-layout-ja@w3.org>
Eric, guys,

At last, the Japanese members of JLTF have worked through the second
part of Eric's useful and suggestive comments, as our last job for
this year.

The dispositions follow.
All dispositions will be reflected in the second version of
Requirements for Japanese Text Layout, and the new version will be
released soon to ask for MORE NEW COMMENTS!

And, if you have any questions or comments on the dispositions, do not
hesitate to respond.

Thanks again,
Tatsuo on behalf of members of JLTF Japanese team.


■[X] ■17---Accepted and will be corrected

§3.1.3, item a, note 1.

I am not sure if the term "approximate number" is known to English
speakers that learn Japanese, but it's certainly not known to
"ordinary" English speakers.
Maybe "number range" rather than "approximate number", together with
a translation of the text of figure 67, would help understanding.

[Disposition]Accepted and the text will be modified.

[Old]
(note 1)
In vertical writing mode, ideographic digits used with IDEOGRAPHIC
COMMA "、" to represent an approximate number are expected to be set
solid too (as in the right line in [Fig.67]).
[New]
(note 1)
In vertical writing mode, ideographic digits used with IDEOGRAPHIC
COMMA "、" to represent an approximate number range are expected to be
set solid too (as in the right line in [Fig.67]).

[Rationale]
As Eric mentions, the wording "approximate number" is not
understandable to a layman; however, "number range" is not appropriate
for the expression either. For example, “三、四十” does not mean exactly
from thirty to forty, but can include twenty-eight or twenty-nine.
So we decided to use the wording "approximate number range". If you
have more appropriate wording, please let us know.


■[×] ■18--- Accepted and the text will be modified and amended.

§3.1.1: the special treatment is mentioned only for vertical mode.
It seems that it could apply just as well in horizontal text.
It may be worth rephrasing the first paragraph:

[Current]

The space usually added after IDEOGRAPHIC COMMA "、" and the space
before and after KATAKANA MIDDLE DOT "・" are omitted, in principle,
for cosmetic reasons in the following cases.

to

[New]
The space usually added after IDEOGRAPHIC COMMA "、" and the space
before and after KATAKANA MIDDLE DOT "・" are generally omitted in some
cases, in vertical (but not in horizontal) text:

Also, "3.1.3 b" will be amended as follows:

[Current]
Ideographic digits and KATAKANA MIDDLE DOT "・" representing a decimal
point are set solid (as in the right line in [Fig.68]). In vertical
writing mode, when KATAKANA MIDDLE DOT "・" is used as a member of unit
symbols (cl-25) in unit symbols, grouped numerals (cl-24), and Western
characters (cl-27) in mathematical and chemical formulae, before and
after KATAKANA MIDDLE DOT "・" is set solid.

[New]
In vertical writing mode, Ideographic digits and KATAKANA MIDDLE DOT
"・" representing a decimal point are set solid (as in the right line
in [Fig.68]).
(note)
In horizontal writing mode, when KATAKANA MIDDLE DOT "・" is used as a
member of unit symbols (cl-25) in unit symbols, grouped numerals
(cl-24), and Western characters (cl-27) in mathematical and chemical
formulae, before and after KATAKANA MIDDLE DOT "・" is set solid.


■[X] ■19---Accepted and will be corrected in accordance with Japanese version.

§3.7.4, note 1: "Also, the handling of inter-character spaces between
these math symbols and adjacent characters is described in Appendix A
Character Classes as a complete table, in accordance with the concept
of character class, described in 3.9 About Character Classes."

First, Appendix A is only a description of the character classes, so
it seems that the reference was meant to be to Appendix B, "Spacing
between characters".
Second, table 2 does not include classes 17 and 18 at all.
Similarly, tables 4, 5, 6 and 7, do not include those classes.
In fact, there does not seem to be any description of the
inter-character spaces between math and other characters outside
3.7.4, so the note is misleading.

[Disposition]
The latter part of the text is misleading, so it will be deleted in
accordance with the Japanese version.

[Current text]
(note 1)
The members of the math symbols (cl-17) and math operators (cl-18)
classes are described in 3.9 About Character Classes. Also, the
handling of inter-character spaces between these math symbols and
adjacent characters is described in Appendix A Character Classes as a
complete table, in accordance with the concept of character class,
described in 3.9 About Character Classes.
[New text]
(note 1) The members of the math symbols(cl-17) and math
operators(cl-18) classes are described in 3.9 About Character Classes.


■[X] ■20---Accepted and will be corrected

§3.7.4, item b., note 1.

It seems that the reference should be to figure 175 rather than 174.

[Disposition]
The figure number will be corrected.
[Current text]
(see [Fig.174])

[New text]
(see [Fig.176])

[Note]
The figure numbers of version 2 will be changed from current version.
The consistency of all figure numbers will be re-checked before the
publication.


■[×] ■21---Accepted and will be amended.

In Table 2 (Spacing between characters), the entry for cl-02 closing
brackets / line end refers to note 2, which says: The preferred
spacing between closing brackets (cl-02) and the line end is a half
em.

The alternative is to set solid (see 3.1.9 Positioning of Closing
Brackets, Full Stops, Commas and Middle Dots at Line End).

The corresponding entries in tables 4, 5, 6 are "1/2=0", "<blank>" and
"1/2=0" respectively.

To facilitate the correlation with tables 4, 5 and 6, it may be worth
rephrasing note 2, replacing "The alternative is" by "An alternative,
used in JIS X 4051, is".

[Disposition]
The text will be amended to refer explicitly to JIS X 4051.

[Current Text] Appendix B.2, note 2.
The preferred spacing between closing brackets (cl-02) and the line
end is a half em. The alternative is to set solid (see 3.1.9
Positioning of Closing Brackets, Full Stops, Commas and Middle Dots at
Line End).

[New Text]
The preferred spacing between closing brackets (cl-02) and the line
end is a half em. The alternative is to set solid (see 3.1.9
Positioning of Closing Brackets, Full Stops, Commas and Middle Dots at
Line End). JIS X 4051 adopts this alternative.

Also, note 4 and note 6 will be amended as follows:

[Current Text]
4. The preferred spacing between middle dots (cl-05) and the line end
is a quarter em. The alternative is to set solid.
<snip>
6. The preferred spacing between full stops (cl-06) or commas (cl-07)
and the line end is a half em. The alternative is to set solid (see
3.1.9 Positioning of Closing Brackets, Full Stops, Commas and Middle
Dots at Line End).
[New text]
4. The preferred spacing between middle dots (cl-05) and the line end
is a quarter em. The alternative is to set solid.
<snip>
6. The preferred spacing between full stops (cl-06) or commas (cl-07)
and the line end is a half em. The alternative is to set solid (see
3.1.9 Positioning of Closing Brackets, Full Stops, Commas and Middle
Dots at Line End). JIS X 4051 specifies a half em between full stops
(cl-06) and the line end, and sets commas (cl-07) solid at the line
end.
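
The line-end spacing described in the amended notes 2, 4 and 6 can be summarised as a small lookup. The following is only a sketch for illustration: the function name, the profile names and the dictionary layout are assumptions made here, not part of the requirements document or of JIS X 4051. Only the values stated in the notes above are encoded; the notes do not say what JIS X 4051 does for middle dots (cl-05), so that case is left undefined.

```python
from fractions import Fraction

# Preferred space (in em) between a character class and the line end,
# as stated in the amended notes 2, 4 and 6.
PREFERRED = {
    "cl-02": Fraction(1, 2),  # closing brackets
    "cl-05": Fraction(1, 4),  # middle dots
    "cl-06": Fraction(1, 2),  # full stops
    "cl-07": Fraction(1, 2),  # commas
}

# What the amended notes say JIS X 4051 adopts. cl-05 is deliberately
# absent because the notes make no statement about it.
JIS_X_4051 = {
    "cl-02": Fraction(0),     # set solid
    "cl-06": Fraction(1, 2),  # half em
    "cl-07": Fraction(0),     # set solid
}


def line_end_space(char_class, profile="preferred"):
    """Return the line-end space in em, or None if the notes do not say.

    `profile` is a hypothetical switch: "preferred" or "jis_x_4051".
    """
    if char_class not in PREFERRED:
        raise KeyError(f"no line-end note for {char_class}")
    if profile == "preferred":
        return PREFERRED[char_class]
    if profile == "jis_x_4051":
        return JIS_X_4051.get(char_class)
    raise ValueError(f"unknown profile: {profile}")
```

For example, `line_end_space("cl-07")` gives a half em, while `line_end_space("cl-07", "jis_x_4051")` gives zero, which is the difference between commas and full stops that note 6 now spells out.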


■[×] ■22---Partially accepted and will be modified.

Similarly for cl-05 middle dot / line end and note 4:

The preferred spacing between middle dots (cl-05) and the line end is
a quarter em.

The alternative is to set solid.

The corresponding entries in table 4, 5, 6 are "note 5", "<blank>" and
"<blank>" respectively.

To facilitate the correlation with tables 4, 5, 6, it may be worth
rephrasing note 4, replacing "The alternative is" by "An alternative,
used in JIS X 4051 and in books is".

Furthermore, note 5 in table 4 is:

Table 4, and only Table 4, allows the preceding and trailing
conditional quarter em space accompanying middle dots (cl-05) to be
reduced to leave no space.

The priority order is the third.

Wouldn't it be simpler to have "1/4-0" in table 4, and not have note 5?

This would give the same organization as for cl-02/line end.

[Disposition]
Annex B B.2 note 2 will be modified as mentioned in ■21.
Annex D D.2 note 5 will not be changed.

[Rationale]
As for middle dots (cl-05), the spaces preceding and trailing the
middle dot will be reduced evenly together.


■[×] ■23--- Accepted and modified.

Similarly for cl-06, cl-07 / line end and note 6:

-> An alternative, used in JIS X 4051, is to use a half em after a
full stop and to set commas solid.

[Disposition]
Accepted and will be modified as mentioned in ■21.


■[×] ■24--- Not accepted

Table 2, note 10:

When two adjacent characters belong to the same simple-ruby character
complex (cl-22) run, set them according to the method explained in
3.3.5 Positioning of Mono-ruby with Respect to Base Characters.

When two adjacent characters belong to two distinct simple-ruby
character complex runs, set them solid.

If two adjacent characters belong to the same simple-ruby character
complex, doesn't that imply a group ruby, and therefore the pointer
should be to 3.3.6?

[Disposition]
Not accepted.
[Rationale]
Even when simple-ruby groups (each a group of base characters and ruby
letters) are adjacent, each simple-ruby group should be treated as one
object. Accordingly, "3.3.5 Positioning of Mono-ruby with Respect to
Base Characters" shall be referred to.



■[×] ■25--- Accepted and will be modified.

Table 4, 5, 6, note 1, called for cl-05/cl-05.

The note gives the behavior for tables 4 and 5, but not for table 6.
I suppose that the "Tables 4 and 5 allow..." should be "Tables 4, 5
and 6 allow..."

Same thing for note 2.

Furthermore, "Tables 4 and 5 allow the quarter em space accompanying
the trailing middle dot (cl-05) to be reduced, to leave no space as a
minimum" can be a bit misleading, as the half em on the full stop
remains.

Maybe "Tables 4, 5 and 6 allow the quarter em space accompanying the
trailing middle dot (cl-05) to be reduced to zero, leaving only the
half em space accompanying the leading full stop (cl-06)."

Note 3: the behavior of table 4 and 5 is not specified.

I suppose that in table 4, both the half em after the comma and the
quarter em before the middle dot can be reduced to 0.

Table 5 being about JIS, the space is only a quarter em before the
middle dot, and I suppose this can be reduced to 0.

[Disposition]
Accepted and the text of D.2 note 3 will be modified as follows:

[Current Text]
The default unadjusted space when a comma (cl-07) is followed by a
middle dot (cl-05), is the sum of the conditional half em space
accompanying the preceding comma (cl-07) and the conditional quarter
em space accompanying the trailing middle dot (cl-05). Table 6 allows
the conditional half em space accompanying the trailing comma (cl-07)
to be reduced, to leave a quarter em space as a minimum. The priority
order in space reduction for the conditional space accompanying middle
dots (cl-05) is the fourth in Table 4, and it is the second priority
in Table 5. The priority order in space reduction for the conditional
space accompanying commas (cl-07) is the fifth in Table 4, and it is
the third priority in Table 5 and 6.

[New text]
The default unadjusted space when a comma (cl-07) is followed by a
middle dot (cl-05), is the sum of the conditional half em space
accompanying the preceding comma (cl-07) and the conditional quarter
em space accompanying the trailing middle dot (cl-05). Note that in
Table 4, the size of the space of the trailing middle dot (cl-05) is
a quarter em. In Table 3, the half em space of the preceding comma
(cl-07) and the quarter em space of the trailing middle dot (cl-05)
can be reduced to solid. In Table 4, the quarter em space of the
trailing middle dot (cl-05) can be reduced to solid. In Table 5, the
half em space of the preceding comma (cl-07) can be reduced to a
quarter em as a minimum. The priority order in space reduction for the
conditional space accompanying middle dots (cl-05) is the fourth in
Table 3, and it is the second priority in Table 4. The priority order
in space reduction for the conditional space accompanying commas
(cl-07) is the fifth in Table 3, and it is the third priority in
Table 5.

[Note]
The table numbers of version 2 are smaller than those of version 1 by one.


■[X] ■26---Accepted and will be amended.

Appendix E.

I understand that to achieve a given line width by expansion:
- the western word spaces are enlarged up to 1/2 em
- then the 2nd step 1/4 em spaces are expanded to 1/2 em
- then the 3rd step 0 em spaces are expanded to 1/4 em
- then the 4th step 0 em spaces are expanded

What is not clear is whether there is a limit to the expansion of the
4th step 0 em spaces, and if so, what is expanded after this limit is
reached.

Maybe an example will help: consider <ideo, ideo, comma, ideo, ideo>.

That's normally 5em; how is the space distributed if that line is
justified to a width of 10em?

[Disposition]
Accepted and text of E.1 6.2.b.i will be amended as follows:

[Current Text]
Blank: Inter-character space expansion is not allowed because there is
no line break opportunity between the given combination of characters.

[New Text]
Blank: Inter-character space expansion is not allowed because there is
no line break opportunity between the given combination of characters.
When the 4th step is needed, add the same amount of space to the
spaces of the 1st, 2nd, 3rd and 4th steps.
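
One possible reading of the expansion steps and of the amended sentence is sketched below. It is an interpretation under stated assumptions, not a definitive algorithm: the function name `expand`, the representation of a line as (step, width) pairs, and the choice to grow spaces of one step proportionally to their remaining room are all assumptions made for this illustration. The step limits are the ones summarised in the comment above (1/2 em, 1/2 em, 1/4 em); a step of `None` stands for a "Blank" cell where no expansion is allowed.

```python
from fractions import Fraction

# Upper limits (in em) for steps 1 to 3, taken from the summary in the
# comment above. Step 4 has no limit in this sketch.
STEP_LIMIT = {1: Fraction(1, 2), 2: Fraction(1, 2), 3: Fraction(1, 4)}


def expand(gaps, shortfall):
    """Distribute `shortfall` em over the inter-character gaps of a line.

    gaps: list of (step, width) pairs; step is 1..4, or None when no
    expansion is allowed between the two characters.
    Returns the list of new gap widths.
    """
    widths = [Fraction(w) for _, w in gaps]
    remaining = Fraction(shortfall)
    for step in (1, 2, 3):
        # Room left before each gap of this step reaches its limit.
        room = {i: max(STEP_LIMIT[step] - widths[i], Fraction(0))
                for i, (s, _) in enumerate(gaps) if s == step}
        total = sum(room.values(), Fraction(0))
        if total == 0:
            continue
        # Assumption: gaps of one step grow in proportion to their room.
        ratio = min(Fraction(1), remaining / total)
        for i, r in room.items():
            widths[i] += r * ratio
        remaining -= total * ratio
        if remaining == 0:
            return widths
    # 4th step needed: add the same amount to every expandable gap,
    # i.e. to the gaps of the 1st, 2nd, 3rd and 4th steps alike.
    expandable = [i for i, (s, _) in enumerate(gaps) if s is not None]
    if expandable:
        share = remaining / len(expandable)
        for i in expandable:
            widths[i] += share
    return widths
```

With a hypothetical line whose gaps are `[(1, 1/4), (2, 1/4), (3, 0), (4, 0), (None, 0)]` and a shortfall of 2 em, steps 1 to 3 absorb 3/4 em and the remaining 5/4 em is shared equally by the four expandable gaps, so under this reading there is no separate limit on the 4th step.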



■[×] ■27--- Not accepted

The class cl-08 inseparable characters is made of

U+2014 — EM DASH
U+2026 … HORIZONTAL ELLIPSIS
U+2025 ‥ TWO DOT LEADER
U+3033 〳 VERTICAL KANA REPEAT MARK UPPER HALF
U+3034 〴 VERTICAL KANA REPEAT WITH VOICED SOUND MARK UPPER HALF
U+3035 〵 VERTICAL KANA REPEAT MARK LOWER HALF

The treatment of those characters in line breaking is described by
note 5 on table 3:

There is no line break opportunity between consecutive inseparable
characters (cl-08) of the same kind.

If two consecutive inseparable characters (cl-08) are of different
kinds, a line break opportunity exists between them.

For example, a line shall not be broken between two consecutive EM
DASH "—" EM DASH "—" followed by HORIZONTAL ELLIPSIS "…".

and the treatment of those characters in space expansion is described
by note 4 on table 7:

A third order opportunity exists for inter-character space expansion,
to take up to a maximum of a quarter em space, with respect to the
corresponding character size, between two consecutive inseparable
characters (cl-08) which are of different kinds.

It seems to me that the intent is that the sequences of characters:
- multiple U+2014
- multiple U+2015
- multiple U+2016
- <3033, 3035>
- <3034, 3035>
be treated as if they were a single character (i.e. no linebreak or
character expansion in them).

― U+2015
‖ U+2016

The situation then looks a lot more similar to that of the class "unit symbols".

I would suggest to redefine class 08 along those lines, i.e. as a
class of sequence of characters.

Also, I am wondering if the last two sequences should not be treated
as cl-09 rather than cl-08.

[Disposition]
Not accepted.

[Rationale]
As for U+2015, it is mentioned in the note on U+2014 in A.8 cl-08.
As for U+2016, it is not stably treated as inseparable character.

As for U+3033, U+3034 and U+3035: U+3035 must follow U+3033 or U+3034
without space, so these characters should be classified as inseparable
characters.

Note that these characters are used as iteration marks not for a
character but for a couple of characters. The issue will be noted for
consideration in future versions together with the treatment of U+3031
and U+3032.

As for other similar symbols, there is no stable tradition to treat
them as inseparable characters.
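
The line-break rule of note 5 on table 3, as quoted in the comment above, can be written as a small predicate. This is a minimal sketch: the function name is hypothetical, and reading "of the same kind" as "the same code point" is an assumption made here, not something the note states.

```python
# Members of cl-08 (inseparable characters) as listed in the comment.
CL08 = {
    "\u2014",  # EM DASH
    "\u2026",  # HORIZONTAL ELLIPSIS
    "\u2025",  # TWO DOT LEADER
    "\u3033",  # VERTICAL KANA REPEAT MARK UPPER HALF
    "\u3034",  # VERTICAL KANA REPEAT WITH VOICED SOUND MARK UPPER HALF
    "\u3035",  # VERTICAL KANA REPEAT MARK LOWER HALF
}


def break_allowed_between_cl08(first, second):
    """Literal reading of note 5: two consecutive cl-08 characters allow
    a line break between them only when they are of different kinds.

    Assumption: "same kind" means "same code point".
    """
    if first not in CL08 or second not in CL08:
        raise ValueError("both characters must belong to cl-08")
    return first != second
```

Under this literal reading, two EM DASHes never split, an EM DASH followed by a HORIZONTAL ELLIPSIS may split, and U+3033 followed by U+3035 also counts as "different kinds", which is the case the comment questions and the rationale notes for future consideration.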

■[X] ■28---Accepted and will be corrected

Table 7, note 10: "A fourth order opportunity exists for
inter-character space expansion...".

In the table itself, the color of the cell that refers to note 10 is
blue, i.e, third order.

Which is right? the note text or the cell color?

[Disposition]
Accepted and corrected.

[Current Text]
A fourth order opportunity exists for inter-character space expansion
between a preceding Western character (cl-27) and a trailing postfixed
abbreviation (cl-13), unless the preceding Western character (cl-27)
is used as a symbol of a quantity or a European numeral, in which case
no inter-character space expansion is allowed between them.

[New Text]
A third order opportunity exists for inter-character space expansion
between a preceding Western character (cl-27) and a trailing postfixed
abbreviation (cl-13), unless the preceding Western character (cl-27)
is used as a symbol of a quantity or a European numeral, in which case
no inter-character space expansion is allowed between them.

■[×] ■29--- Accepted and corrected.

§3.8.4, items c and d, both refer to "bunrikinshi", but this term is
not defined and is not used anywhere else.

Appendix G has "bunri kinshi", but this is the only occurrence, and it
is also weakly defined.

It seems that the only definition is provided by Table 7.

[Disposition]
There is a description of "bunri kinshi" in the terminology section, so
it will be corrected in accordance with the terminology section.

[Current]
bunrikinshi

[New]
bunri kinshi

■[×] ■30--- Accepted and amended.

§3.9.2, Characters as reference marks (cl-20):

Characters which are inside verification seal (those are characters
inside a verification seal that appear in the line just after the item
applicable for reference marks of notes)

There is no other occurrence of the words "verification" and "seal",
which makes it a bit difficult to understand this text (what's a
"verification seal"? how does it relate to reference marks of
notes?).

Also, a reference to §3.1.9, item j, may help.

[Disposition]
Accepted and the text will be amended as follows:

[Current Text]
Characters as reference marks (cl-20)
Characters which are inside verification seal (those are characters
inside a verification seal that appear in the line just after the item
applicable for reference marks of notes).
[New Text]
Characters as reference marks (cl-20)
Characters which are inside verification seal (those are characters
inside a verification seal that appear in the line just after the item
applicable for reference marks of notes). Refer to "4.2.2 Note
Numbers".
("4.2.2 Note Numbers" is referenced because its explanation is more detailed.)

■[×] ■31--- Accepted and the title will be modified.

Table 5, the legend has "(The example that it followed the regulation
of JIS X 4051)" -> "(according to JIS X 4051)"

[Disposition]
Accepted and the title of Table 5 will be modified.

[Current title]
Table 5 Positions which allow for line adjustment by interletter
space-reduction (The example that it followed the regulation of JIS X
4051)
[New title]
Table 5 Positions which allow for line adjustment by interletter
space-reduction (The example according to JIS X 4051)

■[×] ■32--- Accepted and the title will be modified.

Table 6, the legend has "(The example being done in the book and so
on)" -> "(often used in book and similar materials)"

[Disposition]
Accepted and the title of Table 6 will be modified.

[Current title]
Table 6 Positions which allow for line adjustment by interletter
space-reduction (The example being done in the book and so on)

[New title]
Table 6 Positions which allow for line adjustment by interletter
space-reduction (The example being done in the book and similar
materials)







-- 
KOBAYASHI Tatsuo(小林龍生)
Scholex Co., Ltd. Yokohama
homepage) http://www.kobysh.com/tlk/
Received on Thursday, 23 December 2010 03:37:42 UTC
