- From: Eric Muller <emuller@adobe.com>
- Date: Wed, 18 Jan 2012 10:54:00 -0800
- To: <public-i18n-cjk@w3.org>
Sorry for the late answer.
On 1/12/2012 5:03 PM, Leif Halvard Silli wrote:
>> Both
>>
>> <ruby><rb>東</rb><rt>とう</rt><rb>京</rb><rt>きょう</rt></ruby> (may
>> be with a different interleaving of rbs and rts)
>>
>> and
>>
>> <ruby>東<rt>とう</rt>京<rt>きょう</rt></ruby>
>>
>> capture the list of pairs {東, とう}, {京, きょう} equally well.
>
> Why is *any* of the two examples above any better than
> this:
>
> <ruby><rb>東</rb><rt>とう</rt></ruby><ruby><rb>京</rb><rt>きょう
> </rt></ruby>
In jukugo ruby, the ruby text of one pair is allowed to be displayed
overhanging an adjacent base. (The constraint that must be respected -
and that makes it different from a big group ruby - is that for each
pair, some of the ruby text must be above its base - typically, one ruby
character's worth.)
Jukugo is used when the base texts form a compound, in which case the
partial confusion (about which ruby is for which base) is deemed
acceptable. In exchange, it allows for text that stays more on the grid.
For example, if you have a 3 kanji compound, 1 kana ruby on the first, 3
kana ruby on the second, and 2 kana ruby on the third, the leftmost of
the 3 kana overhangs the first kanji, and you end up with a nice three-em
fragment, on the grid; the ruby does not cause the line to be set
differently than without the ruby.
When you have two adjacent base texts each with ruby but those two base
texts do not form a compound, then the ruby of one cannot overhang the
base text of the other. Using the same example of 1, 3, 2 kana ruby, but
considering it as three separate rubies in succession, the 3 kana of the
middle one cannot overhang on either side, so that middle part will have
to be 1.5 em, and some of the line is therefore no longer on the grid.
The layout is not as nice, but now it's completely unambiguous as to
which ruby goes with each base.
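The arithmetic behind the 1, 3, 2 example can be sketched as below. This is only a rough model, not a layout algorithm: it assumes each base kanji is 1 em wide and each ruby kana is set at half size (0.5 em), and the function names are made up for illustration. The jukugo case is simplified to "the compound is as wide as the larger of its total base and its total ruby", which ignores the per-pair overlap constraint mentioned earlier.

```python
RUBY_EM = 0.5  # assumption: ruby kana are half the size of the base text
BASE_EM = 1.0  # assumption: each base kanji occupies one em


def separate_widths(kana_counts):
    """Width in em of each pair when every pair is its own ruby (no overhang)."""
    return [max(BASE_EM, n * RUBY_EM) for n in kana_counts]


def jukugo_width(kana_counts):
    """Width in em of the whole compound when ruby may overhang adjacent bases.

    Simplified: the compound is as wide as the larger of total base and total ruby.
    """
    total_base = len(kana_counts) * BASE_EM
    total_ruby = sum(kana_counts) * RUBY_EM
    return max(total_base, total_ruby)


counts = [1, 3, 2]  # kana per kanji, as in the example above
print(separate_widths(counts), sum(separate_widths(counts)))
print(jukugo_width(counts))
```

Under these assumptions the three separate rubies come out at 1, 1.5 and 1 em (3.5 em in total, so off the grid), while the jukugo treatment stays at 3 em.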
Given the same characters, and in fact the same pairs, the decision to
treat those pairs as jukugo or not is based on the semantics of the
text. It seemed obvious to me that using a single <ruby> vs multiple
<ruby> was the only way to go, but you are right that I did not make
that clear.
Note that this is not entirely different from the underline problem
which was discussed on www-style not too long ago: <u>A</u><u>BC</u> is
considered distinct from <u>ABC</u> (and a fortiori from three
successive <u>), especially in the CJK world. The underline is used on
names, and reflecting the parts of the name, as in (A)(BC), is deemed
important.
Eric.
Received on Wednesday, 18 January 2012 18:54:34 UTC