- From: Eric Muller <emuller@adobe.com>
- Date: Wed, 18 Jan 2012 10:54:00 -0800
- To: <public-i18n-cjk@w3.org>
Sorry for the late answer.

On 1/12/2012 5:03 PM, Leif Halvard Silli wrote:
>> Both
>>
>> <ruby><rb>東</rb><rt>とう</rt><rb>京</rb><rt>きょう</rt></ruby> (may
>> be with a different interleaving of rbs and rts)
>>
>> and
>>
>> <ruby>東<rt>とう</rt>京<rt>きょう</rt></ruby>
>>
>> capture the list of pairs {東, とう}, {京, きょう} equally well.
>
> Why is *any* of the two examples above any better than
> this:
>
> <ruby><rb>東</rb><rt>とう</rt></ruby><ruby><rb>京</rb><rt>きょう</rt></ruby>

In jukugo ruby, the ruby text of one pair is allowed to be displayed overhanging an adjacent base. (The constraint that must be respected, and that makes it different from a big group ruby, is that for each pair, some of the ruby text must be above its base: typically, 1 ruby character's worth.)

Jukugo is used when the base texts form a compound, in which case the partial confusion (about which ruby is for which base) is deemed acceptable. In exchange, it allows for text that stays more on the grid. For example, if you have a 3 kanji compound, with 1 kana of ruby on the first, 3 kana on the second, and 2 kana on the third, the leftmost of the 3 kana overhangs the first kanji, and you end up with a nice three em fragment, on the grid; the ruby does not cause the line to be set differently than without the ruby.

When you have two adjacent base texts each with ruby, but those two base texts do not form a compound, then the ruby of one cannot overhang the base text of the other. Using the same example of 1, 3, 2 kana ruby, but considering it as three separate ruby in succession, the 3 kana of the middle one cannot overhang on either side, so that middle part will have to be 1.5 em, and some of the line is therefore no longer on the grid. The layout is not as nice, but now it's completely unambiguous as to which ruby goes with each base.

Given the same characters, and in fact the same pairs, the decision to treat those pairs as jukugo or not is based on the semantics of the text.
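
To make the 1, 3, 2 arithmetic above concrete, here is a rough sketch in Python. It assumes ruby characters are set at half the size of the base characters (so 1 kana is 0.5 em), which is what the 1.5 em figure implies. The function names are only illustrative, and the jukugo model is deliberately simplified: it lets ruby spread freely over the whole compound and does not check the rule that each ruby text must stay partly above its own base.

```python
# Assumption: ruby glyphs are half the size of base glyphs.
RUBY_SCALE = 0.5
BASE_EM = 1.0


def separate_widths(kana_counts):
    """Width in em of each pair when ruby may not overhang a neighbouring base."""
    return [max(BASE_EM, n * RUBY_SCALE) for n in kana_counts]


def jukugo_width(kana_counts):
    """Width in em of the whole compound when ruby may overhang adjacent bases."""
    base_total = BASE_EM * len(kana_counts)
    ruby_total = RUBY_SCALE * sum(kana_counts)
    return max(base_total, ruby_total)


counts = [1, 3, 2]  # kana per kanji, as in the example above
print(separate_widths(counts), sum(separate_widths(counts)))  # [1.0, 1.5, 1.0] 3.5
print(jukugo_width(counts))  # 3.0
```

Under these assumptions the three separate ruby take 3.5 em, while the jukugo treatment takes 3 em, the same as the bare kanji.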
It seemed obvious to me that using a single <ruby> vs multiple <ruby> was the only way to go, but you are right that I did not make that clear.

Note that this is not entirely different from the underline problem which was discussed on www-style not too long ago: <u>A</u><u>BC</u> is considered distinct from <u>ABC</u> (and a fortiori from three successive <u>), especially in the CJK world. The underline is used on names, and reflecting the parts of the name, as in (A)(BC), is deemed important.

Eric.
Received on Wednesday, 18 January 2012 18:54:34 UTC