- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Fri, 13 Jan 2012 02:03:39 +0100
- To: public-i18n-cjk@w3.org
Eric Muller, Thu, 12 Jan 2012 12:46:50 -0800:
> Even in the case of c., the issue from the point of view of document
> content (i.e. ignoring for one second the application of styling), is to
> represent a list of pairs {base text, ruby text}. Both
>
> <ruby><rb>東</rb><rt>とう</rt><rb>京</rb><rt>きょう</rt></ruby> (may
> be with a different interleaving of rbs and rts)
>
> and
>
> <ruby>東<rt>とう</rt>京<rt>きょう</rt></ruby>
>
> capture the list of pairs {東, とう}, {京, きょう} equally well.
Eric: With regard to your two examples above, *neither* of them is
part of the XHTML Ruby Module, AFAICT. See
<http://www.w3.org/TR/ruby/>. Therefore, I have a single question about
both of them: Why is *either* of the two examples above any better than
this:
<ruby><rb>東</rb><rt>とう</rt></ruby><ruby><rb>京</rb><rt>きょう</rt></ruby>
or this:
<ruby>東<rt>とう</rt></ruby><ruby>京<rt>きょう</rt></ruby>
As far as I can tell, these two variants capture the list of pairs
equally well, don't they? What is the purpose - the progress - of
using a single <ruby> rather than two <ruby>s?
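To make the "list of pairs" point concrete, here is a rough sketch in
Python (standard library only). The class RubyPairs and the function
ruby_pairs are names I made up for illustration; the sketch assumes the
interleaved model (each base followed by its <rt>) and simply ignores
<rb> and <rp> tags. It does not handle <rbc>/<rtc> grouping.

```python
from html.parser import HTMLParser


class RubyPairs(HTMLParser):
    """Collect (base text, ruby text) pairs from interleaved ruby markup."""

    def __init__(self):
        super().__init__()
        self.pairs = []       # finished (base, ruby text) pairs
        self._base = []       # text seen since the last </rt>
        self._text = []       # text seen inside the current <rt>
        self._in_rt = False
        self._in_rp = False

    def handle_starttag(self, tag, attrs):
        if tag == "rt":
            self._in_rt = True
        elif tag == "rp":
            self._in_rp = True
        # <ruby> and <rb> need no special handling here.

    def handle_endtag(self, tag):
        if tag == "rt":
            # An </rt> closes one {base text, ruby text} pair.
            self._in_rt = False
            base = "".join(self._base).strip()
            text = "".join(self._text).strip()
            self.pairs.append((base, text))
            self._base, self._text = [], []
        elif tag == "rp":
            self._in_rp = False

    def handle_data(self, data):
        if self._in_rp:
            return  # fallback parentheses are not content
        (self._text if self._in_rt else self._base).append(data)


def ruby_pairs(markup):
    parser = RubyPairs()
    parser.feed(markup)
    parser.close()
    return parser.pairs


variants = [
    "<ruby><rb>東</rb><rt>とう</rt><rb>京</rb><rt>きょう</rt></ruby>",
    "<ruby>東<rt>とう</rt>京<rt>きょう</rt></ruby>",
    "<ruby><rb>東</rb><rt>とう</rt></ruby><ruby><rb>京</rb><rt>きょう</rt></ruby>",
    "<ruby>東<rt>とう</rt></ruby><ruby>京<rt>きょう</rt></ruby>",
]
for markup in variants:
    print(ruby_pairs(markup))
```

All four variants give the same list of two pairs, which is why I ask
what the single <ruby> buys us as far as the content is concerned.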
You see, this is - I believe - where HTML5 deviates from the XHTML Ruby
module. The omission of <rb> is already catered for in the XHTML Ruby
module: It is considered non-conforming, but the module describes how to
handle it. For instance, the XHTML Ruby module has this 'simple ruby'
example <http://www.w3.org/TR/ruby/#simple-ruby1>:
<ruby>
<rb>WWW</rb>
<rp>(</rp><rt>World Wide Web</rt><rp>)</rp>
</ruby>
Per the HTML5 model, one could do this:
<ruby>
<rb>W</rb>
<rp>(</rp><rt>World</rt><rp>)</rp>
<rb>W</rb>
<rp>(</rp><rt>Wide</rt><rp>)</rp>
<rb>W</rb>
<rp>(</rp><rt>Web</rt><rp>)</rp>
</ruby>
Which in a text browser would look like this:
W (World) W (Wide) W (Web)
Is this of any use? Instead, per the XHTML Ruby module's complex
markup, one could do this:
<ruby>
<rbc>
<rb>W</rb><rb>W</rb><rb>W</rb>
</rbc>
<rtc>
<rt>World</rt><rt>Wide</rt><rt>Web</rt>
</rtc>
</ruby>
Which a text browser could render as:
WWW World Wide Web
If I understood Richard's Wiki page correctly (I'm assuming there was a
typo - see my previous reply), then it suggested this option:
<ruby>
<rb>W</rb><rb>W</rb><rb>W</rb>
<rp>(</rp><rt>World</rt><rt>Wide</rt><rt>Web</rt><rp>)</rp>
</ruby>
Which in a text browser could look like this:
WWW (World Wide Web)
> Both approaches work, but requiring <rb> makes it slightly easier to
> manipulate documents; to access a base text, one can simply grab the
> <rb> element, instead of grabbing all the elements other than <rt>. (In
> XSLT, group-adjacent="if (self:rt) then 'rt' else 'basetext'" does the
> trick, but works only in a for-each-group if I am not mistaken, not on
> direct access to the nth base text).
Agreed!
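A rough Python sketch of the difference, using xml.etree.ElementTree
from the standard library on well-formed markup. The two function names
are hypothetical, made up for this illustration. With <rb>, the nth
base text is a direct lookup; without it, one has to walk the children
and collect everything that is not <rt> (or <rp>):

```python
import xml.etree.ElementTree as ET


def nth_base_with_rb(ruby, n):
    # With mandatory <rb>: just grab the nth <rb> element.
    return ruby.findall("rb")[n].text


def nth_base_without_rb(ruby, n):
    # Without <rb>: base text is whatever precedes each <rt>,
    # so it must be reassembled from text nodes and other children.
    runs = []
    current = ruby.text or ""
    for child in ruby:
        if child.tag == "rt":
            runs.append(current.strip())
            current = child.tail or ""
        elif child.tag == "rp":
            # Skip the fallback parenthesis, keep the text after it.
            current += child.tail or ""
        else:
            # Any other element (e.g. <span>) counts as base text.
            current += "".join(child.itertext()) + (child.tail or "")
    return runs[n]


with_rb = ET.fromstring(
    "<ruby><rb>東</rb><rt>とう</rt><rb>京</rb><rt>きょう</rt></ruby>"
)
without_rb = ET.fromstring("<ruby>東<rt>とう</rt>京<rt>きょう</rt></ruby>")

print(nth_base_with_rb(with_rb, 1))
print(nth_base_without_rb(without_rb, 1))
```

Both calls return the second base text, 京, but the second function has
to know about every element that may appear inside <ruby>.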
> I would not characterize approach 3 (in section 2) as an alternative to
> 1 and 2. It is available to authors under 1, but it does not help
> consumers (unless the <span> is required, at which point that <span> is
> just another name for <rb>). From the point of view of consumers, it's
> really the same as approach 1, used in a restricted way.
In which document did you find these 'approaches'? The Wiki page? URLs
would be handy, please ...
But I agree that <span> would be just another name for <rb>. I don't
get why the HTML5 editor is so hung up on <rb>. I think it must be
based on the fact that IE6/7/8 don't understand <rb>. However, that
problem can easily be dealt with (via JavaScript), and IE9 does
support <rb> - the same way it supports <span>.
> It seems to me that approach 4 introduces a new selector mechanism, and
> I don't think that's desirable.
>
> One question which is more apparent from my a/b/c organization is
> whether b should have a different DOM than c. As far as I can tell, b is
> just a succession of single ruby, and there is therefore no strict need
> to represent that situation by a single <ruby> element. Allowing b to
> be done by a single <ruby> element with multiple pairs, as a convenience
> to authors, means the same DOM as for a jukugo ruby (I believe this is
> what motivated your approach 2 in "4 jukugo ruby", as well as your
> discussion of fallback). If that convenience is offered, then one will
> have to have something in CSS to express b. vs. c, and rendering engines
> will have to consult that even when doing fallback, to determine whether
> to do 東(とう)京(きょう) or 東京(とうきょう).
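The two fallback forms named at the end of the quoted paragraph can be
sketched in Python as two ways of serializing the same list of pairs.
The function names and the sep parameter are hypothetical, made up for
illustration; which form to pick is exactly the information that would
have to come from somewhere other than the DOM:

```python
def fallback_per_pair(pairs):
    # Form b: each base is followed by its own parenthesized ruby text.
    return "".join(f"{base}({text})" for base, text in pairs)


def fallback_grouped(pairs, sep=""):
    # Form c: all bases first, then one parenthesized run of ruby text.
    bases = "".join(base for base, _ in pairs)
    texts = sep.join(text for _, text in pairs)
    return f"{bases}({texts})"


pairs = [("東", "とう"), ("京", "きょう")]
print(fallback_per_pair(pairs))   # 東(とう)京(きょう)
print(fallback_grouped(pairs))    # 東京(とうきょう)
```

The input is identical in both cases; only the choice of function
differs.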
I think that a different DOM is very difficult. In the bug report about
inclusion of <rb>, there were many points about the similarity of <dl>
and <ruby>. As we know, the DOM of <dl> is quite "normal". But of
course I agree that there is a problem. And I think the problem should
be solved by doing/allowing what I think Richard discussed:
<ruby>
<rb>W</rb><rb>W</rb><rb>W</rb>
<rp>(</rp><rt>World</rt><rt>Wide</rt><rt>Web</rt><rp>)</rp>
</ruby>
We could also do this, where <rbc> would come in handy:
<ruby>
<rbc><rb>W</rb><rb>W</rb><rb>W</rb></rbc>
<rp>(</rp><rt>World</rt><rt>Wide</rt><rt>Web</rt><rp>)</rp>
</ruby>
But, if we want to be Webkit and Firefox compatible, then we could not
do this:
<ruby>
<rbc><rb>W</rb><rb>W</rb><rb>W</rb></rbc>
<rtc><rp>(</rp><rt>World</rt><rt>Wide</rt><rt>Web</rt><rp>)</rp></rtc>
</ruby>
Why? Because Firefox 4/5/6/7/8/9 and Webkit (since Safari 5) will
auto-close the current element when the parser sees <rp> or <rt>. At
least those browsers would need to change, if we were to include <rtc>.
(Actually, even if they currently do this, it can still be useful to
include rtc{} as a CSS hook. More on this later.)
> I don't know whether
> Japanese users view b. and c. as just different styling or as
> semantically different. The former permits b. to be represented by a
> single <ruby> and to make the distinction in CSS. The latter either
> requires b. to be done by multiple <ruby> or something additional in
> HTML if one wants to do b. with a single <ruby>.
>
> Seems to me that mandatory <rb> makes life easier, and IMO easier enough
> that it's justified, but is not strictly necessary.
>
> A decidedly inferior scenario, is to make <rb> optional. A <span> does
> just as well in this case.
I partly agree with the view that <rb> should have been obligatory.
But even if it is optional, an authoring tool could treat <rb> as
obligatory - it could auto-insert it. When it comes to <span>, then why
not <b>? This would have to be decided case by case. Thus there is
definitely an advantage to permitting <rb>, anyhow.
--
Leif H Silli
Received on Friday, 13 January 2012 01:06:48 UTC