
FW: ruby and rb tag

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Wed, 25 Jan 2012 00:06:49 -0500
To: "CJK discussion (public-i18n-cjk@w3.org)" <public-i18n-cjk@w3.org>
Message-ID: <A592E245B36A8949BDB0A302B375FB4E0D3297CD0A@MAILR001.mail.lan>
Since Leif mentioned that he'd like to hear what implementers have to say, and I agree, I contacted Roland at Google, who implemented ruby in WebKit, and got his response. I'm forwarding it with his permission. I'll also forward a follow-up message.

-----Original Message-----
From: Roland Steiner
Sent: Monday, January 23, 2012 3:51 PM
To: Koji Ishii
Subject: Re: ruby and rb tag

Hi Koji,

Sure! I've seen the mails with the link to the wiki page - I meant to reply earlier, but then got swamped with other stuff.

As for implementing an <rb>: I haven't thought about it in deep detail (and have been away from <ruby> stuff for quite a while now), but I don't think it would be very hard. My very first implementation used <rb>, although that one was never deployed nor tested. Also, for rendering, WebKit even now creates an anonymous RenderRubyBase to render the base text. Associating that with an actual <rb> tag should be straightforward. 

By 'optional' - do you mean an implicit <rb> would be created in the DOM if none is specified? IOW, given <ruby>BASE<rt>TEXT</rt></ruby>, would the resulting DOM be 

    a) <ruby>BASE<rt>TEXT</rt></ruby>
    b) <ruby><rb>BASE</rb><rt>TEXT</rt></ruby>

? Both have their respective advantages and disadvantages:

a) is simpler with regard to parsing and more robust w/ regard to DOM manipulation
b) is simpler with regard to rendering


Question regarding mixed usage: What would the following mean?

    <ruby>BASE <rb>BASE</rb><rt>TEXT</rt></ruby>

Is it

    a.1) <ruby><rb>BASE</rb><rb>BASE</rb><rt>TEXT</rt></ruby>
    a.2) <ruby><rb>BASE </rb><rb>BASE</rb><rt>TEXT</rt></ruby>
    b.1) <ruby><rb>BASEBASE</rb><rt>TEXT</rt></ruby>
    b.2) <ruby><rb>BASE BASE</rb><rt>TEXT</rt></ruby>

(where a.1/b.1 and a.2/b.2 differ by white-space only) ?
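
To make the four readings concrete, here is a minimal sketch in Python. It assumes the children of a <ruby> are modeled as (tag, text) tuples, with tag None standing for a bare text node. The function name normalize_bases and its two flags are hypothetical and only illustrate the choices in question; this is not how WebKit or the HTML parser works.

```python
from typing import List, Optional, Tuple

# Assumed toy model: (tag, text), where tag None means a bare text node.
Node = Tuple[Optional[str], str]


def normalize_bases(children: List[Node], merge: bool, keep_space: bool) -> List[Node]:
    """Turn bare text into implied <rb> nodes.

    merge=False gives the a.x readings (separate bases),
    merge=True gives the b.x readings (adjacent bases joined).
    keep_space=False gives the x.1 readings, keep_space=True the x.2 readings.
    """
    out: List[Node] = []
    for tag, text in children:
        if tag is None:
            # Bare text becomes an implied base.
            tag = "rb"
            if not keep_space:
                text = text.strip()
        if merge and tag == "rb" and out and out[-1][0] == "rb":
            # Join this base onto the previous adjacent base.
            out[-1] = ("rb", out[-1][1] + text)
        else:
            out.append((tag, text))
    return out


mixed = [(None, "BASE "), ("rb", "BASE"), ("rt", "TEXT")]
for name, merge, keep_space in [("a.1", False, False), ("a.2", False, True),
                                ("b.1", True, False), ("b.2", True, True)]:
    print(name, normalize_bases(mixed, merge, keep_space))
```

Each flag combination reproduces one of the four candidate DOMs above, which shows that the question comes down to two independent decisions: whether adjacent bases merge, and whether the trailing space of the bare text survives.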


Inversion of base and text: Would

    <ruby><rt>TEXT1</rt><rb>BASE1</rb></ruby>

be allowed? If yes, that'd add considerable complexity.


In this regard, I believe the discussion is not fully complete without knowing which direction complex ruby will take. I fear that adding <rb> now without a plan for how to use it down the road might turn out to be a stumbling block later on. For example:

    <ruby><rb>BASE1</rb><rb>BASE2</rb><rt>TEXT1</rt></ruby>

Should TEXT1 annotate BASE1 or BASE2? Currently it'd be BASE2. Now suppose we append <rt>TEXT2</rt>:

    <ruby><rb>BASE1</rb><rb>BASE2</rb><rt>TEXT1</rt><rt>TEXT2</rt></ruby>

Will this result in a) 2 ruby bases with 1 text each, or b) 1 base without text and 1 base with 2 texts, or c) 1 base without text, 1 base with 1 text and 1 text without base? Currently it'd be c).
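
The "current" pairing described above can be sketched roughly as follows. This is a toy model, assuming children are given as (tag, text) tuples and that each <rt> annotates only the most recent base that has not been annotated yet. The name pair_bases is hypothetical; this is not the actual WebKit code.

```python
from typing import List, Optional, Tuple

Node = Tuple[str, str]                        # (tag, text); assumed toy model
Pair = Tuple[Optional[str], Optional[str]]    # (base, text); None means missing


def pair_bases(children: List[Node]) -> List[Pair]:
    """Pair each <rt> with the immediately preceding unannotated <rb>, if any."""
    pairs: List[Pair] = []
    pending: Optional[str] = None  # most recent base without a text yet
    for tag, text in children:
        if tag == "rb":
            if pending is not None:
                # The previous base never got a text.
                pairs.append((pending, None))
            pending = text
        elif tag == "rt":
            # Annotates the pending base, or nothing if there is none.
            pairs.append((pending, text))
            pending = None
    if pending is not None:
        pairs.append((pending, None))
    return pairs


print(pair_bases([("rb", "BASE1"), ("rb", "BASE2"), ("rt", "TEXT1")]))
print(pair_bases([("rb", "BASE1"), ("rb", "BASE2"), ("rt", "TEXT1"), ("rt", "TEXT2")]))
```

Under this model the first call pairs TEXT1 with BASE2, and the second yields outcome c): BASE1 without text, BASE2 with TEXT1, and TEXT2 without a base. Outcomes a) and b) would need a different pairing rule, which is exactly what a complex ruby design would have to pin down.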


On the topic of inter-element whitespace: this is something that I feel is often conveniently ignored with <ruby> (and admittedly doesn't matter so much in CJK), but can lead to ugly results. Consider:

    <ruby>
        Cascading <rt>C</rt>
        Style <rt>S</rt>
        Sheets <rt>S</rt>
    </ruby>

This will currently squash everything together into 'CascadingStyleSheets' in the base. IOW, there is no way to add spaces between the bases other than using "&nbsp;" or splitting the <ruby>. With "&nbsp;" the space is considered part of the base, which misaligns the text, while splitting the <ruby> precludes wrapping all the texts into a single bracketed "(CSS)" down the line. <rb> would give one the chance to better delineate what is part of the base and what isn't:

    <ruby><rb>Cascading</rb><rt>C</rt> <rb>Style</rb><rt>S</rt> <rb>Sheets</rb><rt>S</rt></ruby>
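
The squashing and the "&nbsp;" problem can be illustrated with a small sketch. It assumes the same toy (tag, text) model, with tag None for bare text, and assumes that leading and trailing ASCII whitespace of each bare text run is dropped while a no-break space is kept. The name squashed_base is hypothetical and this is only an approximation of the behaviour described, not real layout code.

```python
from typing import List, Optional, Tuple

Node = Tuple[Optional[str], str]  # (tag, text); tag None means bare text

# ASCII whitespace as HTML defines it; U+00A0 (no-break space) is not included.
HTML_SPACE = " \t\n\f\r"


def squashed_base(children: List[Node]) -> str:
    """Concatenate the base text, trimming ASCII whitespace around each run."""
    return "".join(text.strip(HTML_SPACE) for tag, text in children if tag is None)


plain = [
    (None, "\n    Cascading "), ("rt", "C"),
    (None, "\n    Style "), ("rt", "S"),
    (None, "\n    Sheets "), ("rt", "S"),
    (None, "\n"),
]
with_nbsp = [
    (None, "Cascading\u00a0"), ("rt", "C"),
    (None, "Style\u00a0"), ("rt", "S"),
    (None, "Sheets"), ("rt", "S"),
]

print(repr(squashed_base(plain)))
print(repr(squashed_base(with_nbsp)))
```

The first result is 'CascadingStyleSheets'. In the second, the spaces survive, but only because each no-break space has become part of its base, which is the misalignment problem mentioned above.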

OTOH if the spaces are preserved (say, by some white-space rule on the <ruby>) outside of the regular bases, what about

    <ruby><rb>Cascading</rb> <rt>C</rt> / <rb>Style</rb> <rt>S</rt> / <rb>Sheets</rb> <rt>S</rt></ruby>

(note the spaces between <rb> and <rt>, and the non-space '/' outside both <rb> and <rt>) ?


Sorry for derailing this from an implementation-only discussion... ^_^;


Cheers,

- Roland

On Sun, Jan 22, 2012 at 22:37, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:
Hi Roland, long time no talk but I hope all is well with you and you remember me :)

Would you mind helping us for 5 minutes?

I'm still working in the CSS WG and I18N WG. One of the topics for the I18N WG these days is the <rb> tag in HTML5[1]. It's taking a really long time to resolve, as you might already know, but I think we're getting close to the end.

Currently, we in the I18N WG would like to add the <rb> tag back as an optional tag, so that it can be omitted in simple cases but is still usable in complex cases.

Among the members, fantasai and I are also hoping that UAs will imply the <rb> tag if it is omitted, in a similar manner to what we do for the <tbody> tag today. People there understand its benefits, but one person said that we should ask implementers how feasible it is, and I couldn't find anyone but you to ask.

I'm not asking whether you or Google will do this, but rather how feasible, easy, or difficult it is from one developer's point of view. Could I have your opinion on this?

[1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=10830

Regards,
Koji
Received on Wednesday, 25 January 2012 05:09:58 GMT
