[whatwg] Whitespace handling in ruby

From: Roland Steiner <rolandsteiner@google.com>
Date: Thu, 30 Jul 2009 12:22:56 +0900
Message-ID: <ee6bf5be0907292022x71633cfbu93027a4c01d83571@mail.gmail.com>
As I am currently writing an implementation for ruby rendering, I wondered
about the exact way white-space is supposed to be handled between runs of
ruby text.

As far as I can see, <ruby> is fundamentally an inline element, and thus
whitespace would normally be collapsed, but not entirely eliminated.
However, for the examples given for the <ruby> element, this would result in
a single space between the ideographic characters:

<ruby> *[ws]*
漢<rp>(</rp><rt>かん</rt><rp>)</rp> *[ws]*
字<rp>(</rp><rt>じ</rt><rp>)</rp> *[ws]*
</ruby>

rendered without ruby support would become (easier for e-mail):

漢(かん) *[ws]* 字(じ)

The whitespace would also be present with proper ruby rendering above the
base characters.
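
To make the collapsing behaviour concrete, here is a minimal Python sketch of what a renderer without ruby support would roughly display. It is only an approximation of inline whitespace collapsing, not the actual CSS algorithm; the function name fallback_text is hypothetical, and the sample characters are an assumption standing in for the ideographic bases of the spec example.

```python
import re

def fallback_text(markup: str) -> str:
    """Approximate the text a non-ruby-aware inline renderer would show."""
    # Drop the tags but keep their text content (so the <rp> parentheses stay).
    text = re.sub(r"<[^>]+>", "", markup)
    # Collapse every run of whitespace to a single space, as inline layout does.
    text = re.sub(r"[ \t\n\r\f]+", " ", text)
    # Whitespace at the very start and end would not be visible at line edges.
    return text.strip()

# Assumed sample input: line breaks play the role of [ws].
example = """<ruby>
漢<rp>(</rp><rt>かん</rt><rp>)</rp>
字<rp>(</rp><rt>じ</rt><rp>)</rp>
</ruby>"""

print(fallback_text(example))
```

The line break between the two base/annotation groups survives as one space, which is the unwanted gap described above.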

OTOH, removing that whitespace may not be desirable if the bases are not
in an ideographic script, e.g.:

<ruby>
European<rp>(</rp><rt>E</rt><rp>)</rp>
Union<rp>(</rp><rt>U</rt><rp>)</rp>
</ruby>

(This example has yet another drawback: the white-space before "Union" would
become part of the base and thus shift the annotation "U" slightly left of
the center of the word "Union".)

For the time being I'm using a block-based rendering approach that
automatically eliminates leading and trailing white-space in the base text,
but I wondered what the correct approach would be within the scope of HTML5
(aside: an XHTML-like explicit <rb> container for the ruby base side-steps
this problem, but is not a real option due to the need for legacy support).
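
As a rough illustration of the trimming approach described above, the following Python sketch splits the contents of a <ruby> element into (base, annotation) pairs and strips leading and trailing whitespace from each base. It is not my actual rendering code: the names SEGMENT and ruby_segments are hypothetical, and the regular expression assumes only the simple base/<rp>/<rt>/<rp> pattern used in the examples, not arbitrary markup.

```python
import re

# One base run followed by its <rt> annotation, with optional <rp> fallbacks.
# Assumption: bases and annotations are plain text without nested elements.
SEGMENT = re.compile(
    r"(?P<base>[^<]*)(?:<rp>[^<]*</rp>)?<rt>(?P<rt>[^<]*)</rt>(?:<rp>[^<]*</rp>)?"
)

def ruby_segments(inner: str, trim: bool = True):
    """Return (base, annotation) pairs, optionally trimming base whitespace."""
    segments = []
    for match in SEGMENT.finditer(inner):
        base = match.group("base")
        if trim:
            # Drop leading/trailing whitespace so the annotation can be
            # centered over the visible base text only.
            base = base.strip()
        segments.append((base, match.group("rt")))
    return segments

inner = "\nEuropean<rp>(</rp><rt>E</rt><rp>)</rp>\nUnion<rp>(</rp><rt>U</rt><rp>)</rp>\n"
print(ruby_segments(inner, trim=False))  # bases keep their leading line break
print(ruby_segments(inner))              # bases are trimmed
```

Trimming fixes the centering of "U" over "Union", but it also discards the space between "European" and "Union", so a renderer taking this route would still have to decide whether to re-insert inter-segment space for non-ideographic scripts.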


- Roland
Received on Wednesday, 29 July 2009 20:22:56 UTC