- From: fantasai <fantasai.lists@inkedblade.net>
- Date: Fri, 09 Nov 2012 10:16:34 -0800
- To: Richard Ishida <ishida@w3.org>
- CC: Bruce Lawson <brucel@opera.com>, public-html@w3.org, www International <www-international@w3.org>
On 11/09/2012 07:42 AM, Richard Ishida wrote:
> On 05/11/2012 14:09, Bruce Lawson wrote:
>> On Fri, 02 Nov 2012 10:45:36 -0000, Robin Berjon <robin@w3.org> wrote:
>>
>>>
>>> ### Forward-looking ruby model
>>> Fantasai exposed a set of issues with the current ruby markup that
>>> make it awkward to extend in future for features that we have good
>>> reasons to believe will become increasingly common as HTML is used for
>>> books, scientific publishing, and pretty much everything in the world
>>> in general. These involve jukugo ruby, fallback, double-sided ruby.
>>
>> is this set of issues written up anywhere?
>
>
> Bruce, see http://www.w3.org/TR/ruby-use-cases/. Fantasai also wrote
> something in a blog post that I tried to represent in the aforementioned doc.
Here's the blog post:
http://fantasai.inkedblade.net/weblog/2011/ruby/
A key point that's not in the blog post is that there are two fundamentally
different models for doing ruby:
  row-based model
    This is the XHTML Ruby approach, where all the base text is given,
    followed by all the annotations, row by row.

  column-based model
    This is the HTML Ruby approach, where each base is given followed
    immediately by its annotation(s), column by column.
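
To make the difference concrete, here is a minimal sketch of the two models
as plain data, using 東京 (Tokyo) with the readings とう and きょう. The
class and function names (RowRuby, ColumnRuby, source_order, to_columns)
are hypothetical and exist only for this illustration; they are not taken
from either spec. The point is only the order in which the text appears in
the source.

```python
from dataclasses import dataclass


@dataclass
class RowRuby:
    """Row-based: all bases first, then all annotations (illustrative only)."""
    bases: list[str]
    annotations: list[str]

    def source_order(self) -> list[str]:
        # The text as it appears in the markup, row by row.
        return self.bases + self.annotations


@dataclass
class ColumnRuby:
    """Column-based: each base is followed by its annotation (illustrative only)."""
    pairs: list[tuple[str, str]]

    def source_order(self) -> list[str]:
        # The text as it appears in the markup, column by column.
        return [text for pair in self.pairs for text in pair]


def to_columns(row: RowRuby) -> ColumnRuby:
    # Assumes one annotation per base; spanning annotations are not modelled.
    return ColumnRuby(list(zip(row.bases, row.annotations)))


tokyo_rows = RowRuby(["東", "京"], ["とう", "きょう"])
tokyo_cols = to_columns(tokyo_rows)

print(tokyo_rows.source_order())  # ['東', '京', 'とう', 'きょう']
print(tokyo_cols.source_order())  # ['東', 'とう', '京', 'きょう']
```

Both structures carry the same pairing information; what differs is which
pieces of text sit next to each other in the source.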
The column-based model has several flaws:
1. It doesn't handle inlining gracefully. As an example, the word
Tokyo is written 東京 in kanji and とうきょう in kana. The base-annotation
pairs are 東-とう and 京-きょう, and the ruby markup must create those
associations accordingly. However, when rendered inline, the
correct rendering is
東京(とうきょう)
with the word kept together as one unit, not
東(とう)京(きょう)
There are various use cases for inlining:
* fallback, for implementations that don't support ruby.
* compacting the layout, because ruby requires higher inter-line
spacing. (If ruby is rare enough in the document, it's more
efficient to present it inline, and this has been a desired
option on phones.)
* small fonts -- in order to fit above the base text, ruby is
typically set at about half the size of the base text. If
the base font size is too small the ruby can become unreadable,
especially for older people. Inlined annotations, on the other
hand, are the same size as the base text.
The author and the UA should have the choice of proper inlining
without changing the markup. Doing that with the current markup
requires special box-reordering support in the layout engine,
which is doable but not trivial and certainly does not solve the
fallback use case.
2. It doesn't handle spanning gracefully, i.e. the case where there
are multiple annotations and their boundaries don't line up.
See http://fantasai.inkedblade.net/weblog/2011/ruby/#double for
examples.
Hixie recently added the ability to do two types of double-sided
ruby to try to address this use case, but used completely different
markup models: one case would be done with nested <ruby> tags, and
the other with multiple adjacent <rt> elements. The problem with
this is that
* it forces the author to learn (and style) two very different
markup models for things that are fundamentally the same
* it forces the UA to implement two very different layout models
for things that are fundamentally the same
One of the complexities of ruby layout that is often overlooked is
that adjacent ruby on a single line need to negotiate space with each
other. In the simple case, they are black boxes of a particular
size: if the annotation text is wider than the base text, the
inline is treated as having the size of its annotation. But this
is not always the desired rendering. In many cases it's desired
for a long annotation to overhang adjacent text *if that text is
not itself annotated* and there is therefore sufficient room for
the overhang. So inline layout needs to negotiate space for
annotations among ruby structures on the same line, across inline
element boundaries, etc.
Another complexity, of course, is negotiating line breaks within the
ruby, between the base text and its annotations.
So not only does this approach require the author to learn two
different models, it also requires the layout engine to implement
two different models and handle their interactions.
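
The space negotiation described above can be sketched roughly as follows.
This is a toy policy, not what any spec or engine does: widths are in
arbitrary units, annotations are assumed to be centered over their base,
and an unannotated neighbor is assumed to absorb overhang up to its own
width. Segment, black_box_widths and overhang_widths are hypothetical names
made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    base_width: float
    annotation_width: Optional[float] = None  # None means unannotated text


def black_box_widths(line: list[Segment]) -> list[float]:
    # Simple case: each ruby takes the width of its wider part.
    return [max(s.base_width, s.annotation_width or 0.0) for s in line]


def overhang_widths(line: list[Segment]) -> list[float]:
    # Room that each unannotated segment can still lend to its neighbours
    # (assumed policy: at most its own width; annotated segments lend none).
    room = [s.base_width if s.annotation_width is None else 0.0 for s in line]
    widths = []
    for i, seg in enumerate(line):
        if seg.annotation_width is None:
            widths.append(seg.base_width)
            continue
        excess = max(seg.annotation_width - seg.base_width, 0.0)
        remaining = excess
        for j in (i - 1, i + 1):  # left neighbour, then right neighbour
            if 0 <= j < len(line):
                take = min(excess / 2, room[j])  # centered: half per side
                room[j] -= take
                remaining -= take
        # Whatever could not overhang still widens this segment.
        widths.append(seg.base_width + remaining)
    return widths


line = [
    Segment(2.0),       # plain text
    Segment(1.0, 3.0),  # ruby with plain text on both sides
    Segment(2.0),       # plain text
    Segment(1.0, 2.0),  # ruby with plain text on the left only
    Segment(1.0, 2.0),  # ruby with no plain neighbour
]

print(black_box_widths(line))  # [2.0, 3.0, 2.0, 2.0, 2.0]
print(overhang_widths(line))   # [2.0, 1.0, 2.0, 1.5, 2.0]
```

Even this toy version has to look at neighboring segments and keep track of
room already claimed, which is the kind of cross-structure negotiation the
layout engine would need to do once per model.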
Personally, I don't see why we are insisting on this approach when
there is a sensible alternative that puts all forms of ruby on the
same track and allows for whatever extensions we might want from
now through 2025 to be handled within the same basic architecture.
Note, I'm not advocating that the current model for single-sided ruby,
which is implemented in WebKit and Trident already, should be abandoned.
It's fairly easy to incorporate that into a box model that extends it
into a row-based system. I'm saying we shouldn't shoehorn additional
requirements into that model as Hixie has done, dropping some of them
on the floor as necessary, but instead extend in the direction of a
single unified model that satisfies all the requirements.
I think this is less complex and more satisfying than the current
approach.
~fantasai
Received on Friday, 9 November 2012 18:17:04 UTC