CSS3 Text: Multi-Directional Scripts in Vertical Inline Progression

http://fantasai.tripod.com/www-style/2003/directions/vertical-bidi.html

Text reproduced below for discussion. However, you'll probably find the
version with graphics somewhat easier to understand. :)

------------------------------------------------------------------------


BIDI in Vertical Context
------------------------

There are two major shortcomings in CSS3 Text's handling of
vertical inline progression. The first is that the interaction
with the BIDI algorithm is poorly and incorrectly expressed.
The second is that it cannot handle what I've found to be the
most common style of incorporating horizontal scripts into
vertical layout -- that is, having the horizontal script read
top to bottom regardless of its inherent direction -- without
creating a mess of bidi overrides and unnecessary markup.

The BIDI Problem
----------------

   Suppose I have the following sequence:

     [start] CHINA (zhong1 guo2) [end]

   If I render it horizontally, everything's fine.

   Now, I decide to lay this out vertically, in a left-to-right
   block progression. (Left-to-right text goes from bottom to
   top in a left-to-right block progression.) The block's
   'direction' is 'ltr', and all characters are L, so there's no
   reordering before 'auto' rotation takes effect. I get

     [bottom] CHINA (zhong1 guo2) [top]

   which is wrong, because going from top to bottom it should
   read "zhong1 guo2", not "guo2 zhong1".

   I decide, well, I'd rather have all the characters upright. So
   I apply "glyph-orientation-vertical: 0deg", which forces the
   Latin upright. Again, 'direction' is 'ltr', and all characters
   are L, so there is no reordering. I get

                             (
                            guo2
                           zhong1
                             )

                             A
                             N
                             I
                             H
                             C

   which isn't quite what I wanted. I could apply a BIDI override,
   of course, to make 'direction' and all the characters go
   right-to-left. However, if the block progression doesn't stay
   left-to-right (which it won't in most browsers, due to lack of
   support, or perhaps because of an errant user stylesheet), I'll get

     [left] (guo2 zhong1) ANIHC [right]

   which is just useless.

   Conclusion:
   To make upright vertical text go in the correct direction, all
   upright characters must behave as left-to-right characters in
   an 'rl' block progression and as right-to-left characters in
   an 'lr' block progression.
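
   The conclusion above can be sketched in a few lines of Python. This
   is only a simplified model, not part of CSS3 Text: the function
   names are hypothetical, and the model assumes a single run in which
   every character has the same directionality, with L text running
   downward in an 'rl' block progression and upward in an 'lr' one.

```python
def upright_bidi_class(block_progression: str) -> str:
    """Bidi class that upright characters should be treated as."""
    if block_progression == "rl":
        return "L"
    if block_progression == "lr":
        return "R"
    raise ValueError(f"unknown block progression: {block_progression!r}")


def top_to_bottom(chars: str, block_progression: str, bidi_class: str) -> str:
    """Return a single-direction run in the order it appears from the top.

    Assumed model: L text runs downward in 'rl' and upward in 'lr';
    R text runs the opposite way in each case.
    """
    if block_progression not in ("rl", "lr"):
        raise ValueError(f"unknown block progression: {block_progression!r}")
    ltr_runs_downward = block_progression == "rl"
    runs_downward = ltr_runs_downward == (bidi_class == "L")
    return chars if runs_downward else chars[::-1]


# Plain L characters in an 'lr' block progression come out reversed,
# which is the problem described above.
print(top_to_bottom("CHINA", "lr", "L"))

# With the proposed rule the run reads correctly in both progressions.
for bp in ("rl", "lr"):
    print(bp, top_to_bottom("CHINA", bp, upright_bidi_class(bp)))
```

   Under this model, treating the characters as plain L in an 'lr'
   block progression gives "ANIHC" from the top, while the proposed
   rule gives "CHINA" in both block progressions.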

Vertical Writing Styles
-----------------------

Scripts can be classified into three categories:
   - horizontal (e.g. Arabic and Latin)
   - vertical (e.g. Mongolian)
   - bi-orientational (e.g. Chinese, Japanese, and Korean)
      ^ these behave as horizontal scripts in horizontal
      layout and vertical scripts in vertical layout; the
      glyphs don't get rotated.

Vertical scripts don't read bottom to top. Similarly,
most horizontal scripts don't read backwards. (Since CSS
doesn't support scripts that alternate directions, we can
assume this is the case for all scripts.)

Given a block progression, there are three ways of orienting
scripts that don't match the block's orientation:

   - "natural", where the text orients itself with respect to
     the block progression. For example, English text
     in a left-to-right block progression naturally
     reads bottom-to-top. (Think table headers.)

   - "context", where the text takes on the inline
     progression of the containing block. For example,
     Latin text in Mongolian will often read top-to-
     bottom (rotated 90deg clockwise), even though its
     natural direction for left-to-right block
     progression is from bottom-to-top.

   - "upright", where the text takes on the inline
     progression of the containing block but also
     forces the glyphs to be set upright. You see
     this in motel signs, book covers, and the like.
     (It's not used with cursive scripts afaik.)
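
To compare the three styles side by side, here is a rough Python
sketch of how a left-to-right horizontal-script run (Latin, say) would
end up in each one. The type and function names are hypothetical. The
rotation values for "natural" are my reading of the rule that the
'before' edge acts as the top edge, so treat them as an assumption.

```python
from enum import Enum
from typing import NamedTuple


class Style(Enum):
    NATURAL = "natural"
    CONTEXT = "context"
    UPRIGHT = "upright"


class RunLayout(NamedTuple):
    reads: str        # "top-to-bottom" or "bottom-to-top"
    rotation_cw: int  # degrees clockwise; negative means counter-clockwise


def layout_ltr_horizontal_run(style: Style, block_progression: str) -> RunLayout:
    """Layout of a left-to-right horizontal-script run in vertical text."""
    if block_progression not in ("rl", "lr"):
        raise ValueError(f"unknown block progression: {block_progression!r}")
    if style is Style.NATURAL:
        # The 'before' edge acts as the top edge: right for 'rl', left for 'lr'.
        if block_progression == "rl":
            return RunLayout("top-to-bottom", 90)
        return RunLayout("bottom-to-top", -90)
    if style is Style.CONTEXT:
        # Follows the block's inline progression, rotated clockwise.
        return RunLayout("top-to-bottom", 90)
    # UPRIGHT: same progression, but glyphs stay unrotated.
    return RunLayout("top-to-bottom", 0)


for style in Style:
    print(style.value, layout_ltr_horizontal_run(style, "lr"))
```

Only the "natural" style depends on the block progression here; the
other two always read from top to bottom.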

For examples of these styles in print, see Scans at
http://fantasai.tripod.com/www-style/2003/directions

The "Natural" Orientation Style
-------------------------------

   Text is laid out with respect to the block progression. Horizontal
   scripts simply behave as if the 'before' edge were the top edge, and
   orient and reorder their glyphs accordingly. Vertical scripts are
   laid out from top to bottom, with the top edge of each glyph towards
   the top of the block.

   BIDI reordering is applied to all text. However, all directional
   characters in vertical scripts are treated as
      - L (left-to-right) if 'block-progression' is 'rl'
      - R (right-to-left) if 'block-progression' is 'lr'

   If the element's dominant script is 'HAN', 'HIRAGANA', 'KATAKANA',
   'HANGUL', 'BOPOMOFO', or 'MONGOLIAN', then any available vertical
   glyph variants should be used for punctuation characters. Otherwise,
   horizontal punctuation glyphs should be used, rotated so the top
   edge faces the 'before' edge of the block.
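
   The punctuation rule above could be sketched as follows. The
   function name and the returned labels are hypothetical, and the
   fallback to a rotated horizontal glyph when no vertical variant
   exists is my own assumption; the text above does not spell that
   case out.

```python
# Dominant scripts for which vertical punctuation variants are preferred.
VERTICAL_PUNCTUATION_SCRIPTS = frozenset(
    {"HAN", "HIRAGANA", "KATAKANA", "HANGUL", "BOPOMOFO", "MONGOLIAN"}
)


def punctuation_glyph(dominant_script: str, has_vertical_variant: bool) -> str:
    """Pick the kind of punctuation glyph to use in vertical layout.

    Returns "vertical-variant" or "rotated-horizontal" (labels made up
    for this sketch).
    """
    if dominant_script.upper() in VERTICAL_PUNCTUATION_SCRIPTS:
        if has_vertical_variant:
            return "vertical-variant"
        # Assumption: with no vertical variant, fall back to rotation.
        return "rotated-horizontal"
    return "rotated-horizontal"


print(punctuation_glyph("HAN", True))
print(punctuation_glyph("LATIN", True))
```

   Note that a font's vertical variant is ignored when the dominant
   script is not in the list, so Latin-dominant text keeps its rotated
   horizontal punctuation.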

The "Context" Orientation Style
-------------------------------

   Text is laid out from top to bottom regardless of inherent direction.
   The BIDI algorithm is applied. However, reordering is not applied to
   "context"-styled text. Instead,
      - glyphs for all characters in even embedding levels are rotated
        90 degrees clockwise
      - glyphs for all characters in odd embedding levels are rotated
        90 degrees counter-clockwise
      - all vertical script glyphs are oriented with their
        bottoms toward the bottom of the block
   Also, the boundaries of "context"-styled text have fixed directionality;
   they are:
      - L (left-to-right) if 'block-progression' is 'rl'
      - R (right-to-left) if 'block-progression' is 'lr'

   If the element's dominant script is 'HAN', 'HIRAGANA', 'KATAKANA',
   'HANGUL', 'BOPOMOFO', or 'MONGOLIAN', any available vertical glyph
   variants are used for punctuation. Otherwise, horizontal punctuation
   glyphs are used, rotated in the appropriate direction.
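
   The rotation rules for the "context" style might look like this in
   Python. It is a sketch under assumptions: the embedding levels are
   taken as already resolved by the BIDI algorithm, the function names
   are hypothetical, and rotation is expressed in degrees clockwise
   (negative for counter-clockwise).

```python
from typing import List, Tuple


def context_glyph_rotation(embedding_level: int,
                           is_vertical_script: bool = False) -> int:
    """Glyph rotation, in degrees clockwise, for "context"-styled text."""
    if embedding_level < 0:
        raise ValueError("embedding level must be non-negative")
    if is_vertical_script:
        # Vertical-script glyphs keep their bottoms toward the block's bottom.
        return 0
    # Even levels rotate clockwise, odd levels counter-clockwise.
    return 90 if embedding_level % 2 == 0 else -90


def layout_context(runs: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Place runs of (text, embedding_level) from top to bottom.

    Characters stay in logical order because the "context" style does
    not reorder; only the rotation varies with the embedding level.
    """
    placed = []
    for text, level in runs:
        rotation = context_glyph_rotation(level)
        placed.extend((ch, rotation) for ch in text)
    return placed


# A level-0 run followed by a level-1 run, in logical order.
for ch, rotation in layout_context([("ab", 0), ("XY", 1)]):
    print(ch, rotation)
```

   The point of the sketch is that the embedding level selects the
   rotation only; the top-to-bottom order of the characters is the
   logical order.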

The "Upright" Orientation Style
-------------------------------

   Text is laid out from top to bottom regardless of inherent direction.
   The BIDI algorithm does not internally affect "upright"-styled text.
   However, the boundaries of "upright"-styled text have fixed
   directionality and the entire run of text behaves as if embedded at
   an infinitely high embedding level when interacting with BIDI
   reordering applied to surrounding text. The text behaves as if it had
     - an even embedding level and L (left-to-right) directionality at
       the boundaries if 'block-progression' is 'rl'
     - an odd embedding level and R (right-to-left) directionality at
       the boundaries if 'block-progression' is 'lr'

   All grapheme clusters are oriented with their bottom towards the
   bottom of the block and laid out each below the previous.

   Vertical alternates of the glyphs are used. Enclosing punctuation
   such as parentheses should thus be rotated to face in to the text
   they enclose, and exclamation points should be upright.
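
   A small Python sketch of the "upright" rules follows. The function
   names are hypothetical. The cluster splitting is only an
   approximation that attaches combining marks to the preceding base
   character; real grapheme cluster segmentation is more involved.

```python
import unicodedata
from typing import List, Tuple


def upright_boundary(block_progression: str) -> Tuple[str, str]:
    """Return (embedding-level parity, boundary bidi class) for an upright run."""
    if block_progression == "rl":
        return ("even", "L")
    if block_progression == "lr":
        return ("odd", "R")
    raise ValueError(f"unknown block progression: {block_progression!r}")


def stack_clusters(text: str) -> List[str]:
    """Split text into clusters, listed in top-to-bottom order.

    Approximation: a character with a non-zero combining class is
    attached to the cluster before it.
    """
    clusters: List[str] = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


print(upright_boundary("lr"))
for cluster in stack_clusters("MOTEL"):
    print(cluster)
```

   Each printed cluster corresponds to one upright cell in the column,
   so "MOTEL" stacks as five cells, M at the top and L at the bottom.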


CSS3 Text currently provides only for "natural" styling. It
can be forced to do "context" or "upright", but getting correct
results requires awkward overrides and, in many cases, extra markup.
To keep authors from using such overrides and related markup and/or
scripting, I propose that CSS3 Text provide for all three orientation
styles.

I'd have posted this as a Last Call comment. However, it's taken days
of research, reading, and (mostly) thinking to sort this all out.


Acknowledgements: Many thanks go to Martin Heijdra for taking the time
                   to explain various scripts, their typing, and their
                   typography, and for letting me borrow books from his
                   private collection. (The Mongolian texts are his.)

                   Thanks also go to Ian Hickson for offering to host my
                   scans, should Tripod's limitations be a problem. :)

                   Thanks also go to the Library! Hurray! The other books
                   were borrowed from Firestone and Gest.

~fantasai

Received on Friday, 21 March 2003 17:42:38 UTC