- From: fantasai <fantasai@escape.com>
- Date: Mon, 07 Apr 2003 11:03:03 -0400
- To: www-style@w3.org
Etan Wexler wrote:
> fantasai wrote:
>>http://fantasai.tripod.com/www-style/2003/directions/vertical-bidi.html
>
> Congratulations on another clear, good-looking, and well-considered
> document. You are truly an asset to the CSS community.

Thank you for your generous compliments, Etan, but you really shouldn't
flatter me so. ;)

>>I propose that CSS3 Text provide for all three orientation
>>styles.
>
> You have not proposed the mechanism. Do we press 'writing-mode' into service
> with new values? Do we add values to 'direction'? Do we coin a shorthand for
> 'direction' and the glyph orientation properties?

I don't think we can use 'writing-mode', as it's already committed to
setting progressions. 'direction' shouldn't be used because its purpose is
to set the directionality of an element's text. This is actually different
from the inline progression. Consider the following character stream:

  [start] <element>STRANGE BUT POSSIBLE</element> [end]

If for some reason this text is supposed to run right-to-left, I need to
apply a bidi override. (I assume that's what it's for.) I can do so with CSS:

  element { direction: rtl; unicode-bidi: bidi-override; }

The result is

  ELBISSOP TUB EGNARTS
  <<<< read this way (r-l)

Note that this ordering *is correct*. It's not a stylistic effect. It is
necessary for correct interpretation of the text.

Now, let us suppose that this text is to be laid out vertically, in a
right-to-left block progression. Being rtl text, its natural orientation
is to run bottom to top with the glyph-orientation at 90deg.

  [top] ELBISSOP TUB EGNARTS [bottom]
  <<< read this way (b-t)

In this case, the inline progression is also "right-to-left", if you take
the right edge to be the "top". However, if I decide to use the "upright"
style, with glyph-orientation 0deg, the text must run

  S
  T
  R
  A
  N   read this way (t-b)
  G        |
  E        V

  B
  U
  T

  P
  O
  S
  S
  I
  B
  L
  E

Note that although the directionality is still rtl, the inline progression
is top-to-bottom, or "left-to-right".
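To make the difference between directionality and inline progression concrete, here is a small Python sketch of the two orderings discussed above. It is only an illustration under simplifying assumptions: every character is treated as a strong character under the override, and the function names are made up for this example rather than taken from any specification.

```python
from typing import List


def rtl_override_visual_order(text: str) -> str:
    """Glyphs as seen from left to right when every character is forced rtl."""
    # Under an rtl override the first logical character sits at the right
    # edge, so the left-to-right picture is simply the reversed stream.
    return text[::-1]


def upright_vertical_order(text: str) -> List[str]:
    """Glyphs from top to bottom in the upright (0deg) style."""
    # Upright text runs top to bottom in logical order, even though the
    # directionality of the element is still rtl.
    return list(text)


stream = "STRANGE BUT POSSIBLE"
print(rtl_override_visual_order(stream))
print("\n".join(upright_vertical_order(stream)))
```

The same character stream and the same rtl directionality give two different inline progressions, which is the point of the example.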
If we assigned 'direction' to the role of controlling inline progression,
we'd have to set "direction: ltr" to have the text display "upright".
Assuming 'block-progression' remains 'rl', this is fine. However, in most
cases, 'block-progression' will not remain 'rl'. The most common reasons
would be lack of support and user prefs. The author sheets, also, could
reset 'block-progression' later in the cascade. (Such a situation can be
expected in complex style sheets.)

What happens if 'block-progression' becomes 'tb'? We get

  STRANGE BUT POSSIBLE

which is *wrong*. It's backwards.

------------------------------------------------------------------------------
The Inline Progression

The inline progression isn't necessarily another independent variable for
CSS to control. Although it is _distinct_ from directionality, it can be
described as a function of the block progression, text directionality, and
glyph orientation.

In an absolute model (degrees always clockwise), a table listing the
properties of a same-direction text run and the resulting inline progression
would look like this:

block progression  directionality  glyph orientation | inline progression
-----------------  --------------  ----------------- | ------------------
horizontal         Left to right   0deg              | top-to-bottom
horizontal         Right to left   0deg              | top-to-bottom
horizontal         Left to right   90deg             | top-to-bottom
horizontal         Right to left   90deg             | bottom-to-top
horizontal         Left to right   180deg            | bottom-to-top *
horizontal         Right to left   180deg            | bottom-to-top *
horizontal         Left to right   270deg            | bottom-to-top *
horizontal         Right to left   270deg            | top-to-bottom *

(The headers do not represent CSS properties.)

You can see that the inline progressions for 180 degrees and 270 degrees
(the rows marked *) don't match the current text for 'glyph-orientation'
and 'direction'. I will say right now that I've never seen glyphs rotated
180 degrees in running text, but I'd expect it to read bottom to top as well
as have its glyphs rotated bottom to top.
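The table above can also be read as a small function. The Python sketch below restates it for a horizontal block progression (vertical lines); the function name and the 'ltr'/'rtl' strings are my own shorthand for the table's columns, not proposed CSS syntax.

```python
def inline_progression(directionality: str, glyph_orientation: int) -> str:
    """Inline progression of a same-direction run in a vertical line.

    directionality is 'ltr' or 'rtl'; glyph_orientation is in degrees,
    measured clockwise, and must be 0, 90, 180, or 270.
    """
    if directionality not in ("ltr", "rtl"):
        raise ValueError("directionality must be 'ltr' or 'rtl'")
    if glyph_orientation == 0:
        # Upright glyphs always stack from top to bottom.
        return "top-to-bottom"
    if glyph_orientation == 180:
        # Upside-down glyphs always stack from bottom to top.
        return "bottom-to-top"
    if glyph_orientation == 90:
        return "top-to-bottom" if directionality == "ltr" else "bottom-to-top"
    if glyph_orientation == 270:
        # The mirror image of the 90deg case.
        return "bottom-to-top" if directionality == "ltr" else "top-to-bottom"
    raise ValueError("glyph_orientation must be 0, 90, 180, or 270")


for degrees in (0, 90, 180, 270):
    for direction in ("ltr", "rtl"):
        print(degrees, direction, inline_progression(direction, degrees))
```

Writing it this way shows that directionality only matters for the two sideways orientations; at 0 and 180 degrees the glyph orientation alone fixes the progression.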
As for 270 degrees, the net effect with the current definition of
'glyph-orientation' would be similar to a bidi override after glyph
selection. Text is set as for 90 degrees, but then each glyph is rotated
180 degrees more in place -- it's upside down with respect to the inline
progression. This is, in effect, what happened to the Farsi text in
http://fantasai.tripod.com/www-style/2003/directions/flow-diagram2.gif
and it's unreadable.

---------------------------------------------------------------------------------
Script Types:

Scripts can be classified according to their directionality, as they are in
Unicode. Unfortunately, Unicode only defines horizontal directionality even
though vertical and bi-orientational scripts have vertical directionality
as well. For example, while English can go either top to bottom or bottom
to top (since it doesn't have a vertical directionality), Japanese should
only go from top to bottom, even in an 'lr' block progression.

Mongolian also has top-to-bottom vertical directionality. Unlike Japanese,
however, it has no definite horizontal directionality--only a preferred
left-to-right directionality as assigned in Unicode.

Bi-orientational scripts may be further classified by how their glyphs
transform when switching orientations. CJK characters translate; they are
always upright. Other scripts, such as Ogham and some variants of classical
Yi, must be rotated.

So, to summarize, scripts possess the following properties:

  Orientation:
    horizontal (e.g. Latin)
    vertical (e.g. Mongolian)
    bi-orientational (e.g. Han)

  Horizontal directionality:
    left-to-right (e.g. Devanagari)
    right-to-left (e.g. Arabic)
    none (e.g. Mongolian)
      "None" applies to vertical scripts; Unicode does assign a preferred
      direction, though, which should be honored by default.

  Vertical directionality:
    top-to-bottom (e.g. Katakana)
    bottom-to-top (e.g. Ogham)
    none (e.g. Arabic)
      "None" applies to all horizontal scripts.

  Bi-orientational transform:
    rotational (e.g. Ogham)
    translational (e.g. Han)

[See Appendix A for a partial table of scripts]

Characters defined as "Wide" in Unicode are treated as bi-orientational
with translational transformation.

---------------------------------------------------------------------------------
Text-Orientation:

So that I have a clean slate to work with, I am now going to define a new
property, 'text-orientation-vertical', based on the styles described in
http://fantasai.tripod.com/www-style/2003/directions/vertical-bidi.html

An upright glyph (oriented 0 degrees) is defined to have the orientation in
which it appears in the Unicode code charts. For horizontal and
bi-orientational scripts, this is the normal orientation in horizontal text.
For vertical scripts, this is the normal orientation in vertical text.

'text-orientation-vertical' takes the following values:

  '0deg'
    All glyphs are oriented upright and each line of text is laid out from
    top to bottom. There is no BIDI reordering within the element.
    In an 'lr' block progression, all directional characters are ordered
    as right-to-left characters (R) in the BIDI algorithm.
    In an 'rl' block progression, all directional characters are ordered
    as left-to-right characters (L) in the BIDI algorithm.
    In both cases the glyph orientation is 0 degrees, and any available
    vertical glyph variants should be used. Enclosing punctuation should
    thus face inward. If the font does not have vertical variants of such
    punctuation, the user agent may rotate the horizontal glyph.

  '180deg'
    All glyphs are oriented upside down and each line of text is laid out
    from bottom to top. There is no BIDI reordering within the element.
    In an 'lr' block progression, all directional characters are ordered
    as left-to-right characters (L) in the BIDI algorithm.
    In an 'rl' block progression, all directional characters are ordered
    as right-to-left characters (R) in the BIDI algorithm.
    In both cases the glyph orientation is 180 degrees, and any available
    vertical glyph variants should be used. Enclosing punctuation should
    thus face inward. If the font does not have vertical variants of such
    punctuation, the user agent may rotate the horizontal glyph.

  '90deg'
    All glyphs are oriented with their tops toward the before edge of the
    block. BIDI reordering takes place.

  '270deg'
    All glyphs are oriented with their tops toward the after edge of the
    block. BIDI reordering takes place, but the directions are reversed;
    that is, left-to-right characters are treated as right-to-left
    characters and right-to-left characters are treated as left-to-right
    characters.

  'natural'
    - Vertical and translating bi-orientational scripts are handled as
      for '0deg'.
    - Rotating bi-orientational scripts are oriented as either '90deg' or
      '270deg', depending on their vertical directionality.
    - All horizontal scripts are handled as for '90deg'.
    - If the element's dominant script is a vertical or bi-orientational
      script, available vertical glyph variants should be used for
      punctuation. Otherwise, horizontal glyph variants must be used,
      rotated to a '90deg' orientation.

  'left'
    Same as 'natural' except horizontal text in a right-to-left block is
    handled as for '270deg' instead of '90deg'.

  'right'
    Same as 'natural' except horizontal text in a left-to-right block is
    handled as for '270deg' instead of '90deg'.

  'context'
    Vertical and bi-orientational scripts are handled as for 'natural'.
    If the element's dominant script is a vertical script, all punctuation
    is handled as '0deg'.
    The BIDI algorithm is applied; however, reordering does not take place
    within the element. Instead,
    - In a top-to-bottom inline progression, all horizontal script
      characters in an even embedding level are rotated 90 degrees
      clockwise, and all horizontal script characters in an odd embedding
      level are rotated 90 degrees counterclockwise.
    - In a bottom-to-top inline progression, all horizontal script
      characters in an even embedding level are rotated 90 degrees
      /counterclockwise/, and all horizontal script characters in an odd
      embedding level are rotated 90 degrees clockwise.

Any other values are illegal and must be *ignored*, if possible.
http://www.w3.org/TR/REC-CSS2/syndata.html#ignore

---------------------------------------------------------------------------------
Notes on BIDI:

The Unicode Bi-Directional Algorithm is applied to the whole text block,
not to individual pieces of it. 'text-orientation' is defined so that
characters adopt an appropriate direction within the context of the block.
This is different from SVG, where each change in glyph orientation ended
the BIDI algorithm's block. (SVG's approach would fail to handle, for
example, a run of upright Latin in a 90deg-rotated vertical Arabic
sentence.)

The Unicode BIDI algorithm is a carefully designed algorithm for laying out
text of mixed directionality. Since rotating text can effectively change
its directionality, it makes sense to leverage this algorithm for
mixed-orientation layouts as well.

Embeddings:

  p { text-orientation-vertical: context }
  x { direction: rtl; unicode-bidi: embed; }

  character stream: <p>CHINESE <x>ARABIC 1234</x> CHINESE</p>
  tb block text:    <p>LLLLLLL <x>RRRRRR LLLL</x> LLLLLLL</p>
  lr block text:    <p>UUUUUUU <x>RRRRRR LLLL</x> UUUUUUU</p>

In the lr block, all text will flow top to bottom. However, Us will be
upright, Rs will be rotated 90 degrees counterclockwise and Ls will be
rotated 90 degrees clockwise. Everything reads in the right direction,
but... we need to do a 180 to read the Arabic date, which is awkward. It
would be better to have the number rotated the same way as the Arabic, and
just read from bottom to top instead of top to bottom for this little bit.
We have a clue to help us determine when to do this: the embedding.
(The text above would not order correctly even in a tb block unless
"ARABIC 1234" was embedded.) Therefore, we can add the following rule:

 * For elements that have a 'context' text-orientation, a 'unicode-bidi'
 * value of 'embed' will cause BIDI reordering to take effect. Specifically,
 *   If 'direction' is 'ltr' the element will behave as if it had
 *     a 'right' text-orientation.
 *   If 'direction' is 'rtl' the element will behave as if it had
 *     a 'left' text-orientation.
 * The value of 'text-orientation-vertical' will still inherit as
 * 'context'.

  p { text-orientation-vertical: natural }

  character stream: <p>MONGOLIAN "english1 MG english2" MONGOLIAN</p>
  tb block text:    <p>LLLLLLLLL "LLLLLLLL LL LLLLLLLL" LLLLLLLLL</p>
  lr block text:    <p>RRRRRRRRR "LLLLLLLL RR LLLLLLLL" RRRRRRRRR</p>

For this text to display correctly, the quote needs to be embedded. This
cannot be automated, since a similar text could have 'english1' and
'english2' be discrete phrases. Also, this complexity of script mixing is
rare, so it is reasonable to expect that the author will either set
"unicode-bidi: embed" on an element around the English quote or choose
"text-orientation-vertical: context" (which eliminates the problem by
orienting all the text to flow from top to bottom).

  p,x { text-orientation-vertical: 0deg }
  a   { text-orientation-vertical: natural }

  character stream: <p>LATIN1 <a>morelatin1 <x>LATIN</x> morelatin2</a> LATIN2</p>
  tb block text:    <p>LLLLLL <a>LLLLLLLLLL <x>LLLLL</x> LLLLLLLLLL</a> LLLLLL</p>
  lr block text:    <p>RRRRRR <a>LLLLLLLLLL <x>RRRRR</x> LLLLLLLLLL</a> RRRRRR</p>

This presents more of a problem. There are no script differences, so the
need for an embedding is not obvious. Moreover, since the change in
directional behavior is directly caused by a style change, it is possible
for the UA to handle any necessary embeddings. Should it?
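The starred rule for 'context' plus 'unicode-bidi: embed' is simple enough to sketch as code. The Python below is only my reading of that rule; the function name and the split between "specified" and "used" values are assumptions made for illustration.

```python
def used_vertical_orientation(specified: str, unicode_bidi: str, direction: str) -> str:
    """Return the orientation an element behaves as having.

    Only the 'context' + 'embed' case from the rule above is special;
    every other combination is passed through unchanged.
    """
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    if specified == "context" and unicode_bidi == "embed":
        # An embedding re-enables BIDI reordering for this element.
        return "right" if direction == "ltr" else "left"
    return specified


# The <x> element from the CHINESE/ARABIC example: rtl and embedded.
print(used_vertical_orientation("context", "embed", "rtl"))
# Children still inherit the specified value, 'context', not this result.
```

Note that the function returns only the behavior of the element itself. Per the rule, the inherited value stays 'context', so a nested element without an embedding goes back to the no-reordering behavior.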
----------------------------------------------------------------------
Horizontal Text:

Michel noted that I neglected horizontal text in my writeup. This was
intentional; the title, after all, was "BIDI in Vertical Context". :)
I'm not as clear on what happens in horizontal text, particularly wrt
punctuation. But, of course, horizontal text needs to be addressed as well.

Pretty much any script in the current Unicode repertoire comfortably fits
into a horizontal line layout. Chinese, Japanese, Korean, and Yi all switch
between horizontal and vertical lines without rotation. An exception is
Mongolian, which, by the cursive nature of its script, cannot be laid out
horizontally glyph-by-glyph. It has to be rotated, 90 degrees either way.
It often runs from left-to-right, but might sometimes run right-to-left.
(I think in Arabic or Hebrew contexts this would be the preferred choice,
and in some cases maybe even when by itself, if vertical layout were not
available, as this orientation results in a layout simply rotated from the
original instead of rotated /and/ inverted.)

'text-orientation-horizontal' is mostly analogous to
'text-orientation-vertical'. It takes the following values:

  '0deg'
    All glyphs are oriented with their tops toward the block's top.
    BIDI reordering takes effect.

  '180deg'
    All glyphs are oriented with their tops toward the block's bottom.
    BIDI reordering takes effect, but characters' directionalities are
    reversed.

  '90deg'
    All glyphs are oriented with their tops toward the block's right edge.
    Characters are laid out from right to left. There is no reordering
    within the element.

  '270deg'
    All glyphs are oriented with their tops toward the block's left edge.
    Characters are laid out from left to right. There is no reordering
    within the element.

  'natural'
    Horizontal and bi-orientational scripts are handled as for '0deg'.
    Vertical scripts assigned left-to-right directionality are handled
    as for '270deg'.
    Vertical scripts assigned right-to-left directionality are handled
    as for '90deg'.
    If the element's dominant script is a vertical script, available
    vertical glyph variants should be used for punctuation, rotated
    appropriately. Otherwise, horizontal glyph variants must be used,
    kept at '0deg'.

  'left'
    As for 'natural' except vertical scripts are always handled as for
    '270deg'.

  'right'
    As for 'natural' except vertical scripts are always handled as for
    '90deg'.

  'context'
    All horizontal and bi-orientational scripts are handled as for '0deg'.
    Directional characters in scripts without a definite horizontal
    directionality inherit the block's inline progression direction.
    So, if the block's inline progression goes from left to right, then
    vertical scripts are handled as for '270deg'. Otherwise, vertical
    scripts are handled as for '90deg'.
    If the element's dominant script is a vertical script, available
    vertical glyph variants should be used for punctuation, rotated
    appropriately. Otherwise, horizontal glyph variants must be used,
    kept at '0deg'.

---------------------------------------------------------------------------
Notes on text-orientation values:

Other, less common scripts may take different orientations when laid out.
Some variants of classical Yi, for example, are laid out top-to-bottom,
left-to-right and rotate to become right-to-left scripts when placed in
horizontal line layout. If a page uses special fonts to display classical
Yi--or Pictish Ogham, or anything else rare and unusual--it will need to
use explicit text-orientation values. This is why we need some values that
override normal behavior.

It is possible instead to redefine 'left' and 'right' to affect all
characters, not just horizontal scripts in vertical layout and vertical
scripts in horizontal layout.
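For the 'text-orientation-horizontal' values defined above, the only scripts whose glyph orientation varies are the vertical ones (Mongolian, in practice). The Python sketch below restates that part of the definitions; the function and parameter names are invented for this illustration, and punctuation handling is left out.

```python
def vertical_script_orientation(value: str,
                                assigned_direction: str = "ltr",
                                inline_progression: str = "left-to-right") -> str:
    """Glyph orientation of a vertical script placed in a horizontal line.

    assigned_direction is the script's preferred horizontal direction as
    assigned in Unicode; inline_progression is the block's inline
    progression, which only matters for 'context'.
    """
    if value in ("0deg", "90deg", "180deg", "270deg"):
        # Explicit angles apply to every glyph regardless of script.
        return value
    if value == "natural":
        return "270deg" if assigned_direction == "ltr" else "90deg"
    if value == "left":
        return "270deg"
    if value == "right":
        return "90deg"
    if value == "context":
        return "270deg" if inline_progression == "left-to-right" else "90deg"
    raise ValueError("unknown text-orientation-horizontal value: " + value)


# Mongolian (preferred ltr) inside a right-to-left block:
print(vertical_script_orientation("natural", "ltr", "right-to-left"))
print(vertical_script_orientation("context", "ltr", "right-to-left"))
```

The two calls at the end show where 'natural' and 'context' part ways: 'natural' follows the script's assigned direction, while 'context' follows the surrounding block.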
---------------------------------------------------------------------------
Determining the Inline Progression of a Block:

Some of the "automatic" values for text-orientation need to know the
block's inline progression direction. I've already said that this is not
the same as the text's directionality, which is given by the 'direction'
property. So what is it?

The inline progression of a block can be determined from its
block-progression, text-orientation, and direction values. Because
'direction' only specifies horizontal directionality, it may be necessary
to look up directionality based on the block's dominant script (as given
by 'text-script').

Here's the table for the automatic values as applied to horizontal scripts,
and the corresponding paragraph-level direction that will be used in the
BIDI algorithm. (The inline progression for other values should be fairly
obvious.)

block-progression  text-orientation  direction | inline-progression
-----------------  ----------------  --------- | ------------------
top-to-bottom      natural           ltr       | left-to-right (L)
top-to-bottom      natural           rtl       | right-to-left (R)
left-to-right      natural           ltr       | bottom-to-top (L)
left-to-right      natural           rtl       | top-to-bottom (R)
right-to-left      natural           ltr       | top-to-bottom (L)
right-to-left      natural           rtl       | bottom-to-top (R)
top-to-bottom      context           ltr       | left-to-right (L)
top-to-bottom      context           rtl       | right-to-left (R)
left-to-right      context           ltr       | top-to-bottom (R)
left-to-right      context           rtl       | top-to-bottom (R)
right-to-left      context           ltr       | top-to-bottom (L)
right-to-left      context           rtl       | top-to-bottom (L)

'context' is the only style in which the glyph orientation depends on the
inline progression rather than the other way around. Thus it is the only
style which does not intrinsically require a certain inline progression.
You'll notice, however, that it always chooses top-to-bottom for a vertical
flow block. This is because horizontal scripts have a bias for going from
top to bottom.
(It /is/ their secondary direction, after all.)

Since 'direction' only gives the horizontal directionality, it is necessary
to look up the dominant script (as given by 'text-script') to determine the
block's inline progression for 'lr' and 'rl' block progressions.

Most East Asian scripts
  - are bi-orientational (They behave as horizontal scripts in horizontal
    lines and as vertical scripts in vertical lines.)
  - read from top to bottom in vertical lines
  - use a translational transform between orientations rather than a
    rotation
Of the list in UAX 24, these scripts include Han, Hangul, Bopomofo,
Katakana, Hiragana, and Yi. As the dominant script of a vertical flow
block, they give the block a top-to-bottom inline progression for both
'natural' and 'context' text orientations.

Mongolian
  - is a vertical script
  - reads from top to bottom in vertical lines
As with the other top-to-bottom East Asian scripts, 'natural' and 'context'
text orientations with Mongolian result in a top-to-bottom inline
progression.

Ogham
  - is bi-orientational
  - reads from bottom to top in vertical lines
  - uses a *rotational* transform between orientations
As the dominant script of a vertical flow block, it gives the block a
**bottom-to-top** inline progression for both 'natural' and 'context'.

If I am not mistaken, all other scripts in UAX 24 should be classified as
horizontal. (I don't know about Canadian Aboriginal, though.)

------------------------------------------------------------------------------
Note on Vertical Scripts and Font Systems:

On font systems where fonts for vertical scripts are designed for
horizontal layout (i.e. unrotated glyphs are sideways), the actual picture
in the font will be oriented upright or upside-down in horizontal layout
and use the font's horizontal metrics. In vertical layout it will be
oriented sideways. In either case, the *form* of the glyph follows the
rules outlined above--sideways in horizontal layout and upright in vertical
layout.
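Going back to the table in "Determining the Inline Progression of a Block": for horizontal scripts it amounts to a twelve-entry lookup. The Python sketch below copies the rows as given; the dictionary and function names are hypothetical, and values other than 'natural' and 'context' are deliberately not covered.

```python
# (block-progression, text-orientation, direction)
#   -> (inline progression, paragraph-level BIDI direction)
INLINE_PROGRESSION = {
    ("top-to-bottom", "natural", "ltr"): ("left-to-right", "L"),
    ("top-to-bottom", "natural", "rtl"): ("right-to-left", "R"),
    ("left-to-right", "natural", "ltr"): ("bottom-to-top", "L"),
    ("left-to-right", "natural", "rtl"): ("top-to-bottom", "R"),
    ("right-to-left", "natural", "ltr"): ("top-to-bottom", "L"),
    ("right-to-left", "natural", "rtl"): ("bottom-to-top", "R"),
    ("top-to-bottom", "context", "ltr"): ("left-to-right", "L"),
    ("top-to-bottom", "context", "rtl"): ("right-to-left", "R"),
    ("left-to-right", "context", "ltr"): ("top-to-bottom", "R"),
    ("left-to-right", "context", "rtl"): ("top-to-bottom", "R"),
    ("right-to-left", "context", "ltr"): ("top-to-bottom", "L"),
    ("right-to-left", "context", "rtl"): ("top-to-bottom", "L"),
}


def block_inline_progression(block_progression: str,
                             text_orientation: str,
                             direction: str):
    """Look up one row of the table; raises KeyError for anything else."""
    return INLINE_PROGRESSION[(block_progression, text_orientation, direction)]


print(block_inline_progression("left-to-right", "natural", "ltr"))
print(block_inline_progression("left-to-right", "context", "ltr"))
```

In the 'context' rows for vertical blocks, 'direction' drops out entirely: the paragraph-level direction depends only on the block progression, which is what lets the text always run top to bottom.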
------------------------------------------------------------------------------
text-orientation vs. glyph-orientation

We now have a system defined with 'block-progression', 'text-orientation',
and 'direction'. What about 'glyph-orientation'?

Michel suggested that the glyph-orientation properties should take the role
of 'text-orientation' by adding values for 'upright' and 'inline'
("context") and letting 'auto' fill in for "natural". However, the
definitions for text-orientation and glyph-orientation don't quite match.
(There's also the fact that having a property named "glyph-orientation"
reorder content instead of just rotating glyphs is IMO just not intuitive.)

The problem with glyph-orientation is that it does not reverse characters'
directionality when it reverses their glyphs.

The advantage of glyph-orientation is that it does not reverse
directionality when it reverses glyphs. It can be used for decorative
purposes without adversely affecting the character order of the text.

So, it can be used in combination with text-orientation for
  - weird graphical effects
  - decorative tilts
  - displaying non-standardized scripts
  - anything else that requires more precise control over glyph layout

----------------------------------------------------------------------------
text-orientation and glyph-orientation

Some rules for the interaction of glyph-orientation and text-orientation:

  1. If glyph-orientation is 'auto', the glyph's orientation is determined
     by the text-orientation value.
  2. If text-orientation is 'natural', 'glyph-orientation' affects the text
     order as specified in SVG 1.1.
  3. Otherwise, the glyph-orientation value gives the exact orientation of
     all glyphs in the element and has no effect on character order.

-----------------------------------------------------------------------------
Mirroring Scripts
-----------------

Ancient Egyptian hieroglyphs could be written in lines going either from
right to left or left to right.
A distinctive characteristic of the lines was that the glyphs *faced* the
beginning of the line. IIRC, Egyptian is not the only script to behave this
way, and it would be best to handle such scripts by allowing CSS to style
the direction and mirror the glyphs.

  'text-orientation-horizontal: mirror'
    As for 'natural', except directionality is reversed and each glyph is
    mirrored across its vertical central axis.

  'glyph-orientation-horizontal: mirror'
    Each glyph is mirrored across its vertical central axis.

We can also add to 'context'

  - If the inline progression is right to left, mirroring scripts are
    handled as 'mirror'.

-----------------------------------------------------------------------------
'writing-mode'

As I have explained before[1], using 'direction' to control the inline
progression can result in strange text displays. 'writing-mode', as it's
currently defined, also sets 'direction'--which will be a problem if
'writing-mode' comes into common use. Most of the time, the author using
"writing-mode: tb-rl" doesn't want to change the 'direction', just the
block progression.

Therefore, "writing-mode: tb-rl" /shouldn't/ change the 'direction'. Since
it says "top to bottom, right to left", we can have this shorthand expand to

  block-progression: rl;
  text-orientation: context

which will do what the author wants and not affect 'direction'.
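As a sanity check on the shorthand idea, here is a tiny Python sketch of the expansion. Only "tb-rl" is spelled out above, so only that value is included; how the other 'writing-mode' values would expand is left open, and the names here are made up for the example.

```python
# Proposed expansion of the 'writing-mode' shorthand. Only 'tb-rl' is
# described in the text, so no other value is guessed at here.
WRITING_MODE_EXPANSION = {
    "tb-rl": {"block-progression": "rl", "text-orientation": "context"},
}


def expand_writing_mode(value: str) -> dict:
    """Return the longhand declarations for a 'writing-mode' value."""
    if value not in WRITING_MODE_EXPANSION:
        raise ValueError("no expansion sketched for writing-mode: " + value)
    # Return a copy so callers cannot modify the table by accident.
    return dict(WRITING_MODE_EXPANSION[value])


longhands = expand_writing_mode("tb-rl")
print(longhands)
print("direction" in longhands)
```

The key property is that 'direction' never appears among the longhands, so setting 'writing-mode' cannot disturb the text's directionality.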
------------------------------------------------------------------------------
Appendix A: Partial Table of Script Classifications (Informative)

                   Directionality
              Vertical    Horizontal    Transform
-----------------------------------------------------------------------------
Latin         none        ltr           --
Cyrillic      none        ltr           --
Greek         none        ltr           --
Arabic        none        rtl           --
Hebrew        none        rtl           --
Devanagari    none        ltr           --
Tibetan       none        ltr           --
Thai          none        ltr           --
Han           tb          ltr           translate
Hangul        tb          ltr           translate
Hiragana      tb          ltr           translate
Katakana      tb          ltr           translate
Yi            tb          ltr           translate
Hanunoo       none        ltr           --
Mongolian     tb          none (ltr)    --
Ogham         bt          ltr           rotate

HTML version and references provided upon request.

Acknowledgements:
Thanks once again to Martin Heijdra for taking time out of his busy day
to discuss scripts and layout with me. :)

[1] http://lists.w3.org/Archives/Public/www-style/2003Apr/0045.htm

~fantasai
Received on Monday, 7 April 2003 11:02:21 UTC