- From: Stephen Deach <sdeach@adobe.com>
- Date: Wed, 10 Aug 2005 08:24:27 -0700
- To: Tex Texin <tex@xencraft.com>, Ognyan Kulev <ogi@fmi.uni-sofia.bg>
- Cc: Stephen Deach <sdeach@adobe.com>, Addison Phillips <addison.phillips@quest.com>, www-international@w3.org, Richard Ishida <ishida@w3.org>, Bert Bos <bert@w3.org>
The existing directional markers in Unicode are purely to control the bidi algorithm. They do not provide the ability to control the cell order in table layout or the baseline orientation and progression directions for tb-rl vs lr-tb writing modes in Japanese (or similar layout issues for Chinese, Korean, and other scripts that allow vertical or horizontal presentations). You must use the direction property in CSS/HTML (as extended in CSS3) or the reference-orientation and writing-mode properties in XSL-FO to fully control these layouts, or at least to control the vertical vs horizontal choice (though once that choice is known, xml:lang can again be used to INFER appropriate inline-progression and block-progression).

"xml:lang" only IMPLIES a preferred bidi direction; however, for the purposes of bidi processing this may be adequate. (I use xml:lang at the paragraph level to decide on the starting level for the bidi algorithm, rather than looking for the first character with strong directionality. So far, I haven't found a case that fails, and it is more accurate than relying on the first strong-directionality character.)

The cases where one needs bidi direction overrides are generally coincident with the locations of semantic markup. (Otherwise you can't assign the direction/override properties via CSS.) Proper semantic markup requires proper marking of language and proper markup of the nesting of spans. Properly done semantic nesting is also fully consistent with the nesting needs of the bidi algorithm and with the needs for determining when to specify overrides to the bidi algorithm.

(You are correct to point out that one does not push a bidi level for each level of semantic nesting. Assume I have a paragraph which is Arabic, which begins with a quote in English and then continues in Arabic after the English quote. If I mark the paragraph as Arabic and the quote as English, and the rest of the paragraph defaults back to the paragraph setting, I will get the correct results by starting the bidi algorithm on an RTL level and optionally forcing an LRE...PDF on the English quote [the LRE is not actually needed if the English quote begins with a strong LTR character, unless it, in turn, contains a nested Hebrew/Arabic quote]. The bidi algorithm will properly pick up any nested numbers and incidental foreign words, even if there is no markup on them, though one must be careful to mark list item labels, equations, and other "not natural-language" constructs.)

At 2005.08.10-01:11(-0700), Tex Texin wrote:
>I prefer nesting of xml elements to reflect the semantic relationship of
>the elements.
>That is not necessarily the same as the presentation relationships.
>
>Also the relationship between runs is not always to embed (or pop) a
>level. Sometimes there will be sibling relationships, which to maintain
>presentation ordering will need some sequencing attributes. (All in all,
>I think I prefer control codes for all of this. ;-) )
>
>Of course, at some point you can assign a default value to each language
>and use xml:lang, and insist the xml be created in the sequences and
>nesting that cause it to display correctly on accessibility devices, but
>in doing so, we would be writing visually rather than logically to some
>extent, which is counter to design goals.
>
>You might want to look at the bidi algorithm:
>http://www.unicode.org/reports/tr9/
>
>tex
>
>Ognyan Kulev wrote:
> >
> > Tex Texin wrote:
> > > Ogi,
> > > I think the short answer is no. "Direction" has a number of components
> > > to it.
> > > Instead of thinking of an instance of a single text string, consider a
> > > series of text runs, with direction changes.
> >
> > I admit that I'm haven't read much about bidi. I just try to talk with
> > common sense.
> >
> > > When the language changes, does the direction level increase, is it
> > > reduced, or is it a start of a new top-level direction setting? How does
> > > each run relate to the surrounding runs? (In terms of the bidi
> > > algorithm.)
> >
> > XML elements already imply nesting of text runs. Isn't that enough?
> >
> > > Also, what is the layout direction, regardless of the language of each
> > > text run?
> > > For example, as you know in HTML, tables have a direction. Regardless of
> > > the language of each cell's contents, the placement of the cell (or
> > > column actually) is determined by the direction of the table.
> >
> > <table> can use xml:lang too.
> >
> > > Direction for a language can also be ambiguous. Chinese can be written
> > > lr-tb, rl-tb, tb-rl...
> > > (where l,r,t,b are left, right, top, bottom). The front page of some
> > > Chinese newspapers can use all 3 of those directions/layouts.
> >
> > When stylesheet is not used, you already give up on controlling layout
> > and you choose to use some default layout for elements. So isn't enough
> > if one of these directions (lr-tb, ...) is chosen as default for
> > language? I don't know if this is acceptable for Chinese.
> >
> > > So xml:lang might be suggestive, but it is not explicit or informative
> > > enough to base bidi layout upon it alone.
> >
> > My point is: isn't xml:lang enough for producing acceptable layout when
> > there is no stylesheet?
> >
> > Regards,
> > ogi
>
>--
>-------------------------------------------------------------
>Tex Texin cell: +1 781 789 1898 mailto:Tex@XenCraft.com
>Xen Master http://www.i18nGuy.com
>
>XenCraft http://www.XenCraft.com
>Making e-Business Work Around the World
>-------------------------------------------------------------

---Steve Deach
sdeach@adobe.com
Received on Wednesday, 10 August 2005 15:25:33 UTC
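
The message above describes choosing the starting level for the bidi algorithm from the paragraph's xml:lang rather than from the first strong-directionality character. The Python sketch below is only an illustration of that contrast, not the author's implementation. The names `RTL_LANGUAGES`, `base_level_from_lang`, `base_level_from_first_strong`, and `embed_ltr` are hypothetical, the language table is a deliberately tiny assumption, and the first-strong scan is a simplification of the paragraph-level rules in UAX #9.

```python
import unicodedata

# Assumption: a tiny illustrative table of right-to-left primary language
# subtags. A real implementation would need a far more complete mapping.
RTL_LANGUAGES = {"ar", "he", "fa", "ur", "yi"}

LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def base_level_from_lang(lang_tag: str) -> int:
    """Paragraph level implied by an xml:lang value (0 = LTR, 1 = RTL)."""
    primary = lang_tag.split("-")[0].lower()
    return 1 if primary in RTL_LANGUAGES else 0


def base_level_from_first_strong(text: str) -> int:
    """Paragraph level taken from the first strong character (simplified)."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return 0
        if bidi_class in ("R", "AL"):
            return 1
    return 0  # no strong character found: default to LTR


def embed_ltr(text: str) -> str:
    """Wrap a run in LRE...PDF to force a left-to-right embedding."""
    return f"{LRE}{text}{PDF}"


# An Arabic paragraph that begins with an English quote.
ARABIC_WORD = "\u0642\u0627\u0644"  # Arabic letters qaf, alef, lam
paragraph = '"Hello world" ' + ARABIC_WORD

print(base_level_from_first_strong(paragraph))  # level from the leading "H"
print(base_level_from_lang("ar"))               # level from xml:lang="ar"
```

For this paragraph the first strong character is the Latin "H", so the first-strong scan picks an LTR level, while the language tag picks an RTL level, which is the case the message says xml:lang handles more accurately. `embed_ltr` shows the optional LRE...PDF wrapping of the English quote.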