Re: bidi discussion list was: Bidi Markup vs Unicode control characters

From: Stephen Deach <sdeach@adobe.com>
Date: Wed, 10 Aug 2005 08:24:27 -0700
To: Tex Texin <tex@xencraft.com>, Ognyan Kulev <ogi@fmi.uni-sofia.bg>
Cc: Stephen Deach <sdeach@adobe.com>, Addison Phillips <addison.phillips@quest.com>, www-international@w3.org, Richard Ishida <ishida@w3.org>, Bert Bos <bert@w3.org>

   The existing directional markers in Unicode are purely to control the 
bidi algorithm. They do not provide the ability to control the cell-order 
in table layout or the baseline orientation and progression directions for 
tb-rl vs lr-tb writing modes in Japanese (or similar layout issues for 
Chinese, Korean, and other scripts that allow vertical or horizontal 
presentations). You must use the direction property in CSS/HTML (as 
extended in CSS3) or the reference-orientation and writing-mode properties 
in XSL-FO to fully control these layouts or at least to control the 
vertical vs horizontal choice (though once that choice is known, xml:lang 
can again be used to INFER appropriate inline-progression and 

   "xml:lang" only IMPLIES a preferred bidi direction, however for the 
purposes of bidi processing this may be adequate. (I use xml:lang at 
the paragraph level to decide on the starting level for the bidi algorithm, 
rather than looking for the first character with strong-directionality. So 
far, I haven't found a case that fails and it is more accurate than relying 
on the first strong-directionality character.)
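
   As a rough illustration of the two approaches just compared, here is a
minimal Python sketch. It is not the code I actually use: the function
names and the RTL_LANGS set are hypothetical, and the language list is
deliberately short. The first-strong function follows the spirit of rules
P2 and P3 of the Unicode bidi algorithm, using the standard-library
unicodedata module.

```python
import unicodedata

# Assumption: a small, illustrative set of primary language subtags that
# are normally written right-to-left. A real system would need a fuller
# list and would have to honor script subtags (e.g. "az-Arab").
RTL_LANGS = {"ar", "he", "fa", "ur", "yi", "dv", "ps"}


def level_from_lang(xml_lang):
    """Starting bidi level (0 = LTR, 1 = RTL) inferred from xml:lang."""
    primary = xml_lang.split("-")[0].lower()
    return 1 if primary in RTL_LANGS else 0


def level_from_first_strong(text):
    """Starting bidi level taken from the first strong character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return 0
        if bidi_class in ("R", "AL"):
            return 1
    return 0  # no strong character found: default to LTR


# An Arabic paragraph that opens with an English quotation.
paragraph = '"Hello world" \u0642\u0627\u0644'
print(level_from_first_strong(paragraph))  # the Latin "H" decides
print(level_from_lang("ar"))               # the language tag decides
```

   For this paragraph the two functions disagree: the first strong
character is Latin, while the language tag says Arabic. That is the kind
of case where the paragraph-level tag is the better guide.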
   The cases where one needs bidi-direction-overrides are generally 
coincident with the locations of semantic markup. (Otherwise you can't 
assign the direction/override properties via CSS.)
   Proper semantic markup requires proper marking of language and proper 
markup of nesting of spans. Properly done semantic nesting is also fully 
consistent with the nesting needs of the bidi algorithm and the needs for 
determining when to specify overrides to the bidi algorithm. (You are 
correct to point out one does not push a bidi-level for each level of 
semantic nesting. Suppose I have an Arabic paragraph that begins with a 
quote in English and then continues in Arabic after the quote. If I mark 
the paragraph as Arabic and the quote as English, and the rest of the 
paragraph defaults back to the paragraph setting, I will get the correct 
results by starting the bidi algorithm on an RTL level and optionally 
forcing an LRE...PDF on the English quote [the LRE is not 
actually needed if the English quote begins with a strong LTR character 
unless it, in turn, contains a nested Hebrew/Arabic quote]. The bidi 
algorithm will properly pick up any nested numbers and incidental foreign 
words, even if there is no markup on them, though one must be careful to 
mark list item labels, equations, and other "not natural-language" constructs.)
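
   The optional LRE...PDF wrapping of the English quote can be sketched in
Python as follows. This is only an illustration of the rule of thumb given
in the brackets above, not a full implementation of the bidi algorithm.
The names needs_lre and embed_ltr are hypothetical, and treating a quote
with no strong characters as needing the embedding is my own conservative
assumption.

```python
import unicodedata

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def needs_lre(quote):
    """Rule of thumb: the embedding can be skipped only when the quote
    starts with a strong LTR character and holds no RTL characters."""
    strong = [
        cls
        for cls in (unicodedata.bidirectional(ch) for ch in quote)
        if cls in ("L", "R", "AL")
    ]
    if not strong:
        return True  # assumption: nothing anchors the direction, so embed
    starts_ltr = strong[0] == "L"
    has_rtl = any(cls in ("R", "AL") for cls in strong)
    return (not starts_ltr) or has_rtl


def embed_ltr(quote, force=False):
    """Wrap an LTR quote in LRE...PDF when needed, or always if forced."""
    if force or needs_lre(quote):
        return LRE + quote + PDF
    return quote


plain = embed_ltr("Hello world")                  # left unwrapped
nested = embed_ltr("He said \u05e9\u05dc\u05d5\u05dd")  # wrapped
print(plain == "Hello world", nested.startswith(LRE), nested.endswith(PDF))
```

   Forcing the embedding (force=True) is always safe for a span already
marked as English; the check only saves the two control characters when
they would change nothing.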

At 2005.08.10-01:11(-0700), Tex Texin wrote:
>I prefer nesting of xml elements to reflect the semantic relationship of
>the elements.
>That is not necessarily the same as the presentation relationships.
>Also the relationship between runs is not always to embed (or pop) a
>level. Sometimes there will be sibling relationships, which to maintain
>presentation ordering will need some sequencing attributes. (All in all,
>I think I prefer control codes for all of this. ;-) )
>Of course, at some point you can assign a default value to each language
>and use xml:lang, and insist the xml be created in the sequences and
>nesting that cause it to display correctly on accessibility devices, but
>in doing so, we would be writing visually rather than logically to some
>extent, which is counter to design goals.
>You might want to look at the bidi algorithm:
>Ognyan Kulev wrote:
> >
> > Tex Texin wrote:
> > > Ogi,
> > > I think the short answer is no. "Direction" has a number of components
> > > to it.
> > > Instead of thinking of an instance of a single text string, consider a
> > > series of text runs, with direction changes.
> >
> > I admit that I haven't read much about bidi.  I just try to talk with
> > common sense.
> >
> > > When the language changes, does the direction level increase, is it
> > > reduced, or is it a start of a new top-level direction setting? How does
> > > each run relate to the surrounding runs? (In terms of the bidi
> > > algorithm.)
> >
> > XML elements already imply nesting of text runs.  Isn't that enough?
> >
> > > Also, what is the layout direction, regardless of the language of each
> > > text run?
> > > For example, as you know in HTML, tables have a direction. Regardless of
> > > the language of each cell's contents, the placement of the cell (or
> > > column actually) is determined by the direction of the table.
> >
> > <table> can use xml:lang too.
> >
> > > Direction for a language can also be ambiguous. Chinese can be written
> > > lr-tb, rl-tb, tb-rl...
> > > (where l,r,t,b are left, right, top, bottom). The front page of some
> > > Chinese newspapers can use all 3 of those directions/layouts.
> >
> > When a stylesheet is not used, you already give up on controlling layout
> > and you choose to use some default layout for elements.  So isn't it
> > enough if one of these directions (lr-tb, ...) is chosen as the default
> > for a language?  I don't know if this is acceptable for Chinese.
> >
> > > So xml:lang might be suggestive, but it is not explicit or informative
> > > enough to base bidi layout upon it alone.
> >
> > My point is: isn't xml:lang enough for producing acceptable layout when
> > there is no stylesheet?
> >
> > Regards,
> > ogi
>Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
>Xen Master                          http://www.i18nGuy.com
>XenCraft                            http://www.XenCraft.com
>Making e-Business Work Around the World

---Steve Deach
Received on Wednesday, 10 August 2005 15:25:33 UTC