- From: Alan Gresley <alan@css-class.com>
- Date: Thu, 16 Dec 2010 16:01:51 +1100
- To: "Aharon (Vladimir) Lanin" <aharon@google.com>
- CC: W3C style mailing list <www-style@w3.org>, fantasai <fantasai.lists@inkedblade.net>, "public-i18n-bidi@w3.org" <public-i18n-bidi@w3.org>
On 16/12/2010 9:11 AM, Aharon (Vladimir) Lanin wrote:

Adding my 2 cents' worth. I am slowly understanding the concept of bidirectionality. I have trouble since I can only read and write English.

> Currently, the CSS Writing Modes Module Level 3 spec on text
> direction <http://dev.w3.org/csswg/css3-writing-modes/#text-direction>
> states:
>
> "User agents that support bidirectional text must apply the Unicode
> bidirectional algorithm to every sequence of inline boxes uninterrupted by a
> forced (bidi class B) line break or block boundary.

I think this is referring to a class B line break (whatever that is). <br/> seems to come under 3.4 (Reordering Resolved Levels) [1] and what are called paragraph separators.

> This sequence forms the
> "paragraph" unit in the bidirectional algorithm. The paragraph embedding
> level is set according to the value of the ‘direction’ property of the
> containing block rather than by the heuristic given in steps P2 and P3 of
> the Unicode algorithm."
>
> Further down in the same major section, the definition of
> unicode-bidi:plaintext <http://dev.w3.org/csswg/css3-writing-modes/#unicode-bidi>
> states:
>
> "For the purposes of the Unicode bidirectional algorithm, the base
> directionality of each "paragraph" for which the element is the containing
> block element is determined not by the element's computed ‘direction’ as
> usual, but by following rules P1, P2, and P3 of the Unicode bidirectional
> algorithm."

Above I see "which the element." I have no idea what element is being referred to here. This paragraph also seems to suggest an added meaning of a containing block. What is a containing block element?

> I think that these parts of the spec need to be tweaked in several
> respects:
>
> 1. There is no reason to mention rule P1 when describing how
> unicode-bidi:plaintext affects the base directionality of each paragraph.
> P1 deals with how the text is split up into paragraphs, not with the direction
> of each paragraph, and applies to all content, regardless
> of unicode-bidi:plaintext.
>
> 2. I think it would improve clarity to mention the unicode-bidi:plaintext
> exception when first describing how the paragraph embedding level is set
> (first quote above). Thus, the last sentence of the first quote should read:
>
> "The paragraph embedding level is set according to the value of the
> ‘direction’ property of the containing block, unless the containing block
> element has unicode-bidi:plaintext, in which case it is set according to the
> heuristic given in steps P2 and P3 of the Unicode algorithm."
>
> 3. We must probably explicitly define the effect of a paragraph break (i.e.
> a block boundary or bidi class B line break, which in HTML5 includes <br>)
> when the path from the containing block element to the paragraph break
> includes elements with a unicode-bidi value other than "normal". For
> example, what happens when we have (as usual, uppercase English is used
> instead of RTL characters):
>
> <div dir=ltr>
> <span dir=rtl>
> TO BE<br>
> OR NOT TO BE?
> </span>
> -- hamlet, in rtl translation.
> </div>
>
> Should the "OR NOT TO BE?" be displayed in rtl ("?EB OT TON RO") or in ltr
> ("EB OT TON RO?")?

I believe this depends on the value of unicode-bidi. I am somewhat confused myself, since the default behavior in an offline test,

<!DOCTYPE html>
<div dir=ltr>
<span dir=rtl>
TO BE<br>
OR NOT TO BE?
</span>
<div>-- hamlet, in rtl translation.</div>
</div>

in FF 3.6.13 renders as embed where the initial value for unicode-bidi is normal.

unicode-bidi: embed, isolate and plaintext produce this.

?OR NOT TO BE

unicode-bidi: normal produces this.

OR NOT TO BE?

unicode-bidi: bidi-override produces this.

?EB OT TON RO

I have not tested in other browsers, since I do not know whether FF even does it correctly.
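To make the distinction in point 1 concrete, here is a minimal Python sketch of the two separate jobs: P1 splits the text into paragraphs at bidi class B characters, and P2/P3 pick each paragraph's base level from its first strong character. It relies only on the standard unicodedata module. The function names are hypothetical, and the sketch is a simplification rather than a full implementation of the Unicode algorithm: it treats CR and LF as two separate separators and ignores embeddings and overrides entirely.

```python
import unicodedata


def split_paragraphs(text):
    """Simplified P1: split at bidi class B characters.

    Each separator is kept with the paragraph it terminates.
    Assumption: CR LF is not treated as a single separator here.
    """
    paragraphs, current = [], []
    for ch in text:
        current.append(ch)
        if unicodedata.bidirectional(ch) == "B":
            paragraphs.append("".join(current))
            current = []
    if current:
        paragraphs.append("".join(current))
    return paragraphs


def first_strong_level(paragraph):
    """Simplified P2/P3: base level from the first strong character.

    Returns 0 (LTR) if the first strong character is class L,
    1 (RTL) if it is class R or AL, and 0 if there is none.
    """
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return 0
        if bidi_class in ("R", "AL"):
            return 1
    return 0


if __name__ == "__main__":
    # Real Hebrew letters stand in for the uppercase "RTL" text in the thread.
    sample = "\u05d0\u05d1\u05d2 abc\nabc \u05d0\u05d1\u05d2"
    for para in split_paragraphs(sample):
        print(repr(para), first_strong_level(para))
```

With unicode-bidi:plaintext, each paragraph would take its level from something like first_strong_level; otherwise the level comes from the ‘direction’ property and only the splitting step applies.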
> While it seems obvious that it should be displayed in RTL because it is part
> of a <span dir=rtl>, that is not the result if we simply translate the above
> into Unicode bidi formatting characters, i.e.
>
> [RLE]TO BE
> OR NOT TO BE?[PDF] -- hamlet, in rtl translation.

The direction does not affect the embedding algorithm of a particular script. The direction changes where the start and end are for a sequence of inline boxes. The placement of punctuation marks (.,;?!`) and of list markers (with a value of outside) changes with the direction.

> The overall direction of both paragraphs is ltr (P2 and P3 are overridden),
> and since the paragraph break resets all embedding levels, the [PDF] is
> orphaned, and the question mark winds up to the right of "EB OT TON RO".
>
> I believe that the correct approach to take is to treat the second bidi
> paragraph (i.e. "TO BE ... translation.") the same as:
>
> <div dir=ltr>
> <span dir=rtl>
> OR NOT TO BE?
> </span>
> -- hamlet, in rtl translation.
> </div>
>
> In other words, while the paragraph's overall level should be set according
> to the value of the ‘direction’ property of the containing block (ltr), it
> should be opened by repeating the embeddings or overrides introduced by the
> elements between the paragraph break and the containing block - in our
> example, the equivalent of an RLE (which is then matched by the </span>'s
> PDF equivalent).
>
> This is similar to the CSS specs for anonymous block
> boxes <http://www.w3.org/TR/2009/CR-CSS2-20090908/visuren.html#anonymous-block-level>,
> i.e.:
>
> "When an inline box contains a block box, the inline box (and its inline
> ancestors within the same line box) are broken around the block. The line
> boxes before the break and after the break are enclosed in anonymous boxes,
> and the block box becomes a sibling of those anonymous boxes. When such an
> inline box is affected by relative positioning, the relative positioning
> also affects the block box."
>
> "The properties of anonymous boxes are inherited from the enclosing
> non-anonymous box".
>
> Does a line break result in anonymous boxes? If not, we certainly need
> something in the Writing Modes spec. Actually, it would be good to have it
> either way, just to clarify things.
>
> 4. When the path from the containing block element to the paragraph break
> includes an element with unicode-bidi:isolate, there is no reason to go back
> all the way to the containing block element to get the new paragraph's base
> direction and the embeddings to be reconstituted at its start. Instead of
> referring to the containing block element, the spec should be referring to
> the closest unicode-bidi:isolate ancestor or containing block element,
> whichever is closer.
>
> Aharon

I believe the spec needs quite a few illustrations. If an author is given a job where there are runs of LTR and RTL text and they only understand one language, the spec as it stands is not going to help. I also believe that the spec should give examples of words in foreign scripts that can easily be recognized. My use of ᠨᠶᠪᠧᠺᠴᡗ here [2] does not help me. Only with research did I figure out that it runs LTR.

1. <http://www.unicode.org/reports/tr9/#Reordering_Resolved_Levels>
2. <http://css-class.com/test/css/bidi/mongolian-test1-extra.htm>

--
Alan http://css-class.com/

Armies Cannot Stop An Idea Whose Time Has Come. - Victor Hugo
Received on Thursday, 16 December 2010 05:02:29 UTC