- From: Eric Muller <emuller@adobe.com>
- Date: Mon, 19 Mar 2012 16:27:47 -0700
- To: "public-i18n-bidi@w3.org" <public-i18n-bidi@w3.org>
- CC: Stephen Zilles <szilles@adobe.com>
This is a thought experiment, and I am mostly trying to get the big picture. I may have some details wrong; please bear with me.

What has been bothering me from the start, and what motivated my first question, is that <br/> is presented as a paragraph boundary for bidi, but then there is this reopening behavior that somewhat negates the boundary.

Let's try to look at it the other way, i.e. to understand <br/> not as a paragraph boundary, but rather as something that has an effect on bidi, much like LRM/RLM. More precisely: suppose we can introduce a new bidi control character x, which is not a bidi paragraph boundary; what should be the characteristics of x such that <br/> can be defined, for bidi purposes, as having the same effect as x?

If we squint a little and ignore for a second the exact details of the reopening, <br/> is really not a boundary at all for the application of steps X1-X10 (explicit levels) of bidi, because of the reopening. In other words, for the input

<div> ..A.. <br/> ..B.. </div>

whether you apply X1-X10 to "..A.." as one paragraph and then to "..B.." as a separate paragraph *augmented* with all the reopening, or apply X1-X10 to "..A.. ..B.." as a single paragraph, you will get the same answer. X1-X10 work only on the embedding and override characters, and the reopening really reestablishes, at the beginning of processing ..B.., the state of that processing after having processed ..A.. This carries over to ..A.. x ..B.., since x would not be an embedding, an override, or a PDF.

On steps W1-W7 (weak types): as two paragraphs, we really have something of the style

..A.. eor   and   sor ..B..

At least intuitively, we would get the same answer if we processed something like ..A.. x ..B.., where x has bidi class L or R, whichever matches the directionality of the paragraphs (which is the same for both by construction). The same seems to work just as well for N1-N2 (neutral types).
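The X1-X10 claim above can be checked on small inputs with a rough sketch. It is a simplified model of the explicit-level steps, not a conforming implementation: it handles only LRE/RLE/LRO/RLO/PDF, omits the depth limit on embedding levels, and models the reopening as re-emitting the controls still open at the end of ..A.. (which is an assumption about how the reopening would be expressed in characters). The names explicit_levels and split_with_reopening are hypothetical.

```python
# Explicit directional formatting characters.
LRE, RLE, PDF, LRO, RLO = "\u202A", "\u202B", "\u202C", "\u202D", "\u202E"


def explicit_levels(text, base_level=0):
    """Simplified X1-X10: return (resolved, still_open).

    resolved   -- list of (char, level, override) for every non-control char
    still_open -- the embedding/override controls not yet closed by a PDF
    Assumption: the embedding depth limit is ignored for brevity.
    """
    level, override = base_level, None
    stack = []  # saved (level, override, control) for each open control
    resolved = []
    for ch in text:
        if ch in (RLE, RLO, LRE, LRO):
            stack.append((level, override, ch))
            if ch in (RLE, RLO):
                level = (level + 1) | 1    # next greater odd level
            else:
                level = (level + 2) & ~1   # next greater even level
            override = "R" if ch == RLO else "L" if ch == LRO else None
        elif ch == PDF:
            if stack:                      # unmatched PDFs are ignored
                level, override, _ = stack.pop()
        else:
            resolved.append((ch, level, override))
    return resolved, "".join(ctrl for _, _, ctrl in stack)


def split_with_reopening(a, b, base_level=0):
    """Resolve a and b as two paragraphs, reopening a's open controls in b."""
    resolved_a, still_open = explicit_levels(a, base_level)
    resolved_b, _ = explicit_levels(still_open + b, base_level)
    return resolved_a + resolved_b


if __name__ == "__main__":
    a = "ab" + RLE + "CD"
    b = "EF" + PDF + "gh"
    single, _ = explicit_levels(a + b)
    print(split_with_reopening(a, b) == single)
```

In this model the two-paragraph result matches the single-paragraph result because the reopened controls rebuild exactly the stack that was in effect at the end of ..A..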
For steps I1-I2, nothing is going to happen differently to ..A.. or ..B..

Finally, we get to the reordering. But since reordering is done within a line, and <br/> also has the effect of creating a line break, ..A.. and ..B.. will be treated separately anyway; x does not really have an impact.

If we put everything together, it seems that the bidi effect of <br/> is simply that of a mark with the directionality of the paragraph.

---

Rings a bell? The proposed LEVEL DIRECTION MARK of Unicode Public Review Issue 205 [1]?

---

If the reasoning holds, I think there are interesting implications.

First, it would be much easier to describe the bidi effects of <br/> using something like LDM than with the current wording.

Second, there is a certain attraction to being able to define the bidi effects of <br/> in a way that is similar to the definition of unicode-bidi:embed and unicode-bidi:override (i.e. by the transformation to characters).

Third, it could be prudent to change the current definition to include the bidi override and embedding characters in the reopening. One possible scenario is that CSS does not want to wait for the inclusion of LDM in Unicode, but would be compatible with that introduction.

Fourth, there is the synergy between the use cases that motivated LDM and HTML/CSS. Maybe LDM would be good for HTML in general, not just for <br/>.

I think those implications warrant a closer look at the possibility of having something like LDM, and of defining <br/> accordingly.

Eric.

[1] http://www.unicode.org/review/pri205/
Received on Monday, 19 March 2012 23:28:16 UTC