- From: <bugzilla@jessica.w3.org>
- Date: Thu, 05 May 2011 08:10:03 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11829

--- Comment #8 from Aharon Lanin <aharon.lists.lanin@gmail.com> 2011-05-05 08:10:02 UTC ---
(In reply to comment #7)

Thinking through all this again, I now see that the problem is even bigger than I originally realized.

The excerpts quoted in comment 7 are taken from descriptions of the <bdo> element and the <br> element. The comment also refers to specs on the <bdi> element. *None of these* apply to the elements that used to be called block elements in HTML4. *At no point does the current HTML spec define the role of these elements in breaking the text of the document into paragraphs for the purposes of the Unicode Bidirectional Algorithm.* The role of <br> (and newlines in <textarea>) in this matter is now defined (in the sections quoted above), but the role of a <div>, <p>, etc. is not. Paragraph breaks have a crucial effect on bidi text, so the absence of a complete spec for them in HTML is highly problematic.

Nor does the spec say how these elements affect the bidirectional algorithm when their "block" nature is modified by something outside HTML, e.g. a stylesheet.

In HTML4 there is an attempt at such a spec in <http://www.w3.org/TR/html401/struct/dirlang.html#h-8.2>. Here are the most relevant sections:

"The Unicode bidirectional algorithm requires a base text direction for text blocks. To specify the base direction of a block-level element, set the element's dir attribute."

"When an inline element that does not have a dir attribute is transformed to the style of a block-level element by a style sheet, it inherits the dir attribute from its closest parent block element to define the base direction of the block."

"When a block element that does not have a dir attribute is transformed to the style of an inline element by a style sheet, the resulting presentation should be equivalent, in terms of bidirectional formatting, to the formatting obtained by explicitly adding a dir attribute (assigned the inherited value) to the transformed element." This part of the HTML4 spec was far from perfect, but at least it was something. As far as I know, it is completely missing in the current spec, with the single exception of the unicode-bidi:isolate on the what-used-to-be-called-block elements in the default stylesheet. The bidi paragraph role of the blocks is currently only completely spelled out in the CSS spec. Thus, in the CSS3 Writing Modes module, we have: - User agents that support bidirectional text must apply the Unicode bidirectional algorithm to every sequence of inline boxes uninterrupted by a forced (bidi class B) paragraph break or block boundary. This sequence forms the paragraph unit in the bidirectional algorithm. - If [an inline element has unicode-bidi:embed, it] opens an additional level of embedding with respect to the bidirectional algorithm. The direction of this embedding level is given by the ‘direction’ property. Inside the element, reordering is done implicitly. This corresponds to adding a LRE (U+202A), for ‘direction: ltr’, or RLE (U+202B), for ‘direction: rtl’, at the start of the element and a PDF (U+202C) at the end of the element. - If an inline element is broken around a bidi paragraph boundary (e.g. if split by a block or forced paragraph break), then the bidi control codes corresponding to the end of the element are added before the interruption and the codes corresponding to the start of the element are added after it. (In other words, any embedding levels or overrides started by the element are closed at the paragraph break and reopened on the other side of it.) 
I believe that the role of what-used-to-be-called-block elements in determining the bidi paragraphs needs to be spelled out in the HTML5 spec, for the sake of user agents that do not implement CSS, and for the sake of clarity. By this role, I mean:

- Normally, a "block" element's boundaries form paragraph boundaries for the purposes of the bidirectional algorithm. Thus, in <div>A<div>B</div>C</div>, each of A, B, and C is, by default, a separate UBA paragraph. The base direction of these paragraphs is specified by their respective containing elements' directionality.

- When, by some extra-HTML mechanism (e.g. "display:inline" or "display:inline-block" in a stylesheet), a "block" element ceases to act as a block boundary between the content preceding and following it, user agents should treat it for the purposes of the bidirectional algorithm exactly as specified for the <bdi> element, except that its directionality must default to that of its parent.

- When the dir attribute is present on an element that does not act as a block boundary, it opens an additional level of embedding with respect to the bidirectional algorithm. That is, for the purposes of the bidirectional algorithm, the user agent must act as if there were a U+202A LEFT-TO-RIGHT EMBEDDING character at the start of such an element with ltr directionality, a U+202B RIGHT-TO-LEFT EMBEDDING character at the start of such an element with rtl directionality, and a U+202C POP DIRECTIONAL FORMATTING character at the end of such an element.

- When an element that does not act as a block boundary is interrupted by a bidi paragraph boundary (e.g. contains a "block" element or <br>), then the bidi control codes, if any, corresponding to the end of the element are added before the interruption and the codes, if any, corresponding to the start of the element are added after the interruption.
(In other words, any embedding levels or overrides started by the element are closed at the paragraph break and reopened on the other side of it.)
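
To make the proposed behavior concrete, here is a minimal Python sketch of how a user agent without CSS might derive bidi paragraphs from markup. It is an illustration under stated assumptions, not spec text: the Element class, the BLOCK_TAGS set, and the bidi_paragraphs function are hypothetical names invented for this example, the list of "block" elements is deliberately tiny, the default base direction is assumed to be ltr, and the sketch assumes that embeddings opened by an inline ancestor are suspended inside a nested "block" and resumed after it. The <bdi>-like treatment of restyled blocks is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING
CONTROLS = (LRE, RLE, PDF)

# Assumption: a tiny stand-in for the "block" elements, not a full list.
BLOCK_TAGS = {"div", "p"}


@dataclass
class Element:
    tag: str
    dir: Optional[str] = None  # "ltr", "rtl", or None to inherit
    children: List[Union["Element", str]] = field(default_factory=list)


def bidi_paragraphs(root, default_dir="ltr"):
    """Return (base_direction, text) pairs, one per bidi paragraph."""
    paragraphs = []
    open_codes = []  # embeddings opened by inline ancestors
    buf = []         # pieces of the paragraph in progress

    def end_paragraph(base_dir):
        # Emit only if real text was collected, closing open embeddings.
        if any(piece not in CONTROLS for piece in buf):
            paragraphs.append((base_dir, "".join(buf) + PDF * len(open_codes)))
        buf[:] = open_codes  # reopen them on the other side of the break

    def walk(el, parent_dir, base_dir):
        direction = el.dir or parent_dir
        if el.tag == "br":
            end_paragraph(base_dir)
        elif el.tag in BLOCK_TAGS:
            end_paragraph(base_dir)
            # Assumption: embeddings of inline ancestors are suspended
            # inside a nested block and resume after it.
            saved = list(open_codes)
            open_codes.clear()
            buf.clear()
            walk_children(el, direction, direction)
            end_paragraph(direction)
            open_codes.extend(saved)
            buf.extend(saved)
        elif el.dir:
            # Inline element with dir: open an embedding level.
            opener = LRE if direction == "ltr" else RLE
            open_codes.append(opener)
            buf.append(opener)
            walk_children(el, direction, base_dir)
            open_codes.pop()
            buf.append(PDF)
        else:
            walk_children(el, direction, base_dir)

    def walk_children(el, direction, base_dir):
        for child in el.children:
            if isinstance(child, str):
                if child:
                    buf.append(child)
            else:
                walk(child, direction, base_dir)

    walk(root, default_dir, default_dir)
    end_paragraph(default_dir)
    return paragraphs


# <div dir="rtl">A<div dir="ltr">B</div>C</div>
nested = Element("div", "rtl", ["A", Element("div", "ltr", ["B"]), "C"])
print(bidi_paragraphs(nested))  # [('rtl', 'A'), ('ltr', 'B'), ('rtl', 'C')]
```

For <p dir="ltr"><span dir="rtl">x<br>y</span></p>, the same function yields two ltr paragraphs, each wrapping its text in RLE ... PDF, which is the "closed at the break and reopened on the other side" behavior described above.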
Received on Thursday, 5 May 2011 08:10:05 UTC