- From: <bugzilla@jessica.w3.org>
- Date: Tue, 19 Oct 2010 14:34:45 +0000
- To: public-i18n-bidi@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10808

--- Comment #16 from Aharon Lanin <aharon.lists.lanin@gmail.com> 2010-10-19 14:34:43 UTC ---
(In reply to comment #12)
> 1) Why would you ever want to not estimate the direction for each paragraph
> separately?

1. Estimating the direction of each UBA paragraph separately has a price.
2. The use cases are limited to <textarea> and <pre>.

Let's take a specific example:

<div dir=auto>
  some ltr text.
  <div>
    SOME RTL TEXT.
  </div>
  SOME MORE RTL TEXT.
</div>

There are three UBA paragraphs here: the text before the internal div, the
text inside it, and the text after it. You want the first displayed LTR and
the others RTL, and are puzzled why dir=auto is defined to give them all the
same direction (for autodirmethod values other than plaintext).

First, note that if the first and third UBA paragraphs contained mark-up that
used the new CSS capabilities to depend on direction (e.g. text-align:start,
margin-end, :rtl in the selector, etc.), you would want it to depend on the
UBA paragraph's direction. However, the first and third UBA paragraphs are not
separate elements. They therefore must have the same CSS direction value.
Thus, per-UBA-paragraph direction faces an unenviable choice: either divorce
the direction-dependent CSS from the CSS direction and tie it instead to the
inaccessible UBA paragraph direction, or have that CSS work inappropriately.
This choice is the price that I do not want to pay.

Now, the use cases. It is indeed possible to have multi-paragraph plain text
that can only be rendered well by assigning each of its UBA paragraphs its own
direction (as explicitly suggested by the UBA). However, such plain text is
limited to <textarea> and <pre> elements.
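
To make the example above concrete, here is a minimal Python sketch that
contrasts one direction for the whole element with one direction per UBA
paragraph. It assumes the first strong character decides a paragraph's
direction, as in UBA rules P2 and P3. The function name, the "ltr" fallback,
and the Hebrew strings standing in for the upper-case RTL text are
illustrative assumptions, not part of the proposal.

```python
import unicodedata


def first_strong(text, default="ltr"):
    """Return the direction implied by the first strong character in text."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    # Assumption: fall back to LTR when the text has no strong character.
    return default


# The three UBA paragraphs of the example; Hebrew stands in for RTL text.
paragraphs = [
    "some ltr text.",
    "\u05d8\u05e7\u05e1\u05d8.",
    "\u05e2\u05d5\u05d3 \u05d8\u05e7\u05e1\u05d8.",
]

# One direction for the whole element versus one per UBA paragraph.
whole_element = first_strong(" ".join(paragraphs))
per_paragraph = [first_strong(p) for p in paragraphs]

print(whole_element)   # ltr
print(per_paragraph)   # ['ltr', 'rtl', 'rtl']
```

The per-paragraph result is what renders plain text well, but only the
whole-element result corresponds to a single CSS direction value.
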
<textarea> does not allow mark-up at all, so the problem described above does
not apply to it; <pre> is allowed to contain some mark-up, but being
pre-formatted, it is not expected to contain the layout-modifying mark-up of
the sort that bothers us. This is the use case for autodirmethod=plaintext,
which does per-paragraph estimation as you want, but is not expected to handle
direction-dependent CSS within it well.

On the other hand, I do not see a use case for the dir=auto in the example
above to automatically apply independently to the internal div. If the author
wants auto-estimation on the internal div, the author can put dir=auto on the
internal div. For example, if you are embedding in your page a piece of
complicated HTML that you did not author, and you do not know the direction in
which this piece of HTML is supposed to be displayed, put a <div dir=auto>
around that piece of HTML. If inside it there are smaller pieces that have a
different direction, it was the job of the HTML's original author to indicate
this within the HTML, e.g. with dir=auto elements around those smaller pieces.

> 2) Does it really make sense to expose the first-strong vs. any-rtl distinction
> to authors? Why not just pick whichever one seems better for the platform?

The reason they exist is not to make it easier for the platform, but because
different approaches give better results for different kinds of content.
First-strong has a serious flaw: RTL text very often contains LTR words and
phrases (e.g. acronyms and brand names) and even fairly often starts with
them, e.g. "html IS A WONDERFUL PLATFORM". I therefore tend to prefer any-rtl
for most cases. However, in an input box, first-strong does have the advantage
of being easier for the user to surmise and control. Thus, I would say, if you
have content you are obtaining via an input box, use first-strong (both on the
input box and the elements that are then used to display those values).
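
The difference between the two methods can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the function
names and the "ltr" fallback are hypothetical, the scan limit of 100
characters is the one mentioned later in this message (counted here as Python
code points), and Hebrew letters stand in for the upper-case RTL text of the
"html IS A WONDERFUL PLATFORM" example.

```python
import unicodedata

STRONG_RTL = ("R", "AL")  # Unicode bidi classes for strong RTL characters


def first_strong(text, default="ltr"):
    """Direction of the first strong (L, R or AL) character in text."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in STRONG_RTL:
            return "rtl"
    # Assumption: fall back to LTR when there is no strong character.
    return default


def any_rtl(text, limit=100):
    """RTL if any of the first `limit` characters is strongly RTL."""
    for ch in text[:limit]:
        if unicodedata.bidirectional(ch) in STRONG_RTL:
            return "rtl"
    return "ltr"


# RTL sentence that starts with an LTR acronym.
sample = "html \u05d4\u05d9\u05d0 \u05e4\u05dc\u05d8\u05e4\u05d5\u05e8\u05de\u05d4"

print(first_strong(sample))  # ltr (misled by the leading "html")
print(any_rtl(sample))       # rtl
```
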
But if you are displaying text of unknown origin, any-rtl is a better bet.

> In
> particular, paragraphs are of unbounded length, and the browser might not have
> access to the full paragraph before it starts rendering (since it might have
> only received part of the page).
>
> any-rtl would force browsers to scan the whole paragraph before rendering,
> which is bad. Or force them to flip directionality as the page is loading/as
> the user types, which is worse.

Which is why we are limiting any-rtl to scanning the first 100 characters of
the element's content. Flips are still possible, but unlikely. BTW, flips are
also still possible but unlikely for first-strong, since the element could
start with an arbitrary amount of neutral content.

> So first-strong is preferable. Ideally we'd
> look beyond the first character, e.g., checking if the first 100 characters are
> at least 30% RTL, but that doesn't work well when the user is typing the
> content on the fly, since then direction will switch as he types.

Better estimation algorithms can and will be invented. The reason we are
currently only dealing with first-strong, any-rtl, and plaintext is that they
are well-known, tried, and easily defined and implemented. If and when a much
better algorithm is invented and proven, we want to be able to support it.
That does not mean that existing content that was created with and works for
an older estimation method should be potentially broken by applying the new
estimation algorithm to it without being asked to do so. This is exactly why
we have autodirmethod. We can extend the repertory of its values without
making them the default for existing content.

> I think that when this behavior is defined, we should evaluate where to
> activate it by default. IMO, it would be a big win if this were enabled by
> default on all textareas and inputs, at least. I wonder if it would really
> break anything much if it were the default on all elements. Probably, but
> maybe worth trying . . .

I tend to agree, but not everyone does. A discussion worth having, although it
would have been better if it had already taken place in public-i18n-bidi
before the bugs were filed on HTML5.

--
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
You reported the bug.
Received on Tuesday, 19 October 2010 14:34:47 UTC