- From: <bugzilla@jessica.w3.org>
- Date: Thu, 04 Nov 2010 21:19:50 +0000
- To: public-i18n-bidi@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10809

--- Comment #39 from Aharon Lanin <aharon.lists.lanin@gmail.com> 2010-11-04 21:19:40 UTC ---
(In reply to comment #38)
> (In reply to comment #31)
> > 1. It is very easy for LRE, RLE, LRO, RLO, and PDF [...]
> > to get out of balance [...].
>
> We should definitely make them affect the validity if it's a concern that
> people will use them incorrectly and could benefit from validator tools
> flagging these problems. Please file a bug suggesting this if you think it
> would help.

Will do.

> Anyway, I can see the appeal (in terms of simplicity) of out-of-band direction
> indication. I'll look into the feasibility of just having a boolean attribute
> on <input> and <textarea> that results in a separate field in the submission.

Thank you. If that is the bottom line, you can ignore my answers below to the
stuff that preceded this.

> > 2. Similarly, these characters, while being perfectly balanced on their own,
> > can very, very easily become "entangled" between the scopes of the document's
> > tags. For example, what exactly is the browser to make of <span dir=rtl> ...
> > [LRE] ... </span> ... [PDF]?
>
> What should happen is defined by CSS, which defines all of the bidi formatting
> rules in terms of bidi formatting characters.

If so, the text between the </span> and the PDF will come out RTL, since the
</span> is equivalent to a PDF, which would be interpreted by the UBA to match
the LRE, thus closing it, and reverting to the RTL direction defined by the
<span dir=rtl>. How much sense does that make - the <span dir=rtl> was supposed
to end with the </span>, and the bidi formatting character was LRE, not RLE!

If one had equivalently entangled end tags of elements, e.g. <i>A<b>B</i>C</b>,
most browsers would attempt to display it the way the user intended it - with
the C bold, not italic.

I am not saying your interpretation of what should happen is bad, only that
there is no good interpretation of this mess.

> > 3. Speaking of CSS, how exactly should the formatting characters - if
> > encouraged - interact with the direction-dependent CSS, e.g. text-align:start?
> > For example, consider:
> >
> > [RLE]<div style="text-align:start">blah blah</div>[PDF]
> >
> > Should the direction CSS property be rtl for the div? Should it be aligned to
> > the right? What if the [RLE] and [PDF] were inside the div?
>
> The meaning of 'start' is entirely based on the 'direction' property and
> nothing else. This is all defined in the CSS spec.

I know. The point is that the formatting characters will not have any effect on
the CSS - and that effect is vital if you want things to work well. I am just
trying to demonstrate why in HTML you need to use mark-up (dir=), not the bidi
formatting characters.

> > However, the fact remains that in many cases,
> > opposite-direction text gathered from the user is best displayed aligned to its
> > start edge. So, to get that, I will still need to make the *div* say dir=rtl,
> > and not leave it up to the text inside the div.
>
> Why not just use dir=auto? If the first character is a bidi formatting
> character, that'll work as intended, no?

1. Deciding it is RTL simply because the first character is RLE is definitely
wrong: consider "[RLE]JOE[PDF] likes to eat." It is an English sentence, LTR,
not RTL. In RTL, it would be displayed as ".likes to eat EOJ" instead of the
correct "EOJ likes to eat."

2. Unfortunately, the standard UBA paragraph-direction heuristic (first-strong)
ignores formatting characters. We would have to tweak it a little to make it
support them (e.g. ignore the stuff inside them too, except for the case when
the whole string is wrapped in them, in which case return the direction they
indicate).
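To make the tweak in point 2 concrete, here is a minimal Python sketch of one
way it could work. It is only an illustration under my own assumptions, not
anything the UBA or the HTML spec defines: the function names
(wrapped_direction, estimate_direction) are hypothetical, overrides (LRO/RLO)
are treated like embeddings, and an unterminated embedding simply hides the
rest of the string.

```python
import unicodedata

# Bidi formatting characters (explicit embeddings and overrides).
LRE, RLE, PDF = "\u202a", "\u202b", "\u202c"
LRO, RLO = "\u202d", "\u202e"
OPENERS = {LRE: "ltr", RLE: "rtl", LRO: "ltr", RLO: "rtl"}


def wrapped_direction(text):
    """Return 'ltr' or 'rtl' if ONE balanced pair encloses the whole string.

    Returns None otherwise. Note that this has to scan the entire string:
    looking at the first and last characters alone is not enough.
    """
    if len(text) < 2 or text[0] not in OPENERS or text[-1] != PDF:
        return None
    depth = 0
    last = len(text) - 1
    for i, ch in enumerate(text):
        if ch in OPENERS:
            depth += 1
        elif ch == PDF:
            depth -= 1
            if depth == 0 and i != last:
                # The opening character was closed before the end,
                # so it does not wrap the whole string.
                return None
    return OPENERS[text[0]] if depth == 0 else None


def estimate_direction(text):
    """First-strong estimate that skips text inside embeddings (assumed tweak)."""
    wrapped = wrapped_direction(text)
    if wrapped:
        return wrapped
    depth = 0
    for ch in text:
        if ch in OPENERS:
            depth += 1
        elif ch == PDF:
            depth = max(depth - 1, 0)  # tolerate a stray PDF
        elif depth == 0:
            bidi_class = unicodedata.bidirectional(ch)
            if bidi_class == "L":
                return "ltr"
            if bidi_class in ("R", "AL"):
                return "rtl"
    return None  # no strong character found outside embeddings


if __name__ == "__main__":
    joe = "\u05d2\u05d5"  # two Hebrew letters standing in for "JOE"
    print(estimate_direction(RLE + joe + PDF + " likes to eat."))
    print(estimate_direction(RLE + joe + " blah " + joe + PDF))
```

With this approach, "[RLE]JOE[PDF] likes to eat." is estimated as LTR because
the embedded name is skipped, while "[RLE]BLAH blah BLAH[PDF]" is estimated as
RTL because a single pair wraps the whole string.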
> > And in order to do that after
> > the browser has stuck the formatting characters into text (because the user
> > entering it indicated its direction), the server side of my app will need to
> > parse the text in order to figure out that indeed it is wrapped in formatting
> > characters. And when I say "parse", I really mean parse: while the formatting
> > characters in "[RLE]BLAH blah BLAH[PDF]" might (!) have been inserted by the
> > mechanism you are proposing, the formatting characters in "[RLE]BLAH[PDF] blah
> > [RLE]BLAH[PDF]" definitely were not, and to understand that, the app will need
> > to scan right through the whole string.
>
> How would a user ever end up submitting text in this latter state?

By pasting from some HTML page that uses bidi formatting characters :-)

> Why would you not use dir=rtl in this case anyway?

Let me make the example clearer with real text instead of blahs:

[RLE]JOE[PDF] intends to call [RLE]SUSAN[PDF]

This is an English sentence that happens to use some names in an RTL script. It
is thus LTR. It needs to be displayed as

EOJ intends to call NASUS

which will only happen if it is displayed LTR. In RTL, it will be displayed as

NASUS intends to call EOJ

which actually reverses the meaning.

> > As I said, I have no intention of using submitdir to support the use case of
> > the user indicating the direction of individual paragraphs inside a textarea
> > (as opposed to indicating the direction of all the paragraphs in a textarea at
> > once).
>
> That seems a bit limited, but if it's really not something people want to do,
> fair enough.

I didn't say that people don't want it. I am saying that no one has figured out
a way to give it to them, even in a full-featured plain text editor.

--
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
You reported the bug.
Received on Thursday, 4 November 2010 21:19:52 UTC