- From: Aharon (Vladimir) Lanin <aharon@google.com>
- Date: Thu, 15 Sep 2016 20:29:47 +0300
- To: r12a <ishida@w3.org>
- Cc: Simon Montagu <smontagu@smontagu.org>, Martin J. Dürst <duerst@it.aoyama.ac.jp>, "public-i18n-bidi@w3.org" <public-i18n-bidi@w3.org>, Roozbeh Pournader <roozbeh@google.com>, Shervin Afshar <shervinafshar@gmail.com>, Mostafa Hajizadeh <mostafa@daftar.cc>
- Message-ID: <CA+FsOYaEVRXxgusyK2MHsTh_B5ZMwVpVf-_m4azUiTxLAQrCZw@mail.gmail.com>
I do not think that there is a problem with the directionality algorithm. Aside from alignment, it makes no difference whether that isolate paragraph is displayed as LTR overall or RTL overall. And for much more important and common cases of all-neutral text - phone numbers and, to a smaller extent, dates, times, and signed numbers - its being LTR by default is extremely important.

I *do* think that there is a problem with the definition of start alignment, which is not controlled by Unicode, but by various other specs, such as CSS. The problem affects the alignment of all-neutral paragraphs generally, like the examples cited above, not just the corner case of an isolate.

Start alignment is generally defined as "left" for an LTR paragraph, and "right" for an RTL paragraph, and this means that an all-neutral paragraph, which is LTR by the UBA, is left-aligned - even if the alignment outside that paragraph is right. This makes the all-neutral paragraph needlessly different from its surroundings and usually looking pretty bad.

The solution is to make start alignment match the alignment outside the paragraph if the paragraph's directionality is determined from its content and its content is all-neutral.

In the context of CSS, the relevant spec is that of the start and end edges of a line box whose containing block has ‘unicode-bidi: plaintext’ (https://www.w3.org/TR/css-text-3/#bidi-linebox). It already makes an exception for an empty line box, which "takes its inline base direction from the preceding line box (if any), or, if this is the first line box in the containing block, then from the ‘direction’ property of the containing block." I think that this exception should be broadened to an all-neutral line box.

On Thu, Sep 15, 2016 at 1:15 PM, r12a <ishida@w3.org> wrote:

> On 15/09/2016 10:49, Simon Montagu wrote:
>
>> On 15/09/16 07:51, r12a wrote:
>>
>>> On 15/09/2016 05:44, Martin J. Dürst wrote:
>>>
>>>> This is a very high level, speculative comment, but I'll make it anyway:
>>>>
>>>> You sound as if the isolates are too isolated. My understanding is that
>>>> we introduced the isolates because the embeddings were not independent
>>>> (isolated) enough and interacted with their surroundings too much.
>>>>
>>>> Did we overdo (if maybe even just so slightly) the isolation when we
>>>> created isolates? Or would we (at least in theory) need a third kind of
>>>> range, somewhere in between isolates and embeddings in independency?
>>>>
>>>
>>> i don't think the level of isolation is the problem, i think it's more
>>> to do with an isolated range being treated as a neutral character
>>> (whereas a non-isolated embedded range (eg. RLE) is treated as a strong
>>> character).
>>>
>>> ri
>>>
>>
>> That sounds to me like the same issue: as soon as an embedded sequence
>> is treated as a strong character, it stops being isolated: for example
>> it can affect the resolved level of an adjacent numeral. IIUARC this was
>> one of the chief reasons, if not THE reason, for treating isolated
>> sequences as neutral characters in their containers.
>>
>
> i agree that it's probably an inseparable issue. The question is how to
> ascertain that a string like "RLI فعالیت بینالمللیسازی، PDI", which i
> think should be regarded by default as a RTL string can be perceived as
> such - especially if those controls have been added by something else along
> the way, such as an application that wraps strings, and which therefore
> removes the previously existing clues.
>
> Asmus, i hear what you're saying about higher level protocols, but i can't
> help thinking that those protocols would need to be adopted by just about
> any application that deals with strings of this kind - which makes me think
> that perhaps there should be a standard mechanism described by the UBA (?).
>
> ri
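
To make the start-alignment proposal at the top of this message concrete, here is a minimal Python sketch. It is not taken from any spec or implementation. It assumes a simplified first-strong check in the spirit of UBA rules P2 and P3 (strong characters inside an isolate are skipped), and the names `first_strong_direction` and `start_edge_for_line` are hypothetical. It resolves only the start edge used for alignment; the ordering of the line's characters is left to the UBA, so an all-neutral line such as a phone number would still be laid out LTR.

```python
import unicodedata

# Bidi classes of the isolate initiators (LRI, RLI, FSI), as reported by unicodedata.
ISOLATE_INITIATORS = {"LRI", "RLI", "FSI"}


def first_strong_direction(text):
    """Return 'ltr' or 'rtl' from the first strong character that is not
    inside an isolate, or None if the text has no such character.
    Simplified sketch of the first-strong heuristic, not a full UBA."""
    depth = 0  # how many isolates we are currently inside
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ISOLATE_INITIATORS:
            depth += 1
        elif bidi_class == "PDI":
            if depth > 0:
                depth -= 1
        elif depth == 0:
            if bidi_class == "L":
                return "ltr"
            if bidi_class in ("R", "AL"):
                return "rtl"
    return None  # empty or all-neutral at the top level


def start_edge_for_line(text, previous_direction, container_direction):
    """Pick the start edge ('left' or 'right') for one plaintext line.
    Assumption: an all-neutral line is treated like an empty one, so it
    inherits from the preceding line (if any), else from the container."""
    direction = first_strong_direction(text)
    if direction is None:
        if previous_direction is not None:
            direction = previous_direction
        else:
            direction = container_direction
    return "left" if direction == "ltr" else "right"
```

Under these assumptions, `start_edge_for_line("+1 (555) 123-4567", None, "rtl")` yields `"right"`, since the phone number has no strong characters and inherits from the container. A string that consists solely of an RLI ... PDI isolate is also all-neutral at the top level, so it too would align with its surroundings rather than defaulting to the left.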
Received on Thursday, 15 September 2016 17:30:38 UTC