- From: Asmus Freytag (c) <asmusf@ix.netcom.com>
- Date: Wed, 14 Sep 2016 18:13:40 -0700
- To: r12a <ishida@w3.org>, public-i18n-bidi@w3.org
- Cc: Roozbeh Pournader <roozbeh@google.com>, "Aharon (Vladimir) Lanin" <aharon@google.com>, Shervin Afshar <shervinafshar@gmail.com>, Mostafa Hajizadeh <mostafa@daftar.cc>
- Message-ID: <520b2b11-e464-6d20-e243-44a32cf288da@ix.netcom.com>
The way I read this is that the rendering of the visible text (for a bare RLI/PDI without surrounding text) works as intended, whether the UBA resolves the outer (empty) paragraph as LTR or RTL.

I think that this is a limitation of the UBA. It is concerned with ordering the characters on a line, but not with laying out paragraphs (or pages). So when it comes to CSS (or any other protocol) using the data to make decisions on paragraph or page layout, that protocol may need to augment its rules to go beyond the UBA.

Note that UAX#9 states in HL1: "Override P3 <http://unicode.org/reports/tr9/#P3>, and set the paragraph embedding level.... A higher-level protocol may use an entirely different algorithm that heuristically auto-detects the paragraph embedding level based on the paragraph text and its context."

So, a conformant higher-level protocol could be designed to detect the case discussed here and decide to base the implementation of paragraph layout (alignment) and page layout on the type of isolate or even its contents.

The problem with simply adding an RLM is that now you suddenly have an issue when you want to concatenate two strings (perhaps a bare LTR and a bare RTL isolate).

Taking the latter case, an (otherwise empty) paragraph containing isolates of either kind, say one LTR and one RTL: if the RLI was routinely "augmented" by prefixing it with an RLM, but the LRI was not (on the grounds that P3 would resolve the paragraph to LTR anyway), then combining the two would *always* result in an RTL paragraph, whichever isolate came first. If both types were augmented (an LRM added before an LRI), the first one in sequence would rule.

So, you might as well go ahead and not augment these, but stipulate that the higher-level protocol you care about use a heuristic for treating paragraphs consisting only of isolates.
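To make this concrete, here is a rough Python sketch. It is illustrative only, not a conformant UBA implementation: `first_strong_direction` is a simplified reading of P2/P3, and `isolate_aware_direction` is a hypothetical higher-level heuristic of the kind HL1 permits. Both function names are made up for this example, and the handling of FSI (falling back to the LTR default) is an arbitrary simplification.

```python
import unicodedata

# Directional formatting characters (code points from UAX #9).
LRI, RLI, FSI, PDI = "\u2066", "\u2067", "\u2068", "\u2069"
LRM, RLM = "\u200e", "\u200f"
ISOLATE_INITIATORS = {LRI, RLI, FSI}


def _scan(text):
    """Return (direction of first strong char outside isolates or None,
    first top-level isolate initiator or None)."""
    depth = 0
    first_isolate = None
    for ch in text:
        if ch in ISOLATE_INITIATORS:
            if depth == 0 and first_isolate is None:
                first_isolate = ch
            depth += 1
        elif ch == PDI:
            depth = max(depth - 1, 0)  # an unmatched PDI is simply ignored
        elif depth == 0:
            bidi = unicodedata.bidirectional(ch)
            if bidi == "L":
                return "ltr", first_isolate
            if bidi in ("R", "AL"):
                return "rtl", first_isolate
    return None, first_isolate


def first_strong_direction(text):
    """Simplified P2/P3: skip isolate contents, default to LTR."""
    strong, _ = _scan(text)
    return strong or "ltr"


def isolate_aware_direction(text):
    """Hypothetical heuristic (assumption, not in UAX #9): when no strong
    character exists outside isolates, use the first top-level isolate."""
    strong, first_isolate = _scan(text)
    if strong is not None:
        return strong
    if first_isolate == RLI:
        return "rtl"
    return "ltr"  # LRI, FSI (simplification), or no isolate at all


rtl_iso = RLI + "فعالیت" + PDI
ltr_iso = LRI + "activity" + PDI

print(first_strong_direction(rtl_iso))                    # plain P2/P3
print(isolate_aware_direction(rtl_iso))                   # heuristic
print(first_strong_direction(ltr_iso + RLM + rtl_iso))    # only RLI augmented
print(first_strong_direction(LRM + ltr_iso + RLM + rtl_iso))  # both augmented
```

With plain P2/P3 the bare RLI string resolves to LTR, while the heuristic yields RTL without any added mark. The last two calls illustrate the concatenation issue: augmenting only RLI forces RTL even when the LTR isolate comes first, and augmenting both just lets the first mark win.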
A./

On 9/14/2016 10:54 AM, r12a wrote:
> [moving this discussion to the list, with Roozbeh's agreement, and
> reordering the previous posts so that all is chronologically oldest to
> newest]
>
>
> > On Mon, Sep 12, 2016 at 11:17 AM, r12a <ishida@w3.org> wrote:
> >
> > hi Roozbeh,
> >
> > i have a bidi question for you, if you don't mind.
> >
> > the UBA says that the paragraph direction can be determined by
> > looking for the first strong directional character, ignoring
> > sequences of characters surrounded by isolating controls, and
> > defaulting to LTR in the absence of any strong RTL character.
> >
> > the alignment of a string when displayed tends to be derived from
> > the paragraph direction, if i understand correctly.
> >
> > so, what happens for a string such as
> >
> > "RLI فعالیت بینالمللیسازی، PDI"
> >
> > which you'd expect to be displayed from the right side of the
> > window, but for which no strong character would be detected by the
> > algorithm? Is there something i'm missing that would look into the
> > next level down if no strong character were detected in the highest
> > level?
> >
> > i expect that this would affect lots of strings passed around by
> > scripts.
> >
> > cheers,
> > ri
> >
>
> On 13/09/2016 18:53, Roozbeh Pournader wrote:
> > Hi Richard,
> >
> > Well, since there's no strong character visible to P2, that paragraph
> > will be resolved to LTR according to P3.
> >
> > This is intentional, as the isolates are supposed to be exactly that,
> > isolate the inside from the outside and the outside from the inside.
> >
> > If a script wants to make sure the string is displayed RTL, it should
> > add an RLM at the beginning.
> >
>
> New contribution:
>
> Interesting. This question arose while i was trying to clarify how
> best to handle paragraph base direction for strings such as would be
> encountered in JSON. See
> http://w3c.github.io/i18n-discuss/notes/json-bidi.html
>
> we're trying to find ways to keep the paragraph direction associated
> with the string, so that the string is correctly displayed when used
> in a web page or such.
>
> you'll see that we look at the possibility of wrapping strings in
> RLI..PDI as one possible approach, but this means that actually the
> consumer of the string would not know, in such a case, that the string
> is RTL.
>
> If it keeps the control characters, it would display the contained
> text correctly, but it would presumably be difficult to choose an
> appropriate dir value if the preference was to use markup for
> direction in the destination. Also, if the direction of the string is
> used to determine the alignment on the page (left or right), as i
> think is the case for CSS, then the rendering application would not
> get the right cue.
>
> ri
Received on Thursday, 15 September 2016 01:14:00 UTC