First strong on strings surrounded by isolate controls

[moving this discussion to the list, with Roozbeh's agreement, and
reordering the previous posts so that everything runs chronologically,
oldest to newest]


 > On Mon, Sep 12, 2016 at 11:17 AM, r12a <ishida@w3.org> wrote:
 >
 >     hi Roozbeh,
 >
 >     i have a bidi question for you, if you don't mind.
 >
 >     the UBA says that the paragraph direction can be determined by
 >     looking for the first strong directional character, ignoring
 >     sequences of characters surrounded by isolating controls, and
 >     defaulting to LTR in the absence of any strong RTL character.
 >
 >     the alignment of a string when displayed tends to be derived from
 >     the paragraph direction, if i understand correctly.
 >
 >     so, what happens for a string such as
 >
 >     "RLI فعالیت بین‌المللی‌سازی، PDI"
 >
 >     which you'd expect to be displayed from the right side of the
 >     window, but for which no strong character would be detected by the
 >     algorithm? Is there something i'm missing that would look into the
 >     next level down if no strong character were detected in the highest
 >     level?
 >
 >     i expect that this would affect lots of strings passed around by
 >     scripts.
 >
 >     cheers,
 >     ri



On 13/09/2016 18:53, Roozbeh Pournader wrote:
 > Hi Richard,
 >
 > Well, since there's no strong character visible to P2, that paragraph
 > will be resolved to LTR according to P3.
 >
 > This is intentional, as the isolates are supposed to be exactly that,
 > isolate the inside from the outside and the outside from the inside.
 >
 > If a script wants to make sure the string is displayed RTL, it should
 > add an RLM at the beginning.


New contribution:

Interesting.  This question arose while i was trying to clarify how best
to handle paragraph base direction for strings such as those
encountered in JSON.  See
http://w3c.github.io/i18n-discuss/notes/json-bidi.html

we're trying to find ways to keep the paragraph direction associated 
with the string, so that the string is correctly displayed when used in 
a web page or such.

you'll see that we look at the possibility of wrapping strings in
RLI..PDI as one possible approach, but in that case a consumer that
applies the first-strong rules (P2 and P3) would not actually detect
that the string is RTL.
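
To make the behaviour concrete, here is a minimal sketch of the
first-strong detection in rules P2 and P3, as i understand them.  It
assumes the input is a single paragraph, and the function name
paragraph_direction is just something made up for illustration, not
part of any library.

```python
import unicodedata

# Isolate controls and RLM, written as escapes so they stay visible.
LRI, RLI, FSI, PDI = "\u2066", "\u2067", "\u2068", "\u2069"
RLM = "\u200f"

def paragraph_direction(text):
    """Hypothetical helper: approximate P2/P3 for a single paragraph."""
    depth = 0  # number of isolates currently open
    for ch in text:
        if ch in (LRI, RLI, FSI):
            depth += 1              # skip everything up to the matching PDI
        elif ch == PDI:
            if depth > 0:
                depth -= 1          # an unmatched PDI is simply ignored
        elif depth == 0:
            bidi = unicodedata.bidirectional(ch)
            if bidi == "L":
                return "ltr"
            if bidi in ("R", "AL"):
                return "rtl"
    return "ltr"                    # P3: no strong character found

sample = RLI + " فعالیت بین‌المللی‌سازی، " + PDI
print(paragraph_direction(sample))        # ltr
print(paragraph_direction(RLM + sample))  # rtl
```

The second call shows the effect of adding an RLM at the beginning, as
Roozbeh suggests: the RLM sits outside the isolate, so it is the first
strong character that the detection sees.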

If the consumer keeps the control characters, it would display the
contained text correctly, but it would presumably be difficult to choose
an appropriate dir value if the preference was to use markup for
direction in the destination.  Also, if the paragraph direction of the
string is used to determine the alignment on the page (left or right),
as i think is the case for CSS, then the rendering application would not
get the right cue.
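
As a rough illustration of the extra work this implies, a consumer that
wanted a dir value would have to inspect the controls itself rather than
rely on first-strong detection.  The sketch below is only an assumption
about how that might be done: dir_from_outer_isolate is a hypothetical
helper, and mapping FSI to "auto" is my own guess at a sensible choice.

```python
LRI, RLI, FSI, PDI = "\u2066", "\u2067", "\u2068", "\u2069"

def dir_from_outer_isolate(text):
    """Hypothetical helper: return a dir value if the whole string is
    wrapped in one isolate pair, otherwise None."""
    dirs = {LRI: "ltr", RLI: "rtl", FSI: "auto"}
    if len(text) < 2 or text[0] not in dirs or text[-1] != PDI:
        return None
    depth = 0
    for i, ch in enumerate(text):
        if ch in dirs:
            depth += 1
        elif ch == PDI and depth > 0:
            depth -= 1
            if depth == 0 and i != len(text) - 1:
                return None  # the opening isolate closes before the end
    return dirs[text[0]] if depth == 0 else None

print(dir_from_outer_isolate(RLI + "فعالیت" + PDI))  # rtl
print(dir_from_outer_isolate("فعالیت"))              # None
```

Note that this only recovers the direction when the producer wrapped the
entire string; it says nothing about strings with no controls at all.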

ri

Received on Wednesday, 14 September 2016 17:54:33 UTC