Re: Bidi in JSON notes

On 15/08/2016 07:36, Dov Grobgeld wrote:
> I'm not sure that I understand the purpose of this note. JSON is a
> serialization protocol and not a display standard. The display is a
> front end issue, and at the time the data is displayed, the connection
> to JSON has been disconnected. So you might be discussing a subset of
> JSON used within certain web standards. This should be made more specific.

I believe the issue is that some information about base direction needs
to be passed with the JSON string, so that the string can be
reconstituted correctly for visual consumption at some point in the
future. The question is how best to carry that information.

> But this can be generalized into saying that in order to display text
> properly you may need additional meta data. This corresponds to the
> concept of a "higher-level protocol" as expressed in UAX #9. But
> obviously this can be expressed in any serialization protocol, e.g.
> protobuf, XML, BSON or pickle.

We have been talking with people working on JSON-based standards for 
data interchange who say that it's best to pass base direction metadata 
as part of the string, rather than in accompanying properties. The 
discussion page is an attempt to think through those issues.
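
To make the two options concrete, here is a minimal sketch in Python. It
is only an illustration: the "dir" property name, the sample string and
the base_direction() helper are assumptions made up for this example,
not something taken from any of the standards under discussion.

```python
import json

RLM = "\u200F"  # RIGHT-TO-LEFT MARK

# Illustrative sample: Latin text first, then Hebrew, then Latin again.
text = "#bidi \u05E4\u05E2\u05D9\u05DC\u05D5\u05EA, w3c"

# Option A: base direction carried in an accompanying property.
# The "dir" field name is hypothetical.
with_property = json.dumps({"content": text, "dir": "rtl"})

# Option B: base direction carried in the string itself, as a leading RLM.
in_string = json.dumps({"content": RLM + text})


def base_direction(s: str) -> str:
    """Hypothetical consumer-side check: a leading RLM signals RTL."""
    return "rtl" if s.startswith(RLM) else "auto"


received = json.loads(in_string)["content"]
print(base_direction(received))
```

With option B the direction survives any step that only passes the
string value along, whereas option A depends on every consumer knowing
about the extra property.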

> On the other hand, the display of structured BiDi text (e.g. JSON and
> XML) in a text editor, is indeed an issue that needs various
> recommendations. In my opinion the best solution to this is the
> insertion of the relatively new bidi isolate characters according to
> the syntax right before the rendering. E.g. The example:
>
> "content" : "‭#bidi PEILUT MUTEMET, w3c"
>
> would be changed before display to:
>
> "&FSI;content&PDI;" : "‭&FSI;#bidi PEILUT MUTEMET, w3c&PDI;"
>
> This would ensure that there is no influence of what is inside quotes on
> the surrounding. Unfortunately I'm not aware of any editor capable of
> doing this today.

And i'm only aware of one browser that can understand the isolating 
controls at the moment (Firefox).
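
For reference, the display-time wrapping suggested in the quoted text
could look roughly like the following. This is a sketch under
assumptions, not a description of any existing editor: the function
name and the regular expression are mine, and the result is meant only
for rendering, never for writing back into the data.

```python
import re

FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

# Matches a JSON string literal, allowing backslash-escaped characters.
_JSON_STRING = re.compile(r'"((?:[^"\\]|\\.)*)"')


def isolate_for_display(json_source: str) -> str:
    """Wrap the inside of every quoted string in FSI ... PDI."""
    # A function is used as the replacement so that backslashes in the
    # matched text are copied through untouched.
    return _JSON_STRING.sub(
        lambda m: f'"{FSI}{m.group(1)}{PDI}"', json_source
    )


line = '"content" : "#bidi PEILUT MUTEMET, w3c"'
shown = isolate_for_display(line)
```

The quotes and the colon stay outside the isolates, so what is inside
the quotes cannot reorder the surrounding syntax in a renderer that
supports the isolating controls.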

I think there are a number of issues with using paired controls, which 
is why i hadn't proposed that solution so far.

1. current support for the isolating characters, which you mention
above, is a key issue, though hopefully it will go away at some point
soon. Using RLE etc to establish base direction is dangerous for text
which is likely to be inserted into another context, since embeddings
don't isolate the text from what surrounds it.

2. it seems that users are much less likely to be able to, or to know 
how to, input paired characters (even RLE etc) than RLM, which is 
already problematic enough. On the other hand, applications that create 
the data may be able to do so, except that they really ought to be 
using the isolating controls.

3. using paired controls to indicate direction for strings that start 
and end with markup is more problematic than using RLM, but no more 
effective (it's still not clear that using RLM in this case is useful, 
although it's easier than changing the markup). In other words, paired 
controls widen the difference in handling between plain text strings 
and marked up strings, compared with RLM.

4. i'm wary about the potential for incorrect spans when passing around 
paired markers, though it may not be as much of an issue as i think.

These are all just thoughts off the top of my head, so feel free to come 
back on them.
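
To illustrate the worry in point 4, here is a rough sketch of a check
for unbalanced paired controls. It assumes strict nesting, which is a
simplification (UAX #9 itself tolerates unmatched controls and defines
how they are resolved), and the is_balanced() name is made up for this
example.

```python
EMBED_OPEN = {"\u202A", "\u202B", "\u202D", "\u202E"}  # LRE, RLE, LRO, RLO
ISOLATE_OPEN = {"\u2066", "\u2067", "\u2068"}          # LRI, RLI, FSI
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING, closes an embedding/override
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE, closes an isolate


def is_balanced(s: str) -> bool:
    """Return True if every paired control is closed, in nested order."""
    expected_closers = []
    for ch in s:
        if ch in EMBED_OPEN:
            expected_closers.append(PDF)
        elif ch in ISOLATE_OPEN:
            expected_closers.append(PDI)
        elif ch in (PDF, PDI):
            # A closer with no opener, or the wrong kind of closer.
            if not expected_closers or expected_closers.pop() != ch:
                return False
    return not expected_closers


wrapped = "\u202B" + "abcdef" + "\u202C"  # RLE ... PDF
truncated = wrapped[:4]                   # the closing PDF is cut off
```

Here is_balanced(wrapped) is True and is_balanced(truncated) is False,
whereas a string that merely starts with RLM has nothing to lose when
it is truncated or concatenated.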

> Finally, for showing BiDi examples in technical documents, the standard
> is to use capital letters for RTL characters and small for other
> characters. See:
>
> http://unicode.org/reports/tr9/

Yes, but examples using capital Latin characters can't be copied and 
pasted to run various tests in the way that the examples on the 
discussion page can. That's part of the rationale here. ;-)

thanks for the comments,
ri

Received on Monday, 15 August 2016 22:07:34 UTC