RE: Concerns about new domain names, particularly non-Latin-scripts -- getting the tech community together

> <method> : // <host> / <path>
> where <host> and <path> are allowed to be RTL

> In many cases, there's no way to modify software to both make the URLs look nice (as users expect) and be consistent with the Unicode standard.

Which is why we (Microsoft) intend to keep the sections of a single link arranged consistently, either all running to the right or all running to the left.  The trick is deciding what delimits the parts.  That led to examples like http://ltra.ltrb and LTRB.LTRA//:http.  Once you start allowing runs of LTR and RTL text to cross the logical sections of the URL, there's no way to predict how the flipping of the important bits is going to confuse the reading of the label.
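
To make the "what delimits the parts" idea concrete, here is a minimal sketch in Python.  It is not what any Microsoft product does; the function names and the choice of "://", "." and "/" as delimiters are my own assumptions for illustration.  It wraps each logical piece of the URL in Unicode directional isolates (FSI ... PDI) and the whole link in a left-to-right isolate (LRI ... PDI), with the intent that a bidi run cannot cross a delimiter and the sections keep one overall order.

```python
# Unicode directional formatting characters (defined in UAX #9).
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def isolate(text):
    """Wrap one logical piece so its direction is resolved on its own."""
    return f"{FSI}{text}{PDI}" if text else text


def isolate_url_parts(url):
    """Hypothetical helper: isolate each host label and path segment.

    Assumes a simple '<method>://<host>/<path>' shape with no port,
    query, or fragment.
    """
    scheme, rest = url.split("://", 1)
    host, slash, path = rest.partition("/")
    labels = ".".join(isolate(label) for label in host.split("."))
    segments = "/".join(isolate(seg) for seg in path.split("/"))
    # The outer LRI fixes the overall order of the sections as left-to-right.
    return f"{LRI}{scheme}://{labels}{slash}{segments}{PDI}"
```

Swapping LRI for RIGHT-TO-LEFT ISOLATE (U+2067) would be the way to try the all-to-the-left arrangement instead; either way the delimiters stay outside the isolated pieces.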

Mark Davis & I chatted about whether Unicode should suggest a way of handling things like this in BIDI.  If you extrapolate the problem, URLs aren't the only place where the BIDI algorithm has trouble in contexts like these.  For example, I could have a list of winners of a contest:  The top finishers are apple, banana, carrot, durian, and eggplant.  If a couple of adjacent items in that list are in a different (right-to-left) script, then the displayed order of the list is going to get confused without helper bidi marks.  Perhaps there would be a general way to tweak the algorithm for runs of ordered text?
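
For what "helper bidi marks" could look like in the list case, here is a small Python sketch.  The function name is hypothetical and this is only one possible approach: each item is wrapped in FIRST STRONG ISOLATE / POP DIRECTIONAL ISOLATE so that the commas between items belong to the surrounding sentence rather than to a neighboring RTL run.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def join_isolated(items):
    """Hypothetical helper: join list items, isolating each one."""
    wrapped = [f"{FSI}{item}{PDI}" for item in items]
    if len(wrapped) < 3:
        return " and ".join(wrapped)
    # Separators stay outside the isolates, in the sentence's own direction.
    return ", ".join(wrapped[:-1]) + ", and " + wrapped[-1]


finishers = ["apple", "banana", "carrot", "durian", "eggplant"]
sentence = "The top finishers are " + join_isolated(finishers) + "."
```

This still needs whoever builds the string to know where the item boundaries are, which is exactly the information the plain-text algorithm doesn't have.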

-Shawn

Received on Thursday, 29 October 2015 17:13:56 UTC