RE: [iri] #121: BIDI: Some users are requiring right-to-left label ordering.

>> 2) You end up with different displays between places that "know" there's 
>> an IRI (e.g. browser address bar) and places that don't

That's unavoidable.  People will follow this RFC or they won't.  The Unicode Bidi Algorithm (UBA) doesn't include this guidance, so plain text will also fail, though some apps may try to be "smarter".  For years people will have different browser versions with different behaviors, etc.  The UBA is also applied inconsistently, and at inconsistent revisions, so I think it's a bit presumptuous of us to think that anything we specify here could produce consistent rendering :)

IMO: There's a more general "list" problem with the UBA, and having the UBA address it might be interesting.

> Actually, the current solution was proposed by Mati Alluche, and he 
> argued that it would be possible for people to understand the ordering 
> because of the heuristics they use when reading mixed text:

That doesn't match our investigation.  That presumes that people read it as trained by the UBA; however, when encountering list-like structures, people don't typically apply the UBA.  Unfortunately, regardless of the approach, some training of the user community is likely required.

>> My requirements are:
>> 1) The logical order of the parts MUST be preserved.

> That sounds like a very logical requirement :-). As always in the IETF, 
> any arguments/data to support that would be very much appreciated (your 
> list equivalent is certainly counting towards that).

I don't have a formal white paper or user study.  This comes from discussions with native bidi speakers, technical, non-technical, and in-between, and from feedback from the community.  This is how we realized that IRIs are best treated by the "list" analogy.

Fortunately 90% of the most common cases are probably a loose domain, like the side of a bus, and those are probably all same-script IRIs.

>> 2) There MUST be a way for mostly Arabic, etc. IRIs to be rendered right to left.
>>  * So the corollary of 1 & 2 is that the protocol has to go on the right

>By protocol, do you mean the scheme name (such as ftp:, mailto:, http:, 
>https:,...)?


>> 3) I'd really like a MAY that allows some flexibility for 2; when it's LTR and when it's RTL.

>You mean some flexibility depending on context? We could also make that 
>"MUST respect context". But then there's the problem that the context of 
>a side of a bus is rather vague :-).

Not if it's a bus in Cairo, or a bus in Washington DC.  Though either is probably going to be a single script.

>> At a minimum, I'd suggest that any RTL characters in the domain or email local parts should force 2).

> In my personal view, I think that might be overkill. I'm not sure I'd 
> want everything turned around just because of a few RTL characters. But 
> if that's what everybody agrees on, I won't stay in the way.

IMO this is mostly a user preference.  "I" would probably prefer the LTR ordering, even for an entirely Arabic IRI, because then I'd be able to understand the parts.  E.g., if the ordering were consistent, I could chomp off a subdomain to get to a parent domain, or remove the path part to get to the home page.  If the ordering changes in the middle, I'd be unsuccessful.
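
To make the "chomp off a subdomain" and "remove the path" operations concrete, here is a minimal sketch of what they look like on the logical (stored) order of an IRI, which is what a user is mentally doing on the displayed string.  The helper names parent_domain and home_page are made up for illustration, and the example host is a placeholder; nothing here comes from the draft.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_domain(host: str) -> str:
    """Drop the first label in logical order: 'sub.example.com' -> 'example.com'."""
    labels = host.split(".")
    # A single-label host has no parent to return, so leave it unchanged.
    return ".".join(labels[1:]) if len(labels) > 1 else host


def home_page(iri: str) -> str:
    """Keep scheme and authority; drop path, query, and fragment."""
    parts = urlsplit(iri)
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))


if __name__ == "__main__":
    print(parent_domain("sub.example.com"))
    print(home_page("http://sub.example.com/some/long/path?q=1"))
```

Both operations are trivial in logical order.  They only stay trivial for a human reader if the visual order of the components is a consistent mapping of that logical order.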

> The really tough problem for anything that reorders by component (what 
> you call 'logical order of parts') is that it may be easy to write a 
> standard that says so, but it's difficult to implement. Any thoughts 
> about that?

Yes :)  I'd be much happier coming up with a behavior that's understandable by 90% of the humans and have problems implementing it, than causing ambiguity for 50% of the population just because it was easy to implement.

We also came up with a couple of practical observations:
Many paths are "long".  They are also likely mostly ASCII for the foreseeable future.  If I render a path with http:// on the left, and an Arabic domain name, then a path on the right, an RTL user with an RTL address bar will have a hard time discovering the domain, which is the most important part of the IRI, because it won't be near the right side of the textbox.

Worse, if the path/query gets long enough, then you have two really bad options: either allow the host name to be cropped from the left of the address bar, or clip the path on the RIGHT side, like an LTR textbox, impacting the usability of the RTL app.
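
As a rough illustration of the "list" analogy (not a proposal for the actual algorithm), the sketch below splits an IRI into components in logical order and then chooses a visual order for the whole list.  The function names and the delimiter set are my own assumptions, and it deliberately ignores how the characters inside each component get shaped or reordered.

```python
import re

# Assumed delimiter set for illustration only; a real implementation
# would follow the IRI grammar rather than a single regex.
_DELIMS = re.compile(r"(://|[/.?#:@&=])")


def components(iri: str) -> list[str]:
    """Split an IRI into components and delimiters, in logical order."""
    return [tok for tok in _DELIMS.split(iri) if tok]


def visual_order(iri: str, rtl: bool) -> list[str]:
    """Return the components left to right as they would be laid out.

    For RTL the whole list is reversed, so the scheme ends up rightmost
    and the host sits next to it, near the right edge of the text box.
    """
    parts = components(iri)
    return parts[::-1] if rtl else parts


if __name__ == "__main__":
    print(visual_order("http://host.example/path", rtl=False))
    print(visual_order("http://host.example/path", rtl=True))
```

With the RTL ordering, a long path grows toward the left, so clipping on the left side of an RTL text box loses the tail of the path rather than the host name.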

Received on Monday, 2 April 2012 17:56:11 UTC