Re: Summary of bidi discussion with Social Web WG

hi Amir, thanks for the comments.  See below...

On 03/08/2016 13:16, Amir E. Aharoni wrote:
> I read the linked discussions... People all over the RTL world from
> Morocco to Pakistan, struggle with RTL typing. Regular people, not i18n
> nerds like me. They are struggling even in much more traditional
> environments like desktop word processors, not to mention massively
> multilingual websites. So the fact that there are millions of them and
> they are used to something doesn't mean much by itself, because whatever
> they are used to is likely not very good.

Yes, i tried to say that.  Unfortunately, we rarely get feedback from 
people who struggle in this way, so the perception of the developers we 
spoke with is that there's not really a problem. :(

> On the other hand, any attempts to force people to explicitly set
> direction are probably doomed to failure. It's something that would be
> comfortable for me, but experience shows that most people don't
> want to understand explicit direction setting, and they just want to
> type letters. Optional explicit direction setting is something that I'd,
> personally, appreciate very much for edge cases when auto-detection
> doesn't work, but I could live without it.

But sometimes (eg. the MAC address example in the tests) it's really 
needed, unless perhaps you put things on new lines, which i've seen 
people sometimes resort to for bidi text.

Btw, that was one of the reasons we changed our advice for HTML 
authoring in our articles. We went from trying to explain the reasons 
for problems and how to address them in the best way (eg. using RLM/LRM 
here and isolation there) to just saying "Put markup with dir around 
everything that changes direction."

I think it's harder, though, to do embedding or isolating with control 
characters than with markup, so it's often easier to just add RLM/LRM 
wherever you can, but i agree that using controls is not easy at the 
best of times.

On the other hand, I find myself wondering whether others also find it's 
easier to work with controls when writing short social media strings 
(like tweets), since you write and send quite quickly, perhaps throwing 
in a control or newline here or there as you type.  I think editing 
existing bidirectional text to add embedding or isolating control codes 
is much more challenging, since they cause the text to move around so 
much as you type.

> First-strong is not nearly enough, period. Especially in social networks
> and chat apps, where strings very frequently begin with the name of a
> person, and that name is very frequently written in Latin characters.
> Same for brand names, etc.

Twitter looks out for @ and # characters, and applies special processing 
to those and the characters that follow them if they are in Latin 
script. That processing effectively isolates them and makes them run LTR.
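
A minimal sketch of that kind of processing, for illustration only. This 
is not Twitter's actual code: the pattern, the function name 
isolate_tags, and the choice of the LRI/PDI isolate controls are all 
assumptions.

```python
import re

LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

# Assumed pattern: '@' or '#' followed by ASCII letters, digits or '_'.
# Tags written in other scripts are deliberately left untouched.
TAG = re.compile(r"[@#][A-Za-z0-9_]+")


def isolate_tags(text: str) -> str:
    """Wrap each Latin-script mention or hashtag in LRI ... PDI so that it
    runs left-to-right and does not affect the surrounding text."""
    return TAG.sub(lambda m: f"{LRI}{m.group(0)}{PDI}", text)
```

With this, "@w3c" at the start of a Hebrew sentence is wrapped in an 
isolate, while a hashtag written in Hebrew passes through unchanged.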

> Automatic detection in mobile chat apps and social networks like
> YouTube, Twitter and Facebook is not perfect, but usually it works
> surprisingly well. But every app implements it separately. In general it
> seems that it mostly works by counting characters or words. Making one
> of these algorithms standard would be far better than standardizing
> first-strong. It's unfortunate that first-strong was picked for HTML's
> dir="auto", too.

From the test results[1] it looks to me as if Facebook uses 
first-strong to set the base direction, whereas Twitter uses a 
different algorithm.  Interestingly, Twitter does much worse than 
Facebook on the tests i ran.  That may be because the strings are not 
long, but most of them are not unusually short either.
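
To make the difference between the two approaches concrete, here is a 
rough sketch of first-strong next to a simple character-counting 
heuristic. The counting rule is an assumption for illustration, not the 
algorithm any of these sites actually use, and the first-strong version 
is simplified (it does not skip text inside isolates as the Unicode 
Bidirectional Algorithm does).

```python
import unicodedata


def _strong_dir(ch):
    """Return 'ltr' or 'rtl' for a strongly directional character, else None."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "ltr"
    if bidi_class in ("R", "AL"):
        return "rtl"
    return None


def first_strong(text, default="ltr"):
    """Base direction taken from the first strong character in the text."""
    for ch in text:
        direction = _strong_dir(ch)
        if direction:
            return direction
    return default


def majority(text, default="ltr"):
    """Base direction taken from whichever strong direction has more characters."""
    rtl = sum(1 for ch in text if _strong_dir(ch) == "rtl")
    ltr = sum(1 for ch in text if _strong_dir(ch) == "ltr")
    if rtl == ltr:
        return default
    return "rtl" if rtl > ltr else "ltr"
```

For a string that starts with a Latin name followed by Hebrew words, 
first_strong returns "ltr" while majority returns "rtl", which is the 
case Amir describes.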



Received on Wednesday, 3 August 2016 12:44:36 UTC