- From: <ishida@w3.org>
- Date: Wed, 3 Aug 2016 13:44:23 +0100
- To: "Amir E. Aharoni" <amir.aharoni@mail.huji.ac.il>
- Cc: "public-i18n-core@w3.org" <public-i18n-core@w3.org>, "public-i18n-bidi@w3.org" <public-i18n-bidi@w3.org>
hi Amir, thanks for the comments. See below...

On 03/08/2016 13:16, Amir E. Aharoni wrote:
> I read the linked discussions... People all over the RTL world from
> Morocco to Pakistan, struggle with RTL typing. Regular people, not i18n
> nerds like me. They are struggling even in much more traditional
> environments like desktop word processors, not to mention massively
> multilingual websites. So the fact that there are millions of them and
> they are used to something doesn't mean much by itself, because whatever
> they are used to is likely not very good.

Yes, i tried to say that. Unfortunately, we rarely get feedback from
people who struggle in this way, so the perception of the developers we
spoke with is that there's not really a problem. :(

> On the other hand, any attempts to force people to explicitly set
> direction are probably doomed to failure. It's something that would be
> comfortable for me, but experience shows that most people don't
> want to understand explicit direction setting, and they just want to
> type letters. Optional explicit direction setting is something that I'd,
> personally, appreciate very much for edge cases when auto-detection
> doesn't work, but I could live without it.

But sometimes (eg. the MAC address example in the tests) it's really
needed, unless perhaps you put things on new lines, which i've seen
people sometimes resort to for bidi text.

Btw, that was one of the reasons we changed our advice for HTML
authoring in our articles. We went from trying to explain the reasons
for problems and how to address them in the best way (eg. using RLM/LRM
here and isolation there) to just saying "Put markup with dir around
everything that changes direction." I think embedding or isolating is
harder to do with control characters than with markup, so often it's
easier to just add RLM/LRM wherever you can, but i agree that using
controls is not easy at the best of times.
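
To make the difference between the two kinds of control concrete, here is a rough Python sketch. The helper name `isolate`, the sample sentence and the sample MAC address are made up for illustration and are not taken from the tests; the code points themselves are the standard Unicode ones (LRM U+200E, RLM U+200F, LRI U+2066, RLI U+2067, PDI U+2069).

```python
# Illustrative sketch only: names and sample data are hypothetical.
LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def isolate(text: str, direction: str = "ltr") -> str:
    """Wrap text in a directional isolate so it cannot reorder its neighbours."""
    if direction == "ltr":
        opener = LRI
    elif direction == "rtl":
        opener = RLI
    else:
        raise ValueError("direction must be 'ltr' or 'rtl'")
    return opener + text + PDI


# A made-up MAC address embedded in a Hebrew sentence.
mac = "00:1A:2B:3C:4D:5E"
with_isolate = "הכתובת היא " + isolate(mac) + " בלבד"

# The "just add a mark" approach: a single LRM after the address.
with_mark = "הכתובת היא " + mac + LRM + " בלבד"
```

The isolate needs a matched pair of characters around the address, whereas the mark is a single character dropped in at one point, which is roughly why the mark tends to be the easier thing to type by hand.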
On the other hand, I find myself wondering whether others also find it
easier to work with controls when writing short social media strings
(like tweets), since you write and send quite quickly, perhaps throwing
in a control or newline here or there as you type. I think editing
existing bidirectional text to add embedding or isolating control codes
is much more challenging, since they cause the text to move around so
much as you type.

> First-strong is not nearly enough, period. Especially in social networks
> and chat apps, where strings very frequently begin with the name of a
> person, and that name is very frequently written in Latin characters.
> Same for brand names, etc.

Twitter looks out for @ and # characters, and applies special processing
to those and the characters that follow them if they are in Latin
script. That processing effectively isolates them and makes them run
LTR.

> Automatic detection in chat mobile apps and social networks like
> YouTube, Twitter and Facebook is not perfect, but usually it works
> surprisingly well. But every app implements it separately. In general it
> seems that it mostly works by counting characters or words. Making one
> of these algorithms standard would be far better than standardizing
> first-strong. It's unfortunate that first-strong was picked for HTML's
> dir="auto", too.

From the test results[1] it looks to me as if Facebook uses first-strong
to set the base direction, whereas Twitter uses a different algorithm.
Interestingly, Twitter does much worse for the tests i ran than
Facebook. That may be because the strings are not long, but most of them
are not unusually short either.

ri

[1] https://github.com/w3c/i18n-activity/wiki/Bidi-handling-in-Facebook-and-Twitter
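
For comparison, here is a minimal Python sketch of the two detection approaches discussed in this thread: first-strong, and a simple count of strongly directional characters. It is not Facebook's or Twitter's actual algorithm, and the function names, the tie-breaking rule and the sample string are assumptions made for illustration. The first-strong function is also simplified and ignores details such as isolated runs.

```python
import unicodedata


def first_strong(text: str, default: str = "ltr") -> str:
    """Return the direction of the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return default  # no strong character found


def majority_strong(text: str, default: str = "ltr") -> str:
    """Return the direction of the majority of strong characters.

    Hypothetical heuristic: ties fall back to the default direction.
    """
    ltr = rtl = 0
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            ltr += 1
        elif bidi_class in ("R", "AL"):
            rtl += 1
    if ltr == rtl:
        return default
    return "rtl" if rtl > ltr else "ltr"


# A Hebrew sentence that begins with a name in Latin script.
sample = "Amir כתב הודעה ארוכה בעברית"
print(first_strong(sample))     # ltr
print(majority_strong(sample))  # rtl
```

The sample shows the case Amir describes: a mostly Hebrew string that starts with a Latin-script name gets an LTR base direction from first-strong, while counting the strong characters gives RTL.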
Received on Wednesday, 3 August 2016 12:44:36 UTC