RE: [bidi] BIDI?

Could someone summarize the requirements for BIDI representation and display, and the design choices we’re facing and how they match up against the requirements?

It seems to me that the “desirable” handling of IRIs for BIDI identifiers cannot actually be accomplished with the technology at hand, and that we will unavoidably have to make some compromises to get something that works at all.

These lengthy discussions about “desirable” handling of BIDI URIs don’t help much if we’re not actually evaluating technological solutions.

It may be that “side of bus” printing for BIDI IRIs is limited, for example, or that we need to establish some additional typographical conventions for side-of-bus display of BIDI IRIs.

The technology we have at hand is pretty weak:

-          Can non-visible direction characters be part of the IRI?

-          Can we, should we, advise those who are implementing novel IRI display mechanisms (“show IRI in address bar”) and IRI entry mechanisms (“type IRI in the address bar”) to do something different from the ordinary “give IRI string to ordinary Unicode string display mechanism”?
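
To make the first question concrete, here is a minimal sketch of how an implementation could detect non-visible direction characters in a candidate IRI string. The function name and the exact character set are assumptions for illustration (the marks LRM and RLM plus the embedding and override controls); whether such characters should be accepted is exactly the open question.

```python
# Hypothetical helper; the character set below is an assumption for illustration.
BIDI_CONTROLS = {
    "\u200e": "LRM",  # LEFT-TO-RIGHT MARK
    "\u200f": "RLM",  # RIGHT-TO-LEFT MARK
    "\u202a": "LRE",  # LEFT-TO-RIGHT EMBEDDING
    "\u202b": "RLE",  # RIGHT-TO-LEFT EMBEDDING
    "\u202c": "PDF",  # POP DIRECTIONAL FORMATTING
    "\u202d": "LRO",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e": "RLO",  # RIGHT-TO-LEFT OVERRIDE
}


def find_bidi_controls(iri: str) -> list[tuple[int, str]]:
    """Return (index, short name) for every direction control found in the string."""
    return [(i, BIDI_CONTROLS[ch]) for i, ch in enumerate(iri) if ch in BIDI_CONTROLS]


if __name__ == "__main__":
    print(find_bidi_controls("http://example.com"))
    print(find_bidi_controls("http://example\u200f.com"))
```

A display or entry mechanism could use a check like this either to reject such input or to decide that it needs special handling before passing the string to the ordinary text renderer.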

Larry
--
http://larry.masinter.net


From: public-iri-request@w3.org [mailto:public-iri-request@w3.org] On Behalf Of Shawn Steele
Sent: Sunday, June 05, 2011 12:20 PM
To: Aharon (Vladimir) Lanin; Matitiahu Allouche
Cc: bidi@unicode.org; bidi-bounce@unicode.org; Mark Davis ☕; Mohamed Mohie; public-iri@w3.org; public-iri-request@w3.org
Subject: RE: [bidi] BIDI?


I think that the "side of a bus" case often skips the http:// part, so it does matter a bit.



FWIW: I don't think we have to worry about this case so much:

http://worldblog.msnbc.msn.com/_news/2011/06/05/6789539-amid-the-ruins-a-fisherman-contemplates-a-daunting-future


but rather simpler cases like "msnbc.com", "biz.host.com", or "host.com/biz", since that's what's on the side of a bus.  And, of course, the email variations.



I don't know if that "simplifies" the problem any, but a few RTL characters deep in an obscure file path in an otherwise LTR string probably aren't very interesting.



Also, anecdotal evidence suggests that the "average" user may not be aware that www.msnbc.com means "the www server at msnbc, which registered with .com".  It can be misinterpreted as "msnbc's part of the web (www)", e.g., msnbc somehow registered with www.  So I don't think we can ensure that LTR or RTL ordering preserves some sort of security hierarchy, at least for the average user.



I think the key point is "how do we get someone to write it down and key it in later without any mistakes"?


-Shawn

 
http://blogs.msdn.com/shawnste


________________________________
From: Aharon (Vladimir) Lanin [aharon@google.com]
Sent: Sunday, June 05, 2011 11:07 AM
To: Matitiahu Allouche
Cc: bidi@unicode.org; bidi-bounce@unicode.org; Mark Davis ☕; Mohamed Mohie; public-iri@w3.org; public-iri-request@w3.org; Shawn Steele
Subject: Re: [bidi] BIDI?

You have a point, although for http://MY.DOMAIN.org and http://org.DOMAIN.MY, the results would be different:
org.NIAMOD.YM//:http and http://org.NIAMOD.YM, respectively.
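
The two results above can be reproduced with a toy model. This is only a sketch under stated assumptions: upper-case ASCII stands in for RTL letters (the convention used in this thread), every label is purely RTL or purely LTR, each separator acts as a hard boundary so labels are never merged into one run, and the overall direction is taken from the first strong character of the domain name. The function names are made up for illustration and this is not the UBA.

```python
import re

SEPARATORS = {"://", ".", "/"}


def is_rtl(label: str) -> bool:
    # Thread convention, not real Unicode data: upper case means an RTL label.
    return any(ch.isupper() for ch in label)


def domain_of(iri: str) -> str:
    # Text between "://" (if present) and the next "/".
    rest = iri.split("://", 1)[-1]
    return rest.split("/", 1)[0]


def first_strong_is_rtl(domain: str) -> bool:
    for ch in domain:
        if ch.isalpha():
            return ch.isupper()
    return False


def display(iri: str) -> str:
    """Visual (left-to-right) order of the IRI under the toy model."""
    parts = [p for p in re.split(r"(://|[./])", iri) if p]
    rtl_base = first_strong_is_rtl(domain_of(iri))
    out = []
    for part in (reversed(parts) if rtl_base else parts):
        if part in SEPARATORS:
            # Separators follow the overall direction.
            out.append(part[::-1] if rtl_base else part)
        elif is_rtl(part):
            out.append(part[::-1])  # RTL labels always read right to left
        else:
            out.append(part)        # LTR labels always read left to right
    return "".join(out)


if __name__ == "__main__":
    print(display("http://MY.DOMAIN.org"))   # org.NIAMOD.YM//:http
    print(display("http://org.DOMAIN.MY"))   # http://org.NIAMOD.YM
```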

Aharon
On Sun, Jun 5, 2011 at 7:17 PM, Matitiahu Allouche <matial@il.ibm.com> wrote:
Aharon (Vladimir) Lanin wrote: "To my taste, first strong in the domain name is best".
First strong in the domain name fails the napkin test. If the logical name is (upper case = RTL):
      MY.DOMAIN.org
it would be displayed
      org.NIAMOD.YM

Such a display could come from the logical name "MY.DOMAIN.org", but also from "org.MY.DOMAIN", so the display is ambiguous.
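
The collision can be checked with a heavily simplified stand-in for the UBA. The sketch below assumes only three character types (upper case = RTL, lower case = LTR, everything else neutral), takes the paragraph direction from the first strong character, and resolves a neutral to its neighbours' direction when they agree and to the paragraph direction otherwise. It ignores digits, weak types, and explicit embeddings, and the function name is hypothetical.

```python
def toy_bidi_display(logical: str) -> str:
    """Visual (left-to-right) order under a heavily simplified bidi model."""

    def strong(ch: str):
        if ch.isalpha():
            return "R" if ch.isupper() else "L"
        return None  # neutral

    types = [strong(ch) for ch in logical]
    base = next((t for t in types if t), "L")  # first strong character

    # Resolve each neutral from the nearest strong characters on both sides.
    resolved = []
    for i, t in enumerate(types):
        if t is None:
            before = next((x for x in reversed(types[:i]) if x), base)
            after = next((x for x in types[i + 1:] if x), base)
            t = before if before == after else base
        resolved.append(t)

    # Group characters into maximal runs of one direction.
    runs = []
    for ch, t in zip(logical, resolved):
        if runs and runs[-1][0] == t:
            runs[-1][1] += ch
        else:
            runs.append([t, ch])

    # RTL runs read right to left; in an RTL paragraph the run order flips too.
    pieces = [text[::-1] if t == "R" else text for t, text in runs]
    if base == "R":
        pieces.reverse()
    return "".join(pieces)


if __name__ == "__main__":
    print(toy_bidi_display("MY.DOMAIN.org"))  # org.NIAMOD.YM
    print(toy_bidi_display("org.MY.DOMAIN"))  # org.NIAMOD.YM
```

Both logical names produce the same visual string in this model, which is the ambiguity the napkin test is meant to catch.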


Shalom (Regards),  Mati



From:        "Aharon (Vladimir) Lanin" <aharon@google.com>
To:        Matitiahu Allouche/Israel/IBM@IBMIL
Cc:        Shawn Steele <Shawn.Steele@microsoft.com>, bidi@unicode.org, bidi-bounce@unicode.org, public-iri@w3.org, Mohamed Mohie <MOHIEM@eg.ibm.com>, public-iri-request@w3.org, Mark Davis ☕ <mark@macchiato.com>
Date:        05/06/2011 18:43
Subject:        Re: [bidi] Re: BIDI?
________________________________



I think that there needs to be a secondary objective: to get all-RTL IRIs displayed RTL overall, not in a constant back-and-forth at every separator. Like Mohamed, I think that this should be based on the presence of RTL in the domain name. To my taste, first strong in the domain name is best, but I think that the exact algorithm to use (on the domain name) is less important.
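
For comparison, here is a rough sketch of the two candidate rules applied to a domain name: "first strong" and a "mostly RTL" majority count. It uses the bidi classes reported by Python's `unicodedata` module. The function names, the equal weighting of every strong character, and the tie-breaking toward LTR are assumptions for illustration, not part of either proposal.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}  # Hebrew-type and Arabic-type strong characters


def strong_direction(ch: str):
    """Return 'ltr' or 'rtl' for a strong character, None for anything else."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "ltr"
    if bidi_class in RTL_CLASSES:
        return "rtl"
    return None


def first_strong(domain: str) -> str:
    # Direction of the first strong character; LTR if there is none (assumption).
    for ch in domain:
        direction = strong_direction(ch)
        if direction:
            return direction
    return "ltr"


def majority(domain: str) -> str:
    # Every strong character counts equally; ties go to LTR (assumption).
    counts = {"ltr": 0, "rtl": 0}
    for ch in domain:
        direction = strong_direction(ch)
        if direction:
            counts[direction] += 1
    return "rtl" if counts["rtl"] > counts["ltr"] else "ltr"


if __name__ == "__main__":
    # Five Hebrew letters written as escapes, followed by ".example.com".
    domain = "\u05d3\u05d5\u05d2\u05de\u05d4.example.com"
    print(first_strong(domain))  # rtl
    print(majority(domain))      # ltr
```

The made-up domain in the usage example shows that the two rules can disagree: the first label is Hebrew, but the Latin letters outnumber it.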

Aharon

On Jun 5, 2011 10:27 AM, "Matitiahu Allouche" <matial@il.ibm.com> wrote:
> Please define "mostly Latin" and "mostly Arabic or Hebrew".
>
> Are you suggesting to count LTR and RTL characters? Are they all equally
> weighted?
> Does the counting include the scheme (e.g. "http")? the TLD?
>
> Please consider that the prime objective, IMHO, is to enable easy and
> unambiguous human translation from a displayed IRI (napkin, bus side) to
> the corresponding logical string.
>
> Shalom (Regards), Mati
> Bidi Architect
> Globalization Center Of Competency - Bidirectional Scripts
> IBM Israel
> Fax: +972 2 5870333 Mobile: +972 52 2554160
>
>
>
>
> From: Mohamed Mohie <MOHIEM@eg.ibm.com>
> To: Matitiahu Allouche/Israel/IBM@IBMIL
> Cc: bidi@unicode.org, bidi-bounce@unicode.org, Mark Davis ☕
> <mark@macchiato.com>, public-iri@w3.org, Shawn
> Steele <Shawn.Steele@microsoft.com>
> Date: 03/06/2011 22:06
> Subject: Re: [bidi] Re: BIDI?
> Sent by: public-iri-request@w3.org
>
>
>
> Hello Mati,
> To overcome the problem you highlighted below I have a suggestion to be
> added for the URL design which is to set the embedding level according to
> the directionality of the domain name.
> 1- If the domain name "MY.OWN.DOMAIN" is mostly Latin set the embedding
> level to even.
> 2- If the domain name "MY.OWN.DOMAIN" is mostly Arabic or Hebrew set the
> embedding level to odd.
>
> Thanks And Best regards,
> Mohamed Mohie , PMP®
> ________________________________________________
> GCoC BIDI ,
> Advisory Software Engineer, Project Manager, M.Sc.
> Cairo Technology Development Center (CTDC)
> IBM Egypt
> email : mohiem@eg.ibm.com
>
>
>
>
>
> From: Matitiahu Allouche <matial@il.ibm.com>
> To: Mark Davis ☕ <mark@macchiato.com>
> Cc: bidi@unicode.org, bidi-bounce@unicode.org, public-iri@w3.org,
> Shawn Steele <Shawn.Steele@microsoft.com>
> Date: 27/04/2011 10:38 AM
> Subject: [bidi] Re: BIDI?
> Sent by: bidi-bounce@unicode.org
>
>
>
> Hello, Mark!
>
> I am glad to see somebody daring to tackle this issue.
>
> You wrote: <quote>
> If a bidiIri is recognized, then it is handled by the UBA as if each
> separator is surrounded by:
> LRM (if the embedding level is even) or
> RLM (if the embedding level is odd)
> <end of quote>
>
> This design has the following consequences, which IMHO are not optimal:
> a) The same URL (IRI) will be displayed differently according to the
> embedding level. This is confusing.
> b) Pure Latin-character URLs will be displayed in a new and strange way
> when the embedding level is odd. For instance, "http://docs.google.com"
> will be displayed as "com.google.docs//:http".
>
> Consequently, I second Slim Amamou's suggestion to "have a
> predefined/enforced directionality in the specs for each scheme? (ex. LTR
> for URLs)".
> It is true that pure or mostly Hebrew or Arabic URLs will be displayed in a
> way which may seem strange. For instance, "http://MY.OWN.DOMAIN.com" (where
> upper case letters represent RTL letters) will be displayed as
> "http://YM.NWO.NIAMOD.com", but
> 1. The scheme and the TLD currently are pure LTR, and I guess that this is
> not going to change soon, so the display of mixed LTR/RTL URLs will be
> strange anyway.
> 2. The use of domain names with RTL labels is still scarce, there is no
> common usage to overcome, so the public will get accustomed to the
> "strange" display right from the beginning.
>
>
> Shalom (Regards), Mati
> Bidi Architect
> Globalization Center Of Competency - Bidirectional Scripts
> IBM Israel
> Fax: +972 2 5870333 Mobile: +972 52 2554160
>
>
>
>
> From: Mark Davis ☕ <mark@macchiato.com>
> To: Shawn Steele <Shawn.Steele@microsoft.com>
> Cc: public-iri@w3.org, bidi@unicode.org
> Date: 27/04/2011 02:24
> Subject: [bidi] Re: BIDI?
> Sent by: bidi-bounce@unicode.org
>
>
>
> Here are some rough thoughts on how we could handle bidi IRIs.
>
> http://goo.gl/QwSoo

>
> Feedback is welcome.
>
> Mark
>
> On Wed, Apr 20, 2011 at 23:20, Shawn Steele <Shawn.Steele@microsoft.com> wrote:
> I'm wondering what the current thinking around BIDI IRIs is? A few things
> in draft-ietf-iri-3987bis-05 jump out at me.
>
>
> -Shawn

Received on Monday, 6 June 2011 12:54:05 UTC