
Re: Special ordering for BIDI URLs

From: Slim Amamou <slim@alixsys.com>
Date: Sun, 30 May 2010 19:03:55 +0100
Message-ID: <AANLkTikaG_FD0IeK0dH8wChyNovix-JyTcMxM1rNpYJs@mail.gmail.com>
To: Matitiahu Allouche <matial@il.ibm.com>
Cc: Adil Allawi <adil@diwan.com>, "aharon@google.com" <aharon@google.com>, "bidi@unicode.org" <bidi@unicode.org>, bidi-bounce@unicode.org, Mark Davis ☕ <mark@macchiato.com>, Murray Sargent <murrays@exchange.microsoft.com>, Nasser Kettani <Nasser.Kettani@microsoft.com>, "public-iri@w3.org" <public-iri@w3.org>, Shawn Steele <Shawn.Steele@microsoft.com>

On Sun, May 30, 2010 at 5:27 PM, Matitiahu Allouche <matial@il.ibm.com> wrote:

> It seems clear that there is no ideal solution for this issue. If there were
> one, I think that somebody would have come forward with it already.  So, any
> solution must be a compromise which favors the considerations that the
> author sees as most important and somehow shoves aside those considered
> secondary.
> For what it's worth, I will write below my own preferences.  They are based
> on the following premises.
> a) Pure RTL URLs are not practical currently, because of the scheme (http
> etc...) and the extension (html, asp, php etc...).  Localizing them on the
> client side would be a vast effort with hard issues of coordination,
> education and likely also politics.
> b) Adding duplicates of URL delimiters with special Bidi properties (Adil's
> proposal) raises its own problems which Mark Davis has enumerated in his
> note dated May 28th.
> Note also that it assumes using Unicode, while many Hebrew and Arabic pages
> use windows-1255 and windows-1256 charsets.  This is also a constraint in my
> proposed solution below.
> c) My main consideration is that a person reading a URL from a bus side or
> a napkin must be able to unequivocally understand the intended order of the
> different parts of the URL.
> Consequently, the parts must be laid out in a uniform direction, although
> each part will be displayed according to the Unicode Bidi Algorithm (UBA).
> For congruity with non-Bidi URLs, the uniform direction will be LTR.
> Given the above, the technical proposal is as follows:
> 1) For presentation, a Bidi URL must be preceded by LRE and followed by
> PDF, unless
>   1.1 it starts with a LTR character AND contains no RTL character AND ends
> with a LTR character or a digit
> OR
>   1.2 the context (e.g. paragraph direction) is LTR.
> 2) For presentation, a part of a URL will be preceded by LRM if
>   2.1 there is a preceding part which contains RTL characters
>   2.2 the current part contains RTL characters OR has digits before any
> strong LTR character.
> 3) All such formatting characters (LRE, PDF and LRMs) will be stripped
> before sending to the server side.
> 4) From the registration point of view, only the stripped version of the
> URL needs to be registered.  Versions including formatting characters are
> not allowed for registration.
> 5) Bidi-URL-aware user agents should facilitate user entry of URLs by
> adding the proper formatting characters while typing, or at least when the
> user confirms the data (by pressing Enter or a similar action).
> 6) All user agents must remove formatting characters from URLs before
> sending on the wire.
> And yes, I am conscious that the transition period will be, euphemistically
> speaking, challenging.  But this is true for any proposed change, and it is
> better to suffer while getting to a good place than while staying in a bad
> one.
> Shalom (Regards),  Mati
>           Bidi Architect
>           Globalization Center Of Competency - Bidirectional Scripts
>           IBM Israel
>           Phone: +972 2 5888802    Fax: +972 2 5870333    Mobile: +972 52 2554160
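
For readers who want to try the rules in points 1) to 3) of the quoted proposal, here is a minimal Python sketch. It is not part of Mati's message and several things in it are assumptions: the delimiter set used to split a URL into "parts" (the proposal does not enumerate one), the reading of 2.1 and 2.2 as both being required, the placement of LRM immediately before the part, and the function names format_for_display and strip_formatting, which are purely illustrative.

```python
import re
import unicodedata

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING
LRM = "\u200e"  # LEFT-TO-RIGHT MARK

# Assumption: the proposal does not list the delimiters; this set is illustrative.
DELIMS = re.compile(r"([:/?#.&=@]+)")


def is_rtl(ch):
    # Strong right-to-left classes: Hebrew (R) and Arabic letters (AL).
    return unicodedata.bidirectional(ch) in ("R", "AL")


def is_ltr(ch):
    return unicodedata.bidirectional(ch) == "L"


def is_digit(ch):
    # European and Arabic-Indic digits.
    return unicodedata.bidirectional(ch) in ("EN", "AN")


def has_rtl(text):
    return any(is_rtl(c) for c in text)


def digits_before_strong_ltr(part):
    # True if a digit occurs before the first strong LTR character (rule 2.2).
    for ch in part:
        if is_ltr(ch):
            return False
        if is_digit(ch):
            return True
    return False


def strip_formatting(url):
    # Rules 3 and 6: drop LRE, PDF and LRM before the URL goes on the wire.
    return url.translate({ord(c): None for c in (LRE, PDF, LRM)})


def format_for_display(url, context_is_ltr=False):
    url = strip_formatting(url)
    if not url:
        return url
    out = []
    seen_rtl = False
    # re.split with one capture group alternates: part, delimiter, part, ...
    for i, piece in enumerate(DELIMS.split(url)):
        if i % 2 == 0 and piece:
            # Rule 2, read here as 2.1 AND 2.2 (an assumption).
            if seen_rtl and (has_rtl(piece) or digits_before_strong_ltr(piece)):
                out.append(LRM)
            seen_rtl = seen_rtl or has_rtl(piece)
        out.append(piece)
    body = "".join(out)
    # Rule 1.1: a plain LTR URL needs no embedding.
    plain_ltr = (
        is_ltr(url[0])
        and not has_rtl(url)
        and (is_ltr(url[-1]) or is_digit(url[-1]))
    )
    # Rule 1.2: neither does a URL shown in an LTR context.
    if plain_ltr or context_is_ltr:
        return body
    return LRE + body + PDF


if __name__ == "__main__":
    sample = "http://\u05d0\u05d1.com/\u05d2\u05d3/123"  # made-up Hebrew labels
    shown = format_for_display(sample)
    print([hex(ord(c)) for c in shown if c in (LRE, PDF, LRM)])
    print(strip_formatting(shown) == sample)
```

With this sketch, a pure-ASCII URL such as http://example.com/page1 comes back unchanged, while the sample above gains an LRE/PDF pair plus an LRM before the second Hebrew part and before the trailing digits. Stripping the result gives back the original string, which is the only form that would be registered or sent to the server under points 3), 4) and 6).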

Slim Amamou | سليم عمامو
Received on Sunday, 30 May 2010 18:04:28 UTC
