- From: Matitiahu Allouche <matial@il.ibm.com>
- Date: Sun, 6 Nov 2011 10:25:32 +0200
- To: Adil Allawi <adil@diwan.com>
- Cc: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "public-iri@w3.org" <public-iri@w3.org>
- Message-ID: <OF2CBA8F1B.87F196A5-ONC2257940.002C8366-C2257940.002E50A6@il.ibm.com>
Hello, Adil!
I tend to like your proposal, but I have not yet formed a definitive
opinion. However, I can already say that your third rule seems too
restrictive to me.
You wrote:
3. The characters of a registered domain MUST match the Unicode bidi
class of the TLD if the TLD is an RTL-TLD.
Since your rules 1 and 2 constrain an RTL-TLD to contain only characters
with bidi class R or only with bidi class AL, this rule forbids domain
names such as "ABC-DE" although hyphen is allowed even in LDH labels. I
think that rule 3 should be relaxed to allow innocuous characters to
appear at innocuous locations (inside a label and not at its ends).
I leave it to you to define what "innocuous characters" are. At first
glance, I would say anything except characters with bidi class L, but this
needs some more reflection.
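One possible reading of this relaxation, as a minimal Python sketch. It assumes that "innocuous" means any character whose bidi class is not L, and that the first and last characters of a label must still have the bidi class of the TLD. The function name `label_ok_relaxed` is hypothetical, and the exact definition remains open as stated above.

```python
import unicodedata

def label_ok_relaxed(label, tld_class):
    """Check a label against a relaxed rule 3 (assumed interpretation).

    tld_class is the bidi class of the RTL-TLD, either "R" or "AL".
    """
    classes = [unicodedata.bidirectional(ch) for ch in label]
    if not classes:
        return False
    # The ends of the label must match the bidi class of the TLD.
    if classes[0] != tld_class or classes[-1] != tld_class:
        return False
    # Assumption: any character other than class L is innocuous inside the label.
    return all(c != "L" for c in classes)

# Hebrew letters (class R) with a hyphen inside the label, like "ABC-DE".
print(label_ok_relaxed("\u05d0\u05d1\u05d2-\u05d3\u05d4", "R"))
```

With this reading, a hyphen inside the label passes, while a hyphen at either end or a Latin letter anywhere in the label is rejected.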
Shalom (Regards), Mati
Bidi Architect
Globalization Center Of Competency - Bidirectional Scripts
IBM Israel
Mobile: +972 52 2554160
From: Adil Allawi <adil@diwan.com>
To: "public-iri@w3.org" <public-iri@w3.org>
Cc: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Date: 05/11/2011 23:24
Subject: Bidi IRI with a Bidi TLD
Dear all,
Following is my suggestion for a new section in the bidi IRI document.
This is to improve usability. The main point of this proposal is that the
domain and TLD always appear together in a URL so that a user can read,
enter, highlight and copy it. Also, so that a user looking at a bidi URL
will always recognize the domain part.
The restriction I propose applies only to the main domain, that is, the
name registered with the domain registrar (e.g. the "google" in
"google.com"), not to the subdomain (e.g. the "translate" in
"translate.google.com").
Adil
------
Restrictions on domain names for Top Level Domains (TLDs)
Definition: Right-To-Left Top Level Domains (RTL-TLD). These are top-level
domains that are in languages using right-to-left characters. Namely, the
Unicode bidi class of the characters that make up the TLD is either R or
AL (see UAX #9).
Because an IRI must always be rendered left-to-right (see section 2), there
are a number of cases where an RTL-TLD renders in such a way that it is
visually unclear which part of a particular URL is the TLD. For example:
Logical representation: http://abc.def.GHI/JKL
Visual representation: http://abc.def.LKJ/IHG
In the above case, the path appears after the registered domain and is in
the visual location of the TLD. This can confuse the reader as to which is
the actual TLD. To restrict such confusing cases, the following rules
apply:
1. An RTL-TLD is a TLD which is in a language where the characters are
drawn right to left. An LTR-TLD is a TLD which is in a language where the
characters are drawn left to right.
2. The characters in an RTL-TLD MUST always be of the same Unicode bidi
class.
3. The characters of a registered domain MUST match the Unicode bidi
class of the TLD if the TLD is an RTL-TLD.
4. If the characters of a registered domain contain more than one bidi
class, the domain MUST be registered to an LTR-TLD.
The MUST-level restrictions guarantee that the registered domain and its
corresponding TLD will always appear together and in the same order in all
possible IRIs. Numbers and bidi-neutral characters may be reordered by the
Unicode bidi algorithm in a way that changes their visual position relative
to the TLD. The above rules prevent such cases. If the domain registrar
needs to register a name that contains characters of mixed direction (e.g.
one that contains numbers, punctuation or LTR characters), then the domain
can still be registered with a TLD that has left-to-right characters.
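The rules above can be sketched as a small check in Python. This is only an illustration under assumptions: labels are given in their Unicode form, a TLD counts as an LTR-TLD when it contains no character of class R or AL, and the function names `tld_kind` and `registration_allowed` are hypothetical.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}

def classes_of(label):
    """Return the set of Unicode bidi classes used in a label."""
    return {unicodedata.bidirectional(ch) for ch in label}

def tld_kind(tld):
    """Return "R" or "AL" for an RTL-TLD that satisfies rule 2,
    "LTR" for a TLD without RTL characters (assumption), else None."""
    cls = classes_of(tld)
    if not cls:
        return None
    if cls & RTL_CLASSES:
        # Rule 2: all characters of an RTL-TLD share one bidi class.
        return next(iter(cls)) if len(cls) == 1 else None
    return "LTR"

def registration_allowed(domain, tld):
    """Apply rules 3 and 4 to a registered domain and its TLD."""
    kind = tld_kind(tld)
    if kind is None:
        return False
    if kind == "LTR":
        # Rules 3 and 4 place no restriction on domains under an LTR-TLD.
        return True
    # Rule 3 (strict reading): every character matches the TLD's class.
    return classes_of(domain) == {kind}

hebrew_tld = "\u05d3\u05d4"
print(registration_allowed("\u05d0\u05d1\u05d2", hebrew_tld))   # same class as the TLD
print(registration_allowed("\u05d0\u05d1-\u05d2", hebrew_tld))  # contains a hyphen
print(registration_allowed("\u05d0\u05d1-\u05d2", "com"))       # mixed name, LTR-TLD
```

Under this strict reading, a name containing a hyphen or a digit can only be registered under an LTR-TLD, which is what rule 4 describes.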
Examples:
A. This is a good case - the TLD is visually followed by the domain:
Logical representation: http://ABC.DEF.GHI/jkl
Visual representation: http://IHG.FED.CBA/jkl
B. With an LTR sub-domain there is a sub-optimal case where the path
appears next to the sub-domain. In this case, however, it is still clear
where the TLD and registered domain are in the IRI:
Logical representation: http://abc.DEF.GHI/JKL
Visual representation: http://abc.LKJ/IHG.FED
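The visual representations in this proposal can be reproduced with a deliberately simplified Python sketch. It is not the full Unicode bidi algorithm: it assumes the convention used in the examples (uppercase ASCII letters stand for RTL characters, lowercase for LTR), an overall left-to-right direction, and that any non-letter between two RTL letters joins the RTL run. Digits are treated like punctuation here, which does not match the real algorithm. The name `visual` is hypothetical.

```python
import re

# A pseudo-RTL run starts and ends with an uppercase letter and contains
# only uppercase letters and non-letter characters in between.
RTL_RUN = re.compile(r"[A-Z](?:[^A-Za-z]*[A-Z])*")

def visual(logical):
    """Reverse every pseudo-RTL run; everything else stays in place."""
    return RTL_RUN.sub(lambda m: m.group(0)[::-1], logical)

for iri in ("http://abc.def.GHI/JKL",
            "http://ABC.DEF.GHI/jkl",
            "http://abc.DEF.GHI/JKL"):
    print(iri, "->", visual(iri))
```

In the first and third IRIs the slash sits between two RTL runs and is absorbed into them, which is why the path ends up to the left of the TLD.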
Received on Sunday, 6 November 2011 08:26:44 UTC