- From: Matitiahu Allouche <matial@il.ibm.com>
- Date: Sun, 6 Nov 2011 10:25:32 +0200
- To: Adil Allawi <adil@diwan.com>
- Cc: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "public-iri@w3.org" <public-iri@w3.org>
- Message-ID: <OF2CBA8F1B.87F196A5-ONC2257940.002C8366-C2257940.002E50A6@il.ibm.com>
Hello, Adil!

I tend to like your proposal, but I have not yet formed a definitive opinion. However, I can already comment that your third rule seems to me too restrictive. You wrote:

    3. The characters of a registered domain MUST match the Unicode bidi class of the TLD if the TLD is an RTL-TLD.

Since your rules 1 and 2 constrain an RTL-TLD to contain only characters with bidi class R or only characters with bidi class AL, this rule forbids domain names such as "ABC-DE", although the hyphen is allowed even in LDH labels.

I think that rule 3 should be relaxed to allow innocuous characters to appear at innocuous locations (inside a label and not at its ends). I leave it to you to define what "innocuous characters" are. At first glance, I would say anything except characters with bidi class L, but this needs some more reflection.

Shalom (Regards),
Mati
Bidi Architect
Globalization Center Of Competency - Bidirectional Scripts
IBM Israel
Mobile: +972 52 2554160

From: Adil Allawi <adil@diwan.com>
To: "public-iri@w3.org" <public-iri@w3.org>
Cc: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Date: 05/11/2011 23:24
Subject: Bidi IRI with a Bidi TLD

Dear all,

Following is my suggestion for a new section in the bidi IRI document. This is to improve usability.

The main point of this proposal is that the domain and TLD always appear together in a URL, so that a user can read, enter, highlight and copy it, and so that a user looking at a bidi URL will always recognize the domain part.

The restriction I propose applies only to the main domain, e.g. the "google" in "google.com", not to the subdomain, e.g. the "translate" in "translate.google.com". That is, it applies to the name registered with the domain registrar.

Adil

------

Restrictions on domain names for Top Level Domains (TLDs)

Definition: Right-To-Left Top Level Domains (RTL-TLD). These are top-level domains in languages written with right-to-left characters; that is, the Unicode bidi class of the characters that make up the TLD is either R or AL (see UAX 9).
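To make the definition above concrete, and to contrast rule 3 as quoted in the reply with the relaxation suggested there, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names (`rtl_tld_class`, `strict_rule3`, `relaxed_rule3`) are hypothetical, "innocuous" is taken literally from the reply's first-glance suggestion (any character whose bidi class is not L), and the labels are assumed to be already in Unicode form rather than Punycode.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}  # bidi classes named in the RTL-TLD definition


def rtl_tld_class(tld: str):
    """Return 'R' or 'AL' if all characters of tld share that one class, else None."""
    classes = {unicodedata.bidirectional(ch) for ch in tld}
    if len(classes) == 1:
        (cls,) = classes
        if cls in RTL_CLASSES:
            return cls
    return None


def strict_rule3(label: str, tld: str) -> bool:
    """Rule 3 as proposed: every character of the label has the TLD's bidi class."""
    cls = rtl_tld_class(tld)
    if cls is None:
        return True  # assumption: rule 3 only constrains RTL-TLDs
    return all(unicodedata.bidirectional(ch) == cls for ch in label)


def relaxed_rule3(label: str, tld: str) -> bool:
    """Hypothetical relaxation: the ends must match the TLD's class,
    interior characters only need to be 'innocuous' (here: not class L)."""
    cls = rtl_tld_class(tld)
    if cls is None:
        return True
    if not label:
        return False
    ends_ok = (unicodedata.bidirectional(label[0]) == cls
               and unicodedata.bidirectional(label[-1]) == cls)
    interior_ok = all(unicodedata.bidirectional(ch) != "L" for ch in label[1:-1])
    return ends_ok and interior_ok


# Hebrew letters (class R) written as escapes to keep the source easy to read.
tld = "\u05e7\u05d5\u05dd"
label = "\u05d0\u05d1\u05d2-\u05d3\u05d4"  # the "ABC-DE" case: a hyphen inside the label
print(strict_rule3(label, tld), relaxed_rule3(label, tld))
```

With these assumptions, the hyphenated label is rejected by the strict check and accepted by the relaxed one, which is the case the reply raises. Whether class AL characters inside a class R label should also count as innocuous is left open here, as it is in the reply.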
As an IRI must always be rendered left-to-right (see section 2), there are a number of cases where an RTL-TLD renders in a way that makes it visually unclear which part of a particular URL is the TLD. For example:

Logical representation: http://abc.def.GHI/JKL
Visual representation:  http://abc.def.LKJ/IHG

In the above case the path appears after the registered domain, in the visual location of the TLD. This can confuse the reader as to which is the actual TLD.

In order to restrict such confusing cases the following rules will apply:

1. An RTL-TLD is a TLD in a language whose characters are drawn right to left. An LTR-TLD is a TLD in a language whose characters are drawn left to right.
2. The characters in an RTL-TLD MUST always be of the same Unicode bidi class.
3. The characters of a registered domain MUST match the Unicode bidi class of the TLD if the TLD is an RTL-TLD.
4. If the characters of a registered domain contain more than one bidi class, the domain MUST be registered to an LTR-TLD.

The MUST restrictions guarantee that the registered domain and its corresponding TLD will always appear together and in the same order in all possible IRIs. There are cases where numbers and bidi-neutral characters are reordered by the Unicode bidi algorithm in a way that changes their visual position relative to the TLD. The above rules prevent such cases.

If the domain registrar needs to register a name that contains characters of mixed direction (e.g. numbers, punctuation or LTR characters), then the domain can still be registered with a TLD that has left-to-right characters.

Examples:

A. This is a good case - the TLD is visually followed by the domain:

Logical representation: http://ABC.DEF.GHI/jkl
Visual representation:  http://IHG.FED.CBA/jkl

B. With an LTR second level domain there is a sub-optimal case where the path appears next to the sub-domain. But in this case it is still clear where the TLD and registered domain are in the IRI:

Logical representation: http://abc.DEF.GHI/JKL
Visual representation:  http://abc.LKJ/IHG.FED
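
The logical-to-visual mappings in the examples can be reproduced with a small Python sketch. This is a deliberately simplified model, not the Unicode bidi algorithm of UAX 9: it assumes the usual pseudo-bidi convention that uppercase ASCII letters stand for right-to-left characters and lowercase letters for left-to-right ones, treats everything else as neutral, and ignores numbers, weak types, mirroring and embeddings. The function name `visual_order` is hypothetical.

```python
def visual_order(logical: str) -> str:
    """Simplified reordering for an overall left-to-right line.

    Assumption: uppercase = RTL, lowercase = LTR, all other characters neutral.
    """
    def strong(ch):
        if ch.isupper():
            return "R"
        if ch.islower():
            return "L"
        return None  # neutral

    n = len(logical)
    types = [strong(ch) for ch in logical]
    resolved = list(types)

    # Resolve each run of neutrals: R only if both neighbouring strong
    # characters are R, otherwise the (left-to-right) base direction.
    i = 0
    while i < n:
        if types[i] is None:
            j = i
            while j < n and types[j] is None:
                j += 1
            before = types[i - 1] if i > 0 else "L"
            after = types[j] if j < n else "L"
            fill = "R" if before == "R" and after == "R" else "L"
            for k in range(i, j):
                resolved[k] = fill
            i = j
        else:
            i += 1

    # Reverse every maximal right-to-left run; keep left-to-right runs as is.
    out = []
    i = 0
    while i < n:
        j = i
        while j < n and resolved[j] == resolved[i]:
            j += 1
        run = logical[i:j]
        out.append(run[::-1] if resolved[i] == "R" else run)
        i = j
    return "".join(out)


for iri in ("http://abc.def.GHI/JKL",
            "http://ABC.DEF.GHI/jkl",
            "http://abc.DEF.GHI/JKL"):
    print(iri, "->", visual_order(iri))
```

The key step is the neutral resolution: the "/" between an RTL TLD and an RTL path is absorbed into a single right-to-left run, which is why the path ends up visually adjacent to, or in place of, the TLD.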
Received on Sunday, 6 November 2011 08:26:44 UTC