- From: Matitiahu Allouche <matial@il.ibm.com>
- Date: Tue, 2 Jan 2007 15:43:22 +0200
- To: Martin Duerst <duerst@it.aoyama.ac.jp>
- Cc: Cary Karp <ck@nic.museum>, idna-update@alvestrand.no, public-i18n-core@w3.org, public-iri@w3.org, public-iri-request@w3.org
Martin Duerst wrote:

<quote>
Checking RFC 3987, I also found that the text there may need to be clarified (it needs to be updated to take into account combining marks at the end of components anyway). [cc: the IRI mailing list]

It currently says:

1. A component SHOULD NOT use both right-to-left and left-to-right characters.
2. A component using right-to-left characters SHOULD start and end with right-to-left characters.

I think that at least should be changed to:

1. A component SHOULD NOT use both right-to-left and left-to-right letters.
2. A component using right-to-left characters SHOULD start and end with right-to-left letters.
<end of quote>

While I fully agree with Martin's intent, I am not sure that the proposed changed text accomplishes its purpose.

First, changing the wording of rule 1 from "characters" to "letters" allows mixing in the same component LTR and RTL characters which are not letters, leading to confusion for readers who are not expert in the Unicode Bidirectional Algorithm.

Secondly, the new wording of rule 2 does not allow combining marks at the end of components.

I suggest the following.

a) Leave rule 1 as is:

1. A component SHOULD NOT use both right-to-left and left-to-right characters.

b) Change rule 2 as follows:

2. A component using right-to-left characters SHOULD start with a right-to-left letter and end with a right-to-left letter optionally followed by combining marks.

Shalom (Regards),
Mati

Bidi Architect
Globalization Center Of Competency - Bidirectional Scripts
IBM Israel
Phone: +972 2 5888802
Fax: +972 2 5870333
Mobile: +972 52 2554160
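For illustration only, the two rules as worded in the suggestion above could be checked roughly as follows. This is a sketch under stated assumptions, not text from RFC 3987: the function names `is_rtl_letter` and `check_component` are hypothetical, "right-to-left character" is taken to mean Unicode bidi class R or AL, "left-to-right character" to mean bidi class L, "letter" to mean general category L*, and "combining mark" to mean general category M*.

```python
import unicodedata

# Assumption: "right-to-left" means strong bidi classes R and AL,
# and "left-to-right" means strong bidi class L.
RTL_CLASSES = {"R", "AL"}
LTR_CLASSES = {"L"}


def is_rtl_letter(ch: str) -> bool:
    """True if ch is a letter (general category L*) with bidi class R or AL."""
    return (unicodedata.bidirectional(ch) in RTL_CLASSES
            and unicodedata.category(ch).startswith("L"))


def check_component(component: str) -> list[str]:
    """Return the names of the suggested rules that the component violates."""
    classes = [unicodedata.bidirectional(ch) for ch in component]
    has_rtl = any(c in RTL_CLASSES for c in classes)
    has_ltr = any(c in LTR_CLASSES for c in classes)

    violations = []

    # Rule 1: do not mix right-to-left and left-to-right characters.
    if has_rtl and has_ltr:
        violations.append("rule 1")

    # Rule 2: applies only to components that use right-to-left characters.
    if has_rtl:
        # Ignore combining marks (general category M*) at the end.
        core = component
        while core and unicodedata.category(core[-1]).startswith("M"):
            core = core[:-1]
        if not (core and is_rtl_letter(component[0]) and is_rtl_letter(core[-1])):
            violations.append("rule 2")

    return violations
```

Under these assumptions, a Hebrew letter followed by a trailing point (for example U+05D1 followed by U+05BC) passes both rules, while a Hebrew component ending in a European digit violates rule 2 only.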
Received on Tuesday, 2 January 2007 13:43:43 UTC