- From: Matitiahu Allouche <matial@il.ibm.com>
- Date: Tue, 2 Jan 2007 15:43:22 +0200
- To: Martin Duerst <duerst@it.aoyama.ac.jp>
- Cc: Cary Karp <ck@nic.museum>, idna-update@alvestrand.no, public-i18n-core@w3.org, public-iri@w3.org, public-iri-request@w3.org
Martin Duerst wrote:
<quote>
Checking RFC 3987, I also
found that the text there may need to be clarified (it needs to
be updated to take into account combining marks at the end of
components anyway). [cc: the IRI mailing list]
It currently says:
1. A component SHOULD NOT use both right-to-left and left-to-right
characters.
2. A component using right-to-left characters SHOULD start and end
with right-to-left characters.
I think that at least should be changed to:
1. A component SHOULD NOT use both right-to-left and left-to-right
letters.
2. A component using right-to-left characters SHOULD start and end
with right-to-left letters.
<end of quote>
While I fully agree with Martin's intent, I am not sure that the proposed
revised text accomplishes its purpose.
First, changing the wording of rule 1 from "characters" to "letters"
allows LTR and RTL characters that are not letters to be mixed in the same
component, which would confuse readers who are not experts in the
Unicode Bidirectional Algorithm.
Second, the new wording of rule 2 does not allow combining marks at the
end of components, which is the very case the update is meant to cover.
I suggest the following.
a) Leave rule 1 as is:
1. A component SHOULD NOT use both right-to-left and left-to-right
characters.
b) Change rule 2 as follows:
2. A component using right-to-left characters SHOULD start with a
right-to-left letter and end with a right-to-left letter
optionally followed by combining marks.
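
To make the suggested wording concrete, here is a minimal sketch of a
checker for rules 1 and 2 as worded above. It is an illustration only, not
text proposed for RFC 3987. It assumes that "right-to-left characters"
means characters of the strong bidi classes R and AL, that "left-to-right
characters" means bidi class L, that "letter" means general category L*,
and that "combining mark" means general category M*. The function names
are hypothetical.

```python
import unicodedata

# Assumption: "right-to-left" = strong bidi classes R and AL,
# "left-to-right" = strong bidi class L.
RTL_BIDI = {"R", "AL"}
LTR_BIDI = {"L"}


def is_rtl_letter(ch):
    """True if ch is a letter (general category L*) with bidi class R or AL."""
    return (unicodedata.bidirectional(ch) in RTL_BIDI
            and unicodedata.category(ch).startswith("L"))


def violated_rules(component):
    """Return the numbers of the rules (1 and/or 2) that component violates."""
    bidi = [unicodedata.bidirectional(ch) for ch in component]
    has_rtl = any(b in RTL_BIDI for b in bidi)
    has_ltr = any(b in LTR_BIDI for b in bidi)

    violations = []
    # Rule 1: no mixing of right-to-left and left-to-right characters.
    if has_rtl and has_ltr:
        violations.append(1)
    # Rule 2: start with an RTL letter, end with an RTL letter
    # optionally followed by combining marks (general category M*).
    if has_rtl:
        core = component
        while core and unicodedata.category(core[-1]).startswith("M"):
            core = core[:-1]
        if not (core and is_rtl_letter(core[0]) and is_rtl_letter(core[-1])):
            violations.append(2)
    return violations


print(violated_rules("\u05e9\u05dc\u05d5\u05dd"))  # Hebrew letters only
print(violated_rules("\u05d0\u05b8"))              # alef + qamats (mark at end)
print(violated_rules("abc\u05d0"))                 # Latin letters, then alef
print(violated_rules("\u05d01"))                   # alef, then European digit
```

Under these assumptions the first two components pass, the third
violates both rules, and the fourth violates rule 2 only, because the
digit is neither a right-to-left letter nor a combining mark.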
Shalom (Regards), Mati
Bidi Architect
Globalization Center Of Competency - Bidirectional Scripts
IBM Israel
Phone: +972 2 5888802 Fax: +972 2 5870333 Mobile: +972 52 2554160
Received on Tuesday, 2 January 2007 13:43:43 UTC