Re: Bidi: is stringprep broken?

From: Roy Badami <roy@gnomon.org.uk>
Date: Mon, 8 Sep 2003 11:26:29 +0100
Message-ID: <16220.22869.152563.294583@moriarty.gnomon.org.uk>
To: "Matitiahu Allouche" <matial@il.ibm.com>
Cc: Roy Badami <roy@gnomon.org.uk>, ietf-imaa@imc.org, public-iri@w3.org

Matitiahu Allouche writes:
 > According to my understanding, and to testing against the Unicode C
 > reference implementation, you are correct in stating that the 2
 > strings ("A-123,456B" and "A456,-123B") will give the same display
 > according to the Unicode algorithm for Bidirectional text.

Thanks for verifying it.  Though it's still possible that I'm mistaken
about it passing nameprep.

 > You will admit that your example is more than a little contrived. 

Yes, and it's probably unlikely ever to be registerable, since it
involves punctuation (and not only that, but punctuation associated
with the wrong script).

My other example (that ARAB.123.com and 123.ARAB.com render the same)
worries me more.
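
For anyone who wants to look at the characters involved, here is a
minimal sketch in Python. It only lists the Unicode bidi class of each
character in logical order using the standard unicodedata module; it
does not run the bidirectional algorithm or render anything. The four
Arabic letters are an arbitrary, hypothetical stand-in for the "ARAB"
placeholder, and the helper name bidi_classes is mine.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidi class of each character, in logical order."""
    return [unicodedata.bidirectional(ch) for ch in text]


# Hypothetical stand-in for the "ARAB" placeholder: four Arabic letters.
ARAB = "\u0639\u0631\u0628\u064a"

first = ARAB + ".123.com"
second = "123." + ARAB + ".com"

# Print the class sequence of each name, one name per line.
for name in (first, second):
    print(" ".join(bidi_classes(name)))
```

Both names are built from the same ingredients: a run of AL (Arabic
letter) characters, a run of EN (European number) digits, and FULL
STOP, which has class CS rather than a strong direction.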

 > I can find other examples of names allowed by the rules which can
 > mislead users trying to induce the logical order based on the
 > display order.  All of these examples are quite bizarre.

I'd be interested in the examples you have.

 > By the way, can you give a reference to "UseSTD13ASCIIRules", for
 > an ignoramus like myself?

RFC 3490.  When the UseSTD13ASCIIRules flag is set, ASCII characters
other than alphanumerics and HYPHEN-MINUS are prohibited, in
accordance with traditional hostname rules.  Hence my use of ARABIC
COMMA; I needed a character that was a number separator, was
non-ASCII, and didn't have a compatibility decomposition (since IDNA
uses NFKC).
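
The three properties can be checked with a short Python sketch using
the standard unicodedata module. I am assuming here that "number
separator" corresponds to the bidi class CS (Common Number Separator).
The helper is_non_ldh_ascii is a hypothetical name and only
approximates the character test applied under UseSTD13ASCIIRules; it
is not a full ToASCII implementation.

```python
import unicodedata

ARABIC_COMMA = "\u060c"


def is_non_ldh_ascii(ch):
    """True for an ASCII character other than a letter, digit or HYPHEN-MINUS."""
    return ord(ch) < 0x80 and not (ch.isalnum() or ch == "-")


def separator_properties(ch):
    """Collect the properties discussed above for a single character."""
    return {
        "name": unicodedata.name(ch),
        "bidi_class": unicodedata.bidirectional(ch),
        "is_ascii": ord(ch) < 0x80,
        # An empty string means the character has no decomposition mapping.
        "decomposition": unicodedata.decomposition(ch),
        "unchanged_by_nfkc": unicodedata.normalize("NFKC", ch) == ch,
    }


for ch in (",", ARABIC_COMMA):
    print(separator_properties(ch), "non-LDH ASCII:", is_non_ldh_ascii(ch))
```

ASCII COMMA is also class CS, but it is a non-LDH ASCII character and
so would be rejected when the flag is set; ARABIC COMMA is outside the
ASCII range and is left untouched by NFKC.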

Received on Monday, 8 September 2003 06:26:42 UTC
