RE: [bidi] Special ordering for BIDI URLs

From: Phillips, Addison <addison@lab126.com>
Date: Fri, 28 May 2010 10:37:03 -0700
To: Ted Hardie <ted.ietf@gmail.com>, Slim Amamou <slim@alixsys.com>
CC: Mark Davis ☕ <mark@macchiato.com>, Shawn Steele <Shawn.Steele@microsoft.com>, Adil Allawi <adil@diwan.com>, "public-iri@w3.org" <public-iri@w3.org>, "bidi@unicode.org" <bidi@unicode.org>, Murray Sargent <murrays@exchange.microsoft.com>, "aharon@google.com" <aharon@google.com>, Nasser Kettani <Nasser.Kettani@microsoft.com>
Message-ID: <C7A5719F1E562149BA9171F58BEE2CA4129E2E264A@EX-IAD6-B.ant.amazon.com>
> Obviously I'm not getting the problem out very well, my apologies.
> If someone wants a URI like the following: http://shop45.example/
> but in an RTL context, what do you expect them to register/include
> in the DNS zone?
>
> The "45" here is in the same DNS label as "shop"--no dots are
> present. (shop45.franchisestore.example would be a bit more common,
> but there is nothing to prevent the inclusion of both numerals and
> other characters in many zones).
>
> The discussion to date has talked about an algorithm that reverses
> the whole string. But I'm getting the impression that this is
> shorthand for an algorithm that reverses the string except for
> well-known exceptions like the LTR numerals. Is that correct?

Hi Ted,

I'm not sure I understand correctly, but I think you're asking: what if you have SHOP45 (where "SHOP" consists of strongly RTL characters)?

The answer, I think, is that the Unicode Bidirectional Algorithm does the right (expected) thing here. You want the RTL characters presented right-to-left while any LTR characters are presented left-to-right. A given token's presentation order might differ considerably from its logical order, but this is expected and not potentially harmful, because you cannot spoof within the token (it is token reordering that's a problem).

That is:

   SHOP45.mycompany.com presents as: 45POHS.mycompany.com

   FIRST123SECOND.example.il presents as: DNOCES123TSRIF.example.il
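
The two presentations above can be reproduced with a small sketch. It is not an implementation of the Unicode Bidirectional Algorithm, only a simplified model under stated assumptions: uppercase ASCII letters stand in for strongly RTL characters (the convention used in the examples), lowercase letters are LTR, the paragraph direction is LTR, and digits following RTL text form their own left-to-right run. The function names `levels` and `present` are hypothetical.

```python
def levels(text):
    """Assign simplified embedding levels, assuming an LTR paragraph.

    Assumption: uppercase ASCII stands in for strongly RTL characters.
    """
    kinds = []
    last_strong = "L"
    for ch in text:
        if ch.isalpha():
            kind = "R" if ch.isupper() else "L"
            last_strong = kind
        elif ch.isdigit():
            # Digits after LTR text behave like LTR text; otherwise they
            # form a number run that keeps its own left-to-right order.
            kind = "L" if last_strong == "L" else "N"
        else:
            kind = "O"  # neutral, e.g. "." ":" "/"
        kinds.append(kind)

    # Resolve each neutral: it joins the RTL text only when the nearest
    # non-neutral character on both sides is RTL or a number run.
    resolved = list(kinds)
    for i, kind in enumerate(kinds):
        if kind != "O":
            continue
        left = next((k for k in reversed(kinds[:i]) if k != "O"), "L")
        right = next((k for k in kinds[i + 1:] if k != "O"), "L")
        resolved[i] = "R" if left in "RN" and right in "RN" else "L"

    return [{"L": 0, "R": 1, "N": 2}[k] for k in resolved]


def present(text):
    """Return the visual order of `text` under the simplified model."""
    lv = levels(text)
    chars = list(text)
    # From the highest level down to 1, reverse every contiguous run of
    # characters whose level is at least the current level.
    for level in range(max(lv, default=0), 0, -1):
        i = 0
        while i < len(chars):
            if lv[i] >= level:
                j = i
                while j < len(chars) and lv[j] >= level:
                    j += 1
                chars[i:j] = reversed(chars[i:j])
                i = j
            else:
                i += 1
    return "".join(chars)


print(present("SHOP45.mycompany.com"))
print(present("FIRST123SECOND.example.il"))
print(present("http://shop45.example/"))
```

In this model the reordering stays inside the run that contains RTL characters; the scheme and the all-LTR labels are left where they are.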

We do NOT want to reverse the whole string (it isn't 'ptth', it's 'http'). We need to present "runs" in their contextual order but reverse the layout of the runs themselves (or not, as some have contended) in the resultant URI.


Addison Phillips
Globalization Architect (Lab126)
Chair (W3C I18N, IETF IRI WGs)

Internationalization is not a feature.
It is an architecture.

Received on Friday, 28 May 2010 17:37:43 UTC
