RE: Special ordering for BIDI URLs

From: John C Klensin <john-ietf@jck.com>
Date: Tue, 25 May 2010 14:39:28 -0400
To: Jonathan Rosenne <rosennej@qsm.co.il>, 'Slim Amamou' <slim@alixsys.com>
cc: "'Mark Davis ?'" <mark@macchiato.com>, public-iri@w3.org, bidi@unicode.org, 'Shawn Steele' <Shawn.Steele@microsoft.com>, 'Murray Sargent' <murrays@exchange.microsoft.com>, aharon@google.com
Message-ID: <7F8FEE8BA5A622042AA797C4@PST.JCK.COM>


--On Tuesday, May 25, 2010 13:51 +0300 Jonathan Rosenne
<rosennej@qsm.co.il> wrote:

> It certainly is a misunderstanding. A kid in Egypt or Israel
> who has not yet learnt a second language should be able to use
> the internet in his own language and script, i.e. exclusively
> RTL.

Jony (and others),

In principle, I agree.

In practice, this opens up several groups of problems.  I do not
expect us to reach agreement on solutions (or even whether
solutions are needed), but I think it would be helpful if we
could agree on the nature of the problems / difficulties.  I've
got a bias about the right answer -- almost everyone who has
thought about the issues does even though their/our conclusions
differ -- but I'm going to try to write what follows as
neutrally as I can.

(1) One can optimize for identifiers (including, but not limited
to URIs/ IRIs) that make good intuitive sense for people without
much computer sophistication and without a global perspective.
I assume your "kid ... who has not yet learnt a second language"
would fall into that category, but I think it is broader than
just those kids.  Doing that optimization implies identifiers
that are not globally usable, at least for other people of the
same type but from different cultures, since conventions and
assumptions differ.  And, of course, normally RtoL environments
aren't the only issue.  Some would argue that matching of
Simplified and Traditional Chinese; US and British spelling of
English; matching of Kana and Kanji or Hangul and Hanja spelling
of strings; matching Eastern Arabic-Indic, Arabic-Indic, and
European digits; and so on are equivalent problems in which an
unsophisticated user may have different (but entirely reasonable
to themselves) expectations from someone with a better
understanding of how things work.
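
To make the digit case above concrete, here is a minimal Python sketch.
It is only an illustration of the matching question, not a description
of what any IRI or domain-name protocol does.  The function name
fold_digits and the sample label "room123" are assumptions made up for
this example.  It shows that the "same" identifier written with
European, Arabic-Indic, and Eastern Arabic-Indic digits is three
different strings unless some layer folds the digits together.

```python
import unicodedata


def fold_digits(label: str) -> str:
    """Replace each Unicode decimal digit with its ASCII equivalent.

    Hypothetical helper for illustration only; characters that are not
    decimal digits are passed through unchanged.
    """
    out = []
    for ch in label:
        # unicodedata.decimal returns the digit value, or the default
        # (None here) when the character is not a decimal digit.
        value = unicodedata.decimal(ch, None)
        out.append(str(value) if value is not None else ch)
    return "".join(out)


european = "room123"
arabic_indic = "room\u0661\u0662\u0663"   # Arabic-Indic digits 1, 2, 3
eastern = "room\u06F1\u06F2\u06F3"        # Eastern (Extended) Arabic-Indic digits

# Compared code point by code point, the three labels differ.
print(european == arabic_indic, european == eastern)

# After folding, all three compare equal.
print(fold_digits(european) == fold_digits(arabic_indic) == fold_digits(eastern))
```

Whether such folding should happen at all, and at which layer, is
exactly the kind of question on which the two optimizations disagree.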

(2) One can optimize for globally-useful identifiers.  Doing so
makes export of identifiers from one environment to another much
easier and more obvious.  It makes it far easier to construct
search engines that work globally, browsers and other
applications software that are largely locale-independent, and
so on.  By requiring that the same identifiers work everywhere,
it makes it far easier for people who travel to faraway places
and borrow machines or use local kiosks to access the Internet.
In some contexts, those advantages are probably more about
"possible" than they are about "easier".  But the price is that
things require more learning and become a lot less intuitive for
much of the world's population, including all of those who are
the greatest beneficiaries of the first optimization.  For
historical reasons (at least), the further one's language or
writing system is from Western European Latin-based forms, the
less intuitive and more difficult the obvious global identifiers
are likely to seem (although I was recently told, quite
convincingly, that we could solve many of our problems by
changing our global identifier script from Basic Latin to
Hangul).

(3) It is not clear that there is a middle ground.  Certainly it
is hard to deduce one from the positions taken by the passionate
advocates of one or the other of the optimizations above.  Some
of those who think there is such a position say things about
global identifiers that are not routinely seen by end users and
that can be localized by some sort of layering mechanism.  While
several such proposals have been sketched out, none have gained
traction, in part because the one thing the advocates of the two
optimizations above usually agree on is that they don't like
such middle grounds.

The problem is very hard and I've gradually gotten pessimistic
about whether real progress is possible (at least before things
get worse).  But it has become clear to me that the difference
between those first two optimizations rests on rather
fundamental philosophical assumptions and that trying to
persuade people from one camp of the rightness of the positions
of the other by citing the needs of children, people without
Latin characters on their keyboards, or the horrors of a world
in which some URIs/IRIs (or some email addresses, etc.) are
inaccessible to lots of people is not working well... or at all.

best,
   john
Received on Tuesday, 25 May 2010 18:40:14 GMT