
Re: Special ordering for BIDI URLs

From: John C Klensin <john-ietf@jck.com>
Date: Tue, 25 May 2010 12:16:11 -0400
To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, Mark Davis ☕ <mark@macchiato.com>
cc: public-iri@w3.org, bidi@unicode.org, Shawn Steele <Shawn.Steele@microsoft.com>, Murray Sargent <murrays@exchange.microsoft.com>, aharon@google.com
Message-ID: <A3850983ECE38767669F23AA@PST.JCK.COM>

--On Tuesday, May 25, 2010 19:47 +0900 "\"Martin J. Dürst\""
<duerst@it.aoyama.ac.jp> wrote:

> [I'm not sure what the IP address has to do in a discussion on
> schemes, I'll comment on the scheme only here.]

Because, depending on how one reads/interprets it, an IP address
that we know as 12.34.56.78 (net 12, etc.) could appropriately
be written that way or as 78.56.34.12 (perhaps in some
appropriate non-European digits, so, e.g., ٧٨.٥٦.٣٤.١٢ ).
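
A minimal sketch of that ambiguity, in Python. It assumes the address
arrives as a dotted string in logical (storage) order; the helper names
`to_ascii_digits` and `candidate_readings` are hypothetical and exist
only for illustration. It normalizes non-European decimal digits to
ASCII and returns both octet orders a reader might plausibly intend.

```python
import unicodedata


def to_ascii_digits(text: str) -> str:
    """Replace every Unicode decimal digit with its ASCII equivalent."""
    out = []
    for ch in text:
        value = unicodedata.decimal(ch, None)  # None for non-digit characters
        out.append(str(value) if value is not None else ch)
    return "".join(out)


def candidate_readings(address: str) -> tuple[str, str]:
    """Return the address as stored and with its octet order reversed.

    Nothing in the string itself says which reading was intended;
    that is the ambiguity under discussion.
    """
    octets = to_ascii_digits(address).split(".")
    return ".".join(octets), ".".join(reversed(octets))


# Arabic-Indic digits, written with escapes so the storage order is explicit.
stored = "\u0667\u0668.\u0665\u0666.\u0663\u0664.\u0661\u0662"
as_stored, reversed_order = candidate_readings(stored)
print(as_stored, reversed_order)
```

Only one of the two readings can be "net 12", and software cannot tell
which one the writer meant without knowing the convention in use.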

> In a short hallway discussion at the Hiroshima IETF, John
> Klensin and I discussed the possibility of allowing non-ASCII
> scheme names, but strictly limiting these to RTL scripts in
> practical use. If such a limitation were politically
> acceptable, it would provide a means to make RTL IRIs more
> consistent while avoiding an explosion of scheme identifiers.

I have my doubts (see below).  And, for the record, I intended
to discuss it only in the context of situations or perceived
problems that IRIs would not solve.

> However, we were both skeptical about the political
> feasibility; once there are Arabic and Hebrew (and Syriac and
> Thaana and Dhivehi) identifiers, it's easy to imagine that
> others will want Greek and Cyrillic and Chinese and Korean and
> so on and so on and cry foul if they don't get it. That would
> explode the space of scheme identifiers.


> It should be clear that allowing scheme identifiers per
> language would be going totally overboard. It would be one
> transcription for Arabic (script), not one for Arabic
> (language), one for Urdu, one for Persian, and so on. This is
> how it has worked with Latin schemes up to now, http works for
> English, French, Spanish, German, Italian,... and many more
> languages.

Well, "worked" is sometimes in the mind of the beholder.  "http"
works because people have accepted that as the fixed mnemonic
identifier for the scheme type.  It is perhaps accepted in
those languages because no one saw an alternative and because it
isn't an obvious "word" in any of them.  But, if one starts
moving down the Greek, Cyrillic, Chinese, Korean,... paths (in
some order), then I think it nearly certain that the users and
partisans of some language that uses Latin Script will insist on
a form of the scheme name that is more compatible with their
particular language.

To repeat what I almost certainly said in Hiroshima, one could
deal with any of these things as a localization matter (which is
exactly how scheme names and popular labels like "www" are dealt
with in some Asian contexts today) but, as soon as one admits to
that possibility, one opens up the entire set of issues
surrounding localized identifiers (rather than global,
one-size-or-form-fits-all ones).  And, as soon as one concludes
that localized identifiers are rational, there is immediately
some doubt as to whether one wants IRIs with a single global
mapping algorithm at all.

Received on Tuesday, 25 May 2010 16:16:58 UTC
