Re: Globalizing URIs

Masataka Ohta (mohta@necom830.cc.titech.ac.jp)
Sun, 13 Aug 95 23:33:28 JST


From: Masataka Ohta <mohta@necom830.cc.titech.ac.jp>
Message-Id: <199508131433.XAA25928@necom830.cc.titech.ac.jp>
Subject: Re: Globalizing URIs
To: mduerst@ifi.unizh.ch (Martin J Duerst)
Date: Sun, 13 Aug 95 23:33:28 JST
Cc: uri@bunyip.com
In-Reply-To: <9508071704.AA14694@mocha.bunyip.com>; from "Martin J Duerst" at Aug 7, 95 7:04 pm

> >You may wish to complain that it is English-centric that the
> >least-common-denominator can represent English names better than
> >Swedish ones, but that is not a problem that the IETF can solve.

Agreed.

As passports and airline tickets prove, the 26 Latin letters are
more than enough to represent names internationally.

So, please don't try to solve a non-existent problem.

> If it is in the form %HH%HH, with no indication of what the octets are
> meaning, then Swedish or Japanese don't get represented worse than
> English, they don't get represented at all!

Hi, Martin. "ASCII" does not mean "English".

Some of you may be familiar enough with the European environment
to read, recognize, identify, memorize and type in a Swedish
Angstrom character. But Europe is not the entire world.

To us Japanese, my Japanese name represented in ASCII, that is,
"Masataka Ohta", which is one of the formal notations of Japanese
taught at Japanese elementary schools, is just fine and better
than "%HH%HH". The "%HH%HH" notation is not very harmful, but it
is merely second best.

And, to non-Japanese, my Japanese name represented with non-ASCII, for
example with ISO-2022-JP encoding:

	^[$@B@ED!!>;9'^[(B

might be only a little worse than "%HH%HH".
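
To make the three forms concrete, here is a minimal Python sketch,
not a definitive implementation. The variable names are hypothetical,
and the byte string is simply the sample above taken octet by octet,
with "^[" read as the ESC octet (0x1B). It only assumes the standard
library's urllib.parse module.

```python
from urllib.parse import quote, unquote_to_bytes

# Form 1: the romanized name, already plain ASCII.
romanized = "Masataka Ohta"

# Form 2: the ISO-2022-JP sample above as raw octets (assumption: "^[" is ESC).
raw_iso2022jp = b"\x1b$@B@ED!!>;9'\x1b(B"

# Form 3: the same octets written in the %HH notation.
# safe="" escapes every octet outside letters, digits and "_.-~".
percent_encoded = quote(raw_iso2022jp, safe="")

# The %HH form is pure ASCII and loses nothing: decoding restores the octets.
assert percent_encoded.isascii()
assert unquote_to_bytes(percent_encoded) == raw_iso2022jp

print(quote(romanized, safe=""))
print(percent_encoded)
```

The %HH form can be typed on any ASCII keyboard and survives
transport, but unlike the romanized form it says nothing to a reader
who does not know which character set the octets belong to.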

The worst case is when you are looking at a URL containing Japanese
characters printed on paper.

Can your brain recognize Japanese characters?

In the international environment, most of you cannot read, recognize,
identify, memorize or type in Japanese characters.

That is, in an international context, plain ASCII (or ISO 646
IRV) is the way to go.

In short, mail addresses and URLs should be pure ASCII.

							Masataka Ohta