Message-Id: <9508141605.AA07557@mocha.bunyip.com>
Subject: Re: Globalizing URIs
To: email@example.com (Masataka Ohta)
Date: Mon, 14 Aug 1995 18:04:40 +0200 (MET DST)
Cc: firstname.lastname@example.org, email@example.com
In-Reply-To: <199508131433.XAA25928@necom830.cc.titech.ac.jp> from "Masataka Ohta" at Aug 13, 95 11:33:28 pm
From: Martin J Duerst <firstname.lastname@example.org>

Masataka Ohta wrote:

>As is proven with passports and airline tickets, 26 Latin characters
>are more than enough to represent names internationally.

Let us just think a little further along the same line:

As is proven with telephone numbers, ten digits are more than enough
to address anybody with a telephone around the world. For personal
names, the same is easily possible by designing a world-wide system
of social security numbers.

As is proven by data representation in computers, just two different
bit values are enough to represent any data whatsoever.

>So, please don't try to solve a non-existent problem.

I guess Japanese travelling around the world would be more than happy
to have their names in Kanji/Kana on their flight tickets (of course
alongside the Latin form for the clerks that have to deal with these
tickets), to have announcement boards in foreign airports that show
announcements in Japanese, and even to have announcements by voice in
Japanese.

The average Japanese has seen his/her name in Latin letters once in
school (when Latin letters are taught), and occasionally for a credit
card or passport application. All the daily business is in Kanji, or
if those are not available, it is in Kana.

Judging from the number of contributors to some Japanese mailing list,
a fair percentage of Japanese use RFC 1522-encoded names in their mail
headers, and I guess this percentage would be even higher if there
were a more natural implementation.

>Hi, Martin. "ASCII" does not mean "English".
>
>Some of you might be familiar with European environment so that you
>might be able to read, recognize, identify, memorize and type in a
>Swedish Angstrom character. But, Europe is not the entire world.

I do not require that everybody learn Swedish, or the additional
character of the Swedish alphabet. And I don't exactly know how to
type an Angstrom character on my keyboard (although I could look it
up). But I would like to have German names for German documents, and
Japanese names for Japanese documents, and so on, and I know that
there are quite a few Swedish people who would like to use Swedish
names for their Swedish documents. And I also know that on Macintoshes
and some other computers, this is already easily possible, and heavily
used.

>To us Japanese, my Japanese name represented with ASCII, that is,
>"Masataka Ohta", which is one of a formal notation of Japanese
>taught at Japanese elementaly schools, is just fine and better
>than "%HH%HH". The notation "%HH%HH" is not so harmful but merely
>the second best.

Of course you will prefer "Masataka Ohta" to "%HH%HH". But your
personal preferences aside, the average Japanese will much prefer
Japanese names (and likewise the names of documents that are in
Japanese, and so on) to be in the everyday Kanji/Kana mixture. Next
best might be Kana only, and then maybe Latin letters, so that
"%HH%HH" may turn out to be fourth (if ever considered).

And it is true that the representation of Japanese with Latin letters
is taught in Japanese schools, but there is not much time spent on
this subject, and there is a great chance that the average Japanese,
when asked to spell your last name with Latin letters, will spell it
Outa or Ota or O-ta (the "-" should go as a bar above the O), but not
necessarily Ohta, and will have similar problems with other names.
Of course, Japanese also have problems sometimes when writing proper
names in Japanese, but they know how to take care of this with name
cards (many of which don't have Latin letters on the back side) or by
introducing themselves as "Ohta, you know, 'Oh' as in 'great, thick',
and 'ta' as in 'field'". But most Japanese won't say "and well, in
Latin letters, it's 'Oh', with 'o'-'h'" (you may be an exception).

>And, to non-Japanese, my Japanese name represented with non-ASCII, for
>example with ISO-2022-JP encoding:
>
>    ^[$@B@ED!!>;9'^[(B
>
>might be only a little worse than "%HH%HH".

Well, if that name really were in ISO-2022-JP, and not in a form that
might show up on a terminal emulator that doesn't deal with Japanese,
I would actually see it directly as what it is supposed to represent.
So for me and the others that can read Japanese and do have some
appropriate software (which includes all those in Japan with computer
equipment), it would clearly be more readable than %HH%HH.

>The worst case is when you are looking at a URL containing Japanese
>characters printed on a paper.
>
>Can your brain recognize Japanese characters?

Leaving the problems of 'brain' and 'mind' to people in AI, I can
definitely say that I can recognize and read Japanese, if it is
written on paper or properly encoded in electronic mail.

>In the international environment, most of you can't read, recognize,
>identify, memorize nor type in Japanese characters.
>
>That is, with the international context, plain ASCII (or ISO 646
>IRV) is the way to go.
>
>In short, mail addresses and URLs should be pure ASCII.

I agree for mail addresses. Mail addresses, at least potentially, can
be used from all over the world, and by anybody, regardless of their
language abilities.

But for URLs in general, this is different. A Japanese author, writing
documents for a Japanese user, should not be forced to make up
document names with Latin characters.
But with the present URL scheme, (s)he is more or less forced to do so.

Regards,    Martin.