- From: John C Klensin <klensin@mci.net>
- Date: Tue, 15 Apr 1997 11:55:43 -0400 (EDT)
- To: Dan Oscarsson <Dan.Oscarsson@trab.se>
- Cc: Harald.T.Alvestrand@uninett.no, uri@bunyip.com, fielding@kiwi.ICS.UCI.EDU
On Tue, 15 Apr 1997 15:50:11 +0200 (MET DST) Dan Oscarsson <Dan.Oscarsson@trab.se> wrote:

>...
> Well, Swedish letters like åäö are normally called Latin, but I assume you
> mean ascii.

I can't speak for Roy, but, in my earlier note on the subject, I meant *Latin*. The reality is that UTF-8 is "user friendly" (and will get through a lot of systems without either advance planning or difficulties) if the character set that is actually in use is ISO 8859-1, not just ASCII. It isn't too bad for the other Latin alphabets. But for the character collections that are distinctly not Latin-based, the display resulting from the use of UTF-8, in the absence of the sort of aggressive, front-end, "everyone needs to apply it" translations that Roy suggested, is not only not user-friendly but closely approximates a secret code (worse than %-notation or the notorious Q-P). If one looks ahead more than a year or so and assumes worldwide use of the Internet, there are more of "them" than there are of "us", and the marginal fraction of the population that considers 8859-1 (and hence UTF-8) to be user-friendly as compared to ASCII is, unfortunately, barely worth the trouble.

It would have been better had URLs been carefully and thoughtfully internationalized from the very beginning. For whatever reasons, they weren't. A conversion now is going to be painful. But, if the pain is worth it, and I suspect it might be, then let's look to a balanced, equitable, *international* solution, not using UTF-8 encoding in the hope that no one who uses ideographic characters will be bothered about what happens to them.

> If we cannot find a way to send URLs containing any character in a way so
> that the characters can be understood and displayed in a user friendly
> manner, the web and URLs are not the future.

I completely agree with this.
However, I think we need to adopt a very broad understanding of "user friendly", as well as to keep in mind that, for intersystem protocol purposes, ASCII (or even the stable subset of ISO 646 / T.50) has a much more successful track record (in both the IETF and ISO/ITU arenas) than any of the many attempts at "national", "localized", "international", or "universal" character sets.

    john
Received on Tuesday, 15 April 1997 11:56:00 UTC