Message-Id: <email@example.com>
Date: Fri, 11 Aug 1995 18:33:20 -0700
To: firstname.lastname@example.org
From: email@example.com (Paul Hoffman)
Subject: Re: Globalizing URIs

Comments on many people's responses:

- Karen Sollins brings up many good points about the history of RFC 1737. Basically, using meaning in a name greatly increases the chance that the name will become invalid or be changed in the future. It is a general desire for URL names to be as persistent as possible.

Having said that, almost no one is paying attention. Look at the URLs on a couple of randomly selected pages from Yahoo or any of the WWW Virtual Libraries. At least 80% of them have plenty of meaning. Even the non-English ones look like they have meaning; even though I don't speak Italian, most of the ones in the .it domain seem to have a mixture of consonants and vowels that looks like Italian to me.

- Keith Moore brings up what I consider to be a very, very strong argument against client software showing meaning in a URL that is different from the characters in the URL, which is what we are really discussing here. Basically, if you show one thing that really means another, the user will very likely try to transcribe the wrong one, particularly if there is no way for the user to know that a transformation has taken place. Even with smart copying (select a converted URL, copy it, and the copy is the unconverted one), a user looking at the screen will write down an incorrect transcription.

- Martin Duerst did us a big favor by laying out the proposed encodings:

A1) <[ISO-8859-1]http://xxx.yyy.zz/AA/BB/CC.html>
A2) <http:[ISO-8859-1]//xxx.yyy.zz/AA/BB/CC.html>
A3) <http://xxx.yyy.zz/[ISO-8859-1]AA/BB/CC.html>
A4) <http://xxx.yyy.zz/AA/BB/CC.html[ISO-8859-1]>
A5) <http://xxx.yyy.zz/AA/BB/CC.html;ISO-8859-1>

I note that A1 and A2 would break every client in use today.
Further, I *hope* no one is expecting their domain names to have meaning taken from them and shown to a client in a different form than they appear. A3 would be the easiest for Web server administrators to implement with today's Web server software (I can't speak for FTP or Gopher servers). A4 would break the (admittedly dumb, but common) Web browsers that look at the end of the URL to see what "kind" of file they are getting.

In summary, I feel that Keith Moore has brought up the most salient point: if we show the user an alternate view of a URL, that will lead to endless confusion when they decide to do anything other than select that URL in the single browser they are running at the moment.

Yes, we are all (well, almost all) putting meaning into our URLs today using the very non-international character set given to us in RFC 1738. Yes, this is dumb because even if you understand my character set, you may not understand my language, and thus will not get any meaning from the language-specific part. Giving you better access to my desired character set will only help you if you also understand my language, and it will also introduce a large number of other problems.
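
To make the breakage claims about the five proposed forms (A1 through A5) a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the helper names naive_client_accepts and naive_kind_from_suffix, the small suffix table, and the idea that a "dumb" client checks for a leading scheme followed by "//" and guesses the file type from the text after the last dot. It is not a description of how any particular browser actually parses URLs.

```python
# Hypothetical sketch: these helpers are illustrative and do not come from
# any real client. The URLs follow the A1-A5 forms quoted in the message.

TAG = "[ISO-8859-1]"
BASE = "http://xxx.yyy.zz/AA/BB/CC.html"

PROPOSALS = {
    "A1": TAG + BASE,
    "A2": "http:" + TAG + "//xxx.yyy.zz/AA/BB/CC.html",
    "A3": "http://xxx.yyy.zz/" + TAG + "AA/BB/CC.html",
    "A4": BASE + TAG,
    "A5": BASE + ";ISO-8859-1",
}

# Assumed lookup table; a real client would know many more suffixes.
KNOWN_SUFFIXES = {"html": "text/html", "gif": "image/gif", "txt": "text/plain"}


def naive_client_accepts(url):
    """Mimic a client that expects 'scheme://' at the very start of a URL."""
    scheme, sep, rest = url.partition(":")
    return sep == ":" and scheme.isalpha() and rest.startswith("//")


def naive_kind_from_suffix(url):
    """Mimic a browser that guesses the file type from the text after the last dot."""
    suffix = url.rsplit(".", 1)[-1].lower()
    return KNOWN_SUFFIXES.get(suffix)  # None when the suffix is not recognized


for name, url in PROPOSALS.items():
    print(name, naive_client_accepts(url), naive_kind_from_suffix(url))
```

Under these assumptions, the two forms that put the tag before the path fail the scheme check, the form with the tag inside the path passes both checks, and the form with the tag at the very end defeats the suffix guess. This particular sketch also fails to recognize the suffix in A5, because it makes no attempt to strip a trailing ";" parameter; whether real browsers do so is not something the sketch can settle.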