Date: Tue, 6 May 1997 21:20:46 +0200 (MET DST)
From: "Martin J. Duerst" <firstname.lastname@example.org>
To: "Alain LaBont/e'/" <email@example.com>
cc: URI mailing list <firstname.lastname@example.org>
Subject: Re: "Difficult Characters" draft
In-Reply-To: <email@example.com>
Message-ID: <Pine.SUN.3.96.970506210330.245U-100000@enoshima>

On Tue, 22 Apr 1997, Alain LaBont/e'/ wrote:

> From a *real user*'s point of view what you say is disconcerting. In fact
> it does not correspond to a reality I experience every day. My insurance
> agent gave me his personal URL last week, for example, URL in which there
> were uppercase letters that were transformed into lower case when Netscape
> displayed the actual URL and in searching with both forms it is all right...

Well, I just tried the URL, and my Netscape didn't do any lowercasing.
But that's a detail.

> Hence in this actual concrete example,
>
> http://www.LaMutuelle.com/agent/home.htm?aid=S200569 and
> http://www.lamutuelle.com/agent/home.htm?aid=S200569
>
> are totally equivalent. Changing those habits would not be desirable.

These are indeed totally equivalent. But try to write

> http://www.LaMutuelle.com/Agent/home.htm?aid=S200569 or
> http://www.lamutuelle.com/agent/Home.htm?aid=S200569

and you will get a nasty error (all in English, with a pointer to
http://www.themutualgroup.com/). Some exceptions and surprises to the
contrary notwithstanding, an uninformed user has to be taught to copy
an URL as is, including case. A more informed user may know which parts
of an URL can be changed in capitalization. Actually, you can write

> http://wWw.lAmUtUeLlE.CoM/agent/home.htm?aid=S200569

and it will still work. But please leave the part after the first
single slash alone.

> In French at least, case doesn't have in general the importance that has in
> German, for example.
> For accented and unaccented data, of course minimally
> a lower case accented letter should be equivalent to the upper case
> counterpart, but even in lower case, it is desirable that an unaccented
> letter be equivalent to its accented counterpart (an actual case is that it
> is processed like this since 1981 in DOS on a PC) for searching purposes.

If a lowercase accented letter appears in the later part of an URL, it
won't be equivalent to the corresponding uppercase letter, because there
is no such equivalence for unaccented letters either. Where there is
indeed equivalence, as we currently have it in domain names, it will be
the task of domain name internationalization to decide what to do about
it: whether to make the usual domain names case sensitive, whether to
introduce case equivalences for characters outside ASCII, or whatever.
There is no problem with any kind of URL scheme or mechanism introducing
additional equivalences where it sees fit, but we can't introduce them
for all URLs.

> What I suggest is that searching be done according to the same spirit as
> ISO/IEC CD 14651 which deals with such equivalences. At the limit (this
> does not have an influence on URLs but it should be considered) in
> searching URLs, expectations could be built on LOCALEs... that is what I
> suggest.

I fully agree for searching. However, what is usually done with URLs is
not searching. It is binary matching. Only things that are absolutely
binary equivalent (after the last step in your sorting standard) match.
The normalization procedures in the draft only raise the level a tiny
bit, to cover those cases where the binary representation is different
but the user has absolutely no chance to make the distinction.

> For example as was explained, o and ö are not equivalent in Swedish (while
> they are in German),

They are definitely not! Otherwise, we wouldn't need the ö :-).
It's only that we don't consider ö a letter of its own, but that doesn't
mean a German wouldn't know where to put an o and where to put an ö in
an URL (with the exception of those cases where both possibilities make
sense and where it is all the more important to make the difference :-).

> n and ñ are not equivalent in Spanish while they are
> in French and so on. That has no impact per se on the making of URLs, but
> it has one on their use, that was the only consideration I was trying to
> suggest.

I agree that it should have an impact on the use in searching and such.
But that's not the main function of URLs.

Regards, Martin.
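
For readers who want the matching rule discussed above in concrete form,
here is a minimal sketch in Python. It is an illustration, not anything
taken from the draft: the helper names comparison_key, same_url and
same_after_nfc are hypothetical, and the use of Unicode NFC to stand in
for "normalization" is purely an assumption made for the example. The
scheme and host are compared without regard to case; everything after
the first single slash is compared exactly.

```python
import unicodedata
from urllib.parse import urlsplit


def comparison_key(url: str) -> tuple:
    """Build a key in which only the scheme and host ignore case."""
    parts = urlsplit(url)
    return (
        parts.scheme.lower(),             # scheme: case-insensitive
        (parts.hostname or "").lower(),   # host name: case-insensitive
        parts.port,
        parts.path,                       # path: compared exactly
        parts.query,                      # query: compared exactly
    )


def same_url(a: str, b: str) -> bool:
    """True if the two URLs match under the rule sketched above."""
    return comparison_key(a) == comparison_key(b)


def same_after_nfc(a: str, b: str) -> bool:
    """Exact match after NFC normalization (assumed here for illustration)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


base = "http://www.lamutuelle.com/agent/home.htm?aid=S200569"

# Case changes in the host are harmless.
print(same_url("http://wWw.lAmUtUeLlE.CoM/agent/home.htm?aid=S200569", base))

# Case changes after the first single slash are not.
print(same_url("http://www.lamutuelle.com/agent/Home.htm?aid=S200569", base))

# Two encodings of the same visible letter match after normalization,
# but o and ö remain different characters.
print(same_after_nfc("o\u0308", "\u00f6"), same_after_nfc("o", "\u00f6"))
```

The last line shows the distinction made in the message: normalization
only removes differences the user cannot control (two byte sequences
for the same ö), while o and ö stay distinct because no locale-style
equivalence is applied.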