Date: Tue, 6 May 1997 11:56:45 +0200 (MET DST)
From: "Martin J. Duerst" <firstname.lastname@example.org>
To: "Alain LaBont/e'/" <email@example.com>
cc: URI mailing list <firstname.lastname@example.org>
Subject: Re: "Difficult Characters" draft
In-Reply-To: <email@example.com>
Message-ID: <Pine.SUN.3.96.970506111326.245L-100000@enoshima>

On Mon, 5 May 1997, Alain LaBont/e'/ wrote:

> [Martin] :
> >- We are dealing with identifiers, and assuming precise matching up
> >  to the precision a human reader familiar with the script
> >  is able to handle. In this respect, discussions about
> >  imprecise searches are irrelevant.
>
> [Alain] :
> Really? Due to historical reasons (fortunately or not, some systems
> transform accented letters into their non-accented forms and this is also a
> requirement for searches in French, maybe in German too btw), that might be
> quite relevant.

We are dealing with identifiers and matching. Maybe I have to make
this clearer in my draft. As an example, significant parts of a URL
now distinguish upper case and lower case. Either you get it right, or
the URL is not found. That's what an identifier is for. Searching
via web search services, directories, and so on is not our concern.

> I may be wrong, but it might be also that bad habits formed
> expectations about imprecise searches. Do you mean that here we mean really
> precise searches in which even case shall be used as is?

Definitely. That's what happens today with URLs. The intent of the
document is not to define equivalences for search, but to define
normalization at the source so that we can use the binary comparison
of existing software.

> That would really
> be misleading for French-speaking users (I talk by experience, having done
> such tests by accident in an international audience).

ASCII web users have learned that they have to take care about case in URLs.
ASCII URL creators have learned that they, too, have to take care about
case in URLs, in order to make it easy for the users.

Beyond-ASCII users and URL creators will have to learn similar things
with respect to case and with respect to other things, such as accents.
French URL users may have to learn that in uppercase URLs, they
should not drop accents that they see. French URL creators may have
to learn that they had better not create uppercase accented characters
in their URLs, so as not to confuse their users. One of these things,
or both, may end up in the current draft. What would you suggest?

Regards,	Martin.
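
To illustrate the distinction drawn above between search equivalences
and normalization at the source, here is a minimal Python sketch. It
assumes Unicode NFC as the normalization step and UTF-8 as the byte
encoding; neither is named in this message, and the helper names
normalize_at_source and matches are hypothetical.

```python
import unicodedata


def normalize_at_source(identifier: str) -> bytes:
    """Hypothetical helper: normalize once, when the identifier is created.

    Assumption: NFC and UTF-8 stand in for whatever normalization and
    encoding the draft would actually specify.
    """
    return unicodedata.normalize("NFC", identifier).encode("utf-8")


def matches(stored: bytes, requested: bytes) -> bool:
    """Plain binary comparison: no case folding, no accent stripping."""
    return stored == requested


# The same visible text, encoded two different ways by the creator's tools.
composed = "r\u00e9sum\u00e9"        # e-acute as a single code point
decomposed = "re\u0301sume\u0301"    # 'e' followed by a combining acute accent

stored = normalize_at_source(composed)

# Normalizing at the source makes the two encodings byte-identical.
same_text = matches(stored, normalize_at_source(decomposed))

# Dropping accents or changing case yields a different identifier.
without_accents = matches(stored, normalize_at_source("resume"))
upper_case = matches(stored, normalize_at_source("R\u00c9SUM\u00c9"))
```

Here same_text is true, while without_accents and upper_case are both
false: the comparison stays exact, and only invisible encoding
differences are removed before the identifier is published.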