- From: Jonathan Rosenne <rosenne@NetVision.net.il>
- Date: Fri, 18 Oct 1996 00:27:12 +0300
- To: WWW-International List <www-international@w3.org>
- CC: Keld J|rn Simonsen <keld@dkuug.dk>
Keld J|rn Simonsen wrote:

> I would rather that you did not normalize, but made a case-independent,
> or case-and-accent-independent comparison

I suggest doing a case-sensitive "canonization": where two representations are possible, convert to a single one. For example, convert a letter followed by a diacritic to the composed form if such a form exists (or vice versa), convert presentation characters to their base characters, convert wide characters to standard form, and so on, and then do the comparison.

If I am using software which properly supports European languages, I have no control over the character coding, although it would be safe to assume that it uses precomposed characters, at least for those which were standardized when the software was written. But if I use software that does not support European languages, for example American, East Asian or Israeli software, I would have to type composite characters and the system would not compose them.

Since we are discussing an international environment, we do not know where the user is or what software he is using, and we cannot assume it will follow European conventions.

--
Jonathan Rosenne
JR Consulting
P O Box 33641, Tel Aviv, Israel
Phone: +972 50 246 522
Fax: +972 9 56 73 53
http://ourworld.compuserve.com/homepages/Jonathan_Rosenne/
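
The "canonization" suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions: the message names no specific algorithm, so the sketch uses Unicode normalization form NFKC from Python's standard `unicodedata` module as a stand-in, and the function names `canonize` and `same_text` are hypothetical. NFKC folds a letter plus diacritic into the composed form and maps presentation and wide forms to their base characters, while leaving case untouched.

```python
import unicodedata


def canonize(text: str) -> str:
    """Reduce alternative representations to a single one (assumed: NFKC).

    NFKC applies compatibility decomposition followed by canonical
    composition, so base letter + combining mark becomes the precomposed
    character where one exists, and presentation or wide forms are mapped
    to their ordinary equivalents. Case is not changed.
    """
    return unicodedata.normalize("NFKC", text)


def same_text(a: str, b: str) -> bool:
    """Case-sensitive comparison performed after canonization."""
    return canonize(a) == canonize(b)


precomposed = "\u00e9"    # LATIN SMALL LETTER E WITH ACUTE
composite = "e\u0301"     # "e" followed by COMBINING ACUTE ACCENT
wide_a = "\uff21"         # FULLWIDTH LATIN CAPITAL LETTER A
ligature_fi = "\ufb01"    # LATIN SMALL LIGATURE FI (a presentation form)

print(same_text(precomposed, composite))  # composed vs. composite input
print(same_text(wide_a, "A"))             # wide vs. standard form
print(same_text(ligature_fi, "fi"))       # presentation vs. base characters
print(same_text("A", "a"))                # case still distinguishes
```

Whether compatibility mappings such as the ligature case should be folded at all is a design choice; using `"NFC"` instead of `"NFKC"` in `canonize` would unify only the composed and composite forms.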
Received on Thursday, 17 October 1996 18:42:45 UTC