- From: Keld Jørn Simonsen <keld@dkuug.dk>
- Date: Thu, 17 Oct 1996 20:37:00 +0200
- To: Chris Lilley <Chris.Lilley@sophia.inria.fr>, Jonathan Rosenne <rosenne@NetVision.net.il>, WWW-International List <www-international@w3.org>
Chris Lilley writes:

> Another relevant quote from the Unicode standard, on the subject of case
> conversion:
>
> "Because there are many more lowercase forms than there are upper, it is
> recommended that the lowercase be used for normalisation rather than the
> uppercase, such as when strings are case-folded for loose comparison or
> indexing."

I see two things here:

1. Some characters have only a lower case form, so converting to upper case is not possible. Examples: German <ss>, Greenlandic <kra>.

2. Several lower case forms may exist where there is only one upper case form. Example: Greek sigma, which has a separate terminal form.

In the first case I can see a reason to normalize on lower case, but in the second case I see problems in choosing which lower case form to normalize on.

I would rather that you did not normalize, but made a case-independent, or case-and-accent-independent, comparison, for example using the functions and tables of the forthcoming ISO sorting standard ISO/IEC 14651.

Keld
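
To make the two cases above concrete, here is a minimal sketch of a case-independent and a case-and-accent-independent comparison. It is an illustration only: it does not use ISO/IEC 14651 or its tables, but swaps in Python's built-in Unicode case folding (`str.casefold`) and the standard `unicodedata` module. The helper names `loose_key`, `loose_accent_key` and `loose_equal` are hypothetical, and stripping combining marks after decomposition is a simplifying assumption that is too crude for real language-specific matching.

```python
import unicodedata


def loose_key(s: str) -> str:
    """Case-independent comparison key, built with Unicode case folding."""
    return s.casefold()


def loose_accent_key(s: str) -> str:
    """Case-and-accent-independent key (assumption: dropping combining
    marks after canonical decomposition is an acceptable approximation)."""
    decomposed = unicodedata.normalize("NFD", s.casefold())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def loose_equal(a: str, b: str, ignore_accents: bool = False) -> bool:
    """Compare two strings without storing a normalized form of either."""
    key = loose_accent_key if ignore_accents else loose_key
    return key(a) == key(b)


# Case 1: German sharp s has no single-character upper case form.
print("ß".upper())
print(loose_equal("Straße", "STRASSE"))

# Case 2: capital sigma corresponds to two lower case forms,
# medial σ and terminal ς; folding maps all three to the same key.
print(loose_equal("ΟΔΟΣ", "οδος"))
print(loose_equal("οδοσ", "οδος"))

# Accents are ignored only when explicitly requested.
print(loose_equal("Résumé", "resume"))
print(loose_equal("Résumé", "resume", ignore_accents=True))
```

The comparison builds its keys on the fly, so the stored text keeps its original case, which is the point of comparing rather than normalizing.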
Received on Thursday, 17 October 1996 14:37:37 UTC