- From: Keld Jørn Simonsen <keld@dkuug.dk>
- Date: Thu, 7 Mar 1996 07:43:45 +0100
- To: Larry Masinter <masinter@parc.xerox.com>
- Cc: martin@terena.nl, wg-i18n@terena.nl, uri@bunyip.com
Larry Masinter writes:

> While in ASCII you can define 'case independent match' by
> performing 'translate to upper case and then use string equality',
> this does not work for other character repertoires, e.g., JIS might
> have separate codes for single and double-wide codes yet want to treat
> them equivalent for matching.
>
> While uppercase mapping is culturally sensitive, can we not make a
> culturally independent 'character matching' algorithm that is good
> enough for directory services. Perhaps it means treating accented and
> unaccented versions of French initial capitals equivalent, even though
> this equivalence is not determined by 'canonicalization'?

ISO/IEC JTC1/SC22/WG20 is producing a sorting/comparison standard that may be used for this purpose. It defines a number of levels at which the comparison may be done; for example, level 1 would treat all "A"s as equivalent, and that level could also treat single- and double-width encodings (mostly of the Latin letters) as equivalent.

The standard is ISO 14651, now appearing as WD3 and going to CD stage in May 1996.

Keld
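
To make the idea of a coarse, "level 1"-style match concrete, here is a minimal sketch in Python. It is not the ISO 14651 algorithm; it assumes Unicode text and uses compatibility normalization as a stand-in for the width and accent equivalences discussed above. The names `level1_key` and `level1_equal` are hypothetical, chosen only for this illustration.

```python
import unicodedata


def level1_key(text: str) -> str:
    """Build a rough comparison key that ignores width, accents and case.

    Assumption: Unicode NFKD is used here as an approximation; the real
    ISO 14651 levels are defined by a weight table, not by normalization.
    """
    # NFKD maps compatibility variants (e.g. fullwidth Latin letters) to
    # their ordinary forms and splits accented letters into a base letter
    # followed by combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so only base letters remain.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() removes case distinctions.
    return base.casefold()


def level1_equal(a: str, b: str) -> bool:
    """Return True when two strings match at this coarse level."""
    return level1_key(a) == level1_key(b)
```

For instance, `level1_equal("École", "ecole")` and `level1_equal("\uff21", "A")` (a fullwidth "A" against a plain one) both hold under this sketch, while distinct base letters such as "A" and "B" remain different. Stripping every combining mark is deliberately crude and would over-merge in some scripts, which is one reason a table-driven standard is preferable for real directory services.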
Received on Thursday, 7 March 1996 01:45:42 UTC