
Re: Internationalized CLASS attributes



Chris Lilley writes:

> On Oct 17,  8:37pm, Keld Jørn Simonsen wrote:
> 
> > I would rather that you did not normalize, but made a case-independent,
> > or case-and-accent-independent comparison,
> 
> Sorry, could you explain how a case-independent comparison differs from case
> folding (or normalization) ?

My understanding of normalization is that you convert a string to
a normalized representation, for example by converting all upper-case
characters to lower case. In a case-independent comparison you instead
look up a weight for each character in the two strings and then
compare the weights.

You may say that the weights are a kind of normalization, and
implementations may actually convert the strings into weight strings
for more efficient comparison.

At one level the weights are equal for all accented variants
of a base letter; at another level they are equal for all letters with
the same base letter and the same accent, regardless of case.
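
To make the distinction concrete, here is a minimal sketch in Python.
It assumes Unicode canonical decomposition (NFD) as a rough stand-in
for real weight tables; it does not use the ISO/IEC 14651 tables, and
the names weight_string and equal_at_level are hypothetical, chosen
only for illustration.

```python
import unicodedata

def weight_string(text, level):
    """Turn a string into a tuple of per-character weights.

    level 1: weight is the lower-cased base letter only
             (ignores both case and accents).
    level 2: weight is (lower-cased base letter, accents)
             (ignores case only).
    Assumption: NFD decomposition approximates "base letter + accents".
    """
    weights = []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and weights:
            # Attach a combining mark to the preceding base letter.
            base, accents = weights[-1]
            weights[-1] = (base, accents + ch)
        else:
            weights.append((ch.lower(), ""))
    if level == 1:
        return tuple(base for base, _ in weights)
    return tuple(weights)

def equal_at_level(a, b, level):
    """Compare two strings by their weights at the given level."""
    return weight_string(a, level) == weight_string(b, level)
```

With this sketch, "École" and "ecole" compare equal at level 1 but not
at level 2, while "École" and "école" compare equal at both levels.
Letters that have no Unicode decomposition, such as ø, are treated as
their own base letter here; a real collation table would have to
assign their weights explicitly.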
> 
> > for example using the functions and tables of the forthcoming ISO
> > sorting standard ISO/IEC 14651.
> 
> Thanks for the reference. Are these tables available online?

You can get the latest version (at CD registration stage) at
http://www.dkuug.dk/JTC1/SC22/WG20/docs/65S14651.doc

There are tables in there, but they may not be easily usable.
A set of POSIX locales is available at ftp://dkuug.dk/i18n/WG15-collection/

Keld