Re: Case Sensitivity in CSS [I18N-ACTION-171]

Le 15/01/2013 17:59, Phillips, Addison a écrit :
> Case Insensitive comparison: Where CSS cannot be case-insensitive
> for legacy reasons or for implementation choice reasons, the I18N WG
> recommends that comparison be done using Unicode "common" plus
> "full" case fold mapping, as we previously recommended. […] we have
> confirmed with our Unicode colleagues that this is the right
> approach [4].
>
> [4]https://lists.w3.org/Archives/Member/member-i18n-core/2013Jan/0003.html


"Common" plus "full" case fold mapping. I’m not expressing an opinion 
for or against this here, but I was confused as to what it means 
exactly. In various Unicode documents, one can read about "default", 
"simple", "special", "NFKC" case folding. How do these relate to 
"common" and "full"?

The answer seems to be in [4], but that link is Member-only. I took the 
liberty of copying the relevant part here for everyone to see:

> For reference,
> full case mappings is defined in "3.13 Default Case Algorithms" of TUS
> chapter 3. (eg http://www.unicode.org/versions/Unicode6.2.0/ch03.pdf)
>
> The full case mappings for Unicode characters are obtained by using the
> mappings from SpecialCasing.txt plus the mappings from UnicodeData.txt,
> excluding any of the latter mappings that would conflict. Any character
> that does not have a mapping in these files is considered to map to
> itself. The full case mappings of a character C are referred to as
> Lowercase_Mapping(C), Titlecase_Mapping(C), and Uppercase_Mapping(C). The
> full case folding of a character C is referred to as Case_Folding(C).
> ...
> R4 toCasefold(X): Map each character C in X to Case_Folding(C).
>
> • Case_Folding(C) uses the mappings with the status field value “C” or “F”
> in the data file CaseFolding.txt in the Unicode Character Database.
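
Read literally, R4 together with the "C"/"F" rule could be sketched in 
Python roughly as below. This is only an illustration: SAMPLE is a 
hand-copied excerpt in the CaseFolding.txt format rather than the real 
file, and load_case_folding and to_casefold are names made up here, not 
part of any library.

```python
# A few entries in the CaseFolding.txt layout:
#   <code>; <status>; <mapping>; # <name>
# Assumption: a hand-copied excerpt, not the full Unicode data file.
SAMPLE = """\
0041; C; 0061; # LATIN CAPITAL LETTER A
0049; C; 0069; # LATIN CAPITAL LETTER I
0049; T; 0131; # LATIN CAPITAL LETTER I
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
1E9E; F; 0073 0073; # LATIN CAPITAL LETTER SHARP S
1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S
"""


def load_case_folding(text, statuses=("C", "F")):
    """Build a {character: folded string} table from CaseFolding.txt-style text.

    Only entries whose status is in `statuses` are kept, so the default
    selects "common" plus "full" and skips "S" (simple) and "T" (Turkic).
    """
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comment
        if not line:
            continue
        code, status, mapping = [f.strip() for f in line.split(";")[:3]]
        if status in statuses:
            folded = "".join(chr(int(cp, 16)) for cp in mapping.split())
            table[chr(int(code, 16))] = folded
    return table


def to_casefold(s, table):
    """R4 toCasefold(X): map each character C in X to Case_Folding(C).

    Characters without an entry map to themselves.
    """
    return "".join(table.get(c, c) for c in s)


table = load_case_folding(SAMPLE)
print(to_casefold("A\u00df", table))  # "A" and sharp s, folded via C and F entries
```

With only "C" and "F" selected, U+0049 folds to U+0069 (the "T" entry is 
ignored) and U+1E9E folds to two characters (the "S" entry is ignored).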


I still need to examine this more carefully to know how to implement it, 
or to decide whether Python’s casefold() method does the same thing:

http://docs.python.org/3.3/library/stdtypes.html#str.casefold
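
As a quick, non-conclusive check, the following spot-tests str.casefold() 
(Python 3.3 or later) on a few characters whose full ("F") folding expands 
to more than one character. The helper caseless_match is a hypothetical 
name; agreement on these samples suggests full folding but does not prove 
the two are identical for every character.

```python
def caseless_match(a, b):
    """Compare two strings case-insensitively using str.casefold()."""
    return a.casefold() == b.casefold()


# lower() keeps the sharp s as a single character; casefold() expands it.
print("\u00df".lower() == "\u00df")    # True
print("\u00df".casefold() == "ss")     # True

# U+FB01 LATIN SMALL LIGATURE FI expands to "fi" under casefold().
print("\ufb01".casefold() == "fi")     # True

# Strings that differ only by such expansions compare equal.
print(caseless_match("STRASSE", "stra\u00dfe"))  # True
```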

-- 
Simon Sapin

Received on Wednesday, 16 January 2013 11:31:16 UTC