Re: [CSS21] Case-insensitivity not defined

John Cowan wrote:
> 
>> I'd be happy with that if [a-z] and [A-Z] matched each other and didn't
>> match anything else. But it seems that's not the case in Unicode.
> 
> Well, looking at http://www.unicode.org/Public/5.0.0/ucd/CaseFolding.txt

Yes, that's exactly the reference.

I didn't look at the default case folding of U+0130 (LATIN CAPITAL 
LETTER I WITH DOT ABOVE), which I should have done. For the record, it's:

0130; F; 0069 0307; # LATIN CAPITAL LETTER I WITH DOT ABOVE

This means that WİDTH doesn't match WIDTH or width.
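
For anyone who wants to check this, here is a minimal sketch. It assumes Python 3, whose str.casefold() applies the full default case folding (the C and F entries in CaseFolding.txt) and none of the language-specific T entries. The variable names are only for illustration.

```python
# U+0130 is LATIN CAPITAL LETTER I WITH DOT ABOVE.
dotted = "W\u0130DTH"

# Full folding maps U+0130 to U+0069 U+0307 (i + COMBINING DOT ABOVE).
i_folds_to = "\u0130".casefold()

# Compare the folded forms, as a caseless match would.
folded = dotted.casefold()
matches_upper = folded == "WIDTH".casefold()
matches_lower = folded == "width".casefold()

print([hex(ord(c)) for c in i_folds_to])
print(matches_upper, matches_lower)
```

The folded string keeps the combining dot, so it is one code point longer than "width" and the comparison fails in both cases.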

> I find that the basic Latin letters do match each other and nothing
> else, if you ignore the language-specific foldings, with one exception.
> U+212A KELVIN SIGN, which looks exactly like "K" and shouldn't exist
> anyhow (it's compatibility equivalent to a proper "K") is case-folded
> to "k".  I consider that to come under the heading of the Right Thing.

Compatibility characters always present a problem of this sort. I think 
this is also the Right Thing.
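
The KELVIN SIGN behavior can be sketched the same way, again assuming Python 3 and its str.casefold() and unicodedata module as stand-ins for the default folding and normalization data:

```python
import unicodedata

kelvin = "\u212a"  # KELVIN SIGN, visually identical to "K"

# Default case folding sends it to the ordinary lowercase letter.
kelvin_folded = kelvin.casefold()

# Normalization (NFKC here) also replaces it with the ordinary "K".
kelvin_normalized = unicodedata.normalize("NFKC", kelvin)

# So a caseless comparison treats the two spellings as the same.
same_after_folding = kelvin.casefold() == "K".casefold()

print(unicodedata.name(kelvin), kelvin_folded, kelvin_normalized)
```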

> 
> It's also true that some ligatures are case-folded to their spelled out
> equivalents:  for example, U+FB00 LATIN SMALL LIGATURE FF is case-folded
> to simple "ff".
> 

This is actually a Good Thing too.
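
A small sketch of the ligature case, assuming Python 3, where str.casefold() performs full folding while str.lower() is plain lowercasing. The sample word is made up for the example.

```python
ligature = "\ufb00"  # LATIN SMALL LIGATURE FF

# Full case folding expands the ligature to two letters.
ligature_folded = ligature.casefold()

# Plain lowercasing leaves it alone, since it is already lowercase.
ligature_lowered = ligature.lower()

# Hypothetical sample: a word spelled with the ligature vs. plain capitals.
word = "o" + ligature + "ice"
fold_match = word.casefold() == "OFFICE".casefold()
lower_match = word.lower() == "OFFICE".lower()

print(len(ligature_folded), fold_match, lower_match)
```

With folding, the two spellings compare equal; with simple lowercasing they do not, which is why the expansion is useful for matching.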

Addison

-- 
Addison Phillips
Globalization Architect -- Yahoo! Inc.
Chair -- W3C Internationalization Core WG

Internationalization is an architecture.
It is not a feature.

Received on Thursday, 15 November 2007 23:18:41 UTC