Re: Unicode normalization in CSS

Leif Halvard Silli, Mon, 11 Apr 2011 17:03:03 +0200:
> Daniel Glazman, Mon, 11 Apr 2011 16:23:14 +0200:
>> On 11/04/11 16:11, Leif Halvard Silli wrote:

> Fantasai said she lacked a conclusion. Since she is a spec 
> editor, I offer the conclusion that we could document the issues. By 
> "issues", I mean summarizing/listing the affected letters, and also 
> describing *when* it is likely to be an issue, such as when linking to 
> files, which for instance affects a:visited{}.

There is a file at www.unicode.org which I believe lists the affected 
characters. [1] Expressing the essence of that list, in light of CSS, 
in human-readable terms would be a good start. (And perhaps 
creating a tool with which authors can check whether their letters are 
affected.)
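
A tool of that kind could start from something like the following. This 
is only a minimal sketch in Python, using the standard unicodedata 
module rather than the file in [1]; the function names 
affected_characters() and describe() are made up for illustration. It 
reports a character if it has a canonical decomposition or if it is a 
combining mark, which is my assumption of what "affected" should mean 
here.

```python
import unicodedata

def affected_characters(text):
    """Return the distinct characters in `text` that normalization can touch.

    A character is reported if it has a canonical decomposition (its NFD
    form differs from itself) or if it is a combining mark (non-zero
    canonical combining class) that could be composed or reordered.
    """
    found = []
    for ch in text:
        decomposable = unicodedata.normalize("NFD", ch) != ch
        combining = unicodedata.combining(ch) != 0
        if (decomposable or combining) and ch not in found:
            found.append(ch)
    return found

def describe(text):
    """Return 'U+XXXX NAME' strings for each affected character in `text`."""
    return [
        "U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in affected_characters(text)
    ]
```

An author could run describe() over the selectors, class names and file 
names in a style sheet; an empty result would suggest that 
normalization is not a concern for that text.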

The list contains, e.g., 81 instances of "LATIN CAPITAL LETTER A WITH" 
but only 4 instances of "LATIN CAPITAL LETTER F WITH". So vowels are 
more frequently affected than consonants ...
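
A rough way to check that impression without reading the file by hand 
is sketched below. Note the assumptions: it counts code points in 
Python's own Unicode database whose name starts with a given prefix and 
which have a canonical decomposition. That is not the same thing as 
counting lines in [1], so the numbers will not match the 81 and 4 
above, and they depend on the Unicode version of the Python in use. The 
function name count_decomposable() is hypothetical.

```python
import sys
import unicodedata

def count_decomposable(name_prefix):
    """Count code points whose name starts with `name_prefix` and whose
    NFD form differs from the character itself."""
    count = 0
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        # Unnamed code points (unassigned, surrogates, ...) yield "".
        name = unicodedata.name(ch, "")
        if name.startswith(name_prefix) and unicodedata.normalize("NFD", ch) != ch:
            count += 1
    return count

if __name__ == "__main__":
    for letter in "AEFT":
        prefix = "LATIN CAPITAL LETTER %s WITH" % letter
        print(prefix, count_decomposable(prefix))
```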

And 16% of the characters on that list are mathematical signs, which 
are seldom found in file names.

If there existed a hypertext version of that list, with a logical 
ToC etc., then the CSS specs could just link to that file instead of 
describing things themselves.

The problem description "normalization" is too broad. It must be broken 
down into concrete issues that really matter. E.g., even if there are 81 
capital letter A's with a diacritic, an individual author often only has 
to deal with two (that is, one letter in upper and lower case) per 
text/language.
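
To make one such concrete issue visible, namely linking to files: the 
sketch below percent-encodes the same file name in its composed (NFC) 
and decomposed (NFD) form. It is an illustration only; the file name 
and the helper href_forms() are invented, and the sketch says nothing 
about how any particular browser or file system actually compares the 
two.

```python
import unicodedata
from urllib.parse import quote

def href_forms(filename):
    """Return the percent-encoded (UTF-8) NFC and NFD forms of `filename`."""
    nfc = unicodedata.normalize("NFC", filename)
    nfd = unicodedata.normalize("NFD", filename)
    return quote(nfc), quote(nfd)

# Hypothetical file name containing U+00E5 (a with ring above).
composed, decomposed = href_forms("bl\u00e5b\u00e6r.html")
print(composed)    # bl%C3%A5b%C3%A6r.html
print(decomposed)  # bla%CC%8Ab%C3%A6r.html
```

The two strings look identical to the author when rendered, but they 
are different byte sequences, so anything that matches links by plain 
string comparison would treat them as two different URLs.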

[1] http://unicode.org/Public/UNIDATA/DerivedNormalizationProps.txt
-- 
leif halvard silli

Received on Monday, 11 April 2011 16:00:30 UTC