Re: [css3-fonts][css-variables][css-counter-styles-3][css3-values] Case sensitivity of user-defined identifiers from Jonathan Kew on 2012-10-03 (www-style@w3.org from October 2012)

From: Jonathan Kew <jfkthame@googlemail.com>
Date: Wed, 03 Oct 2012 12:35:12 +0100
To: "Tab Atkins Jr." <jackalmage@gmail.com>
CC: www-style@w3.org
Message-ID: <506C22F0.7040108@gmail.com>

On 2/10/12 20:15, Tab Atkins Jr. wrote:
 > On Tue, Oct 2, 2012 at 4:44 AM, Jonathan Kew <jfkthame@googlemail.com> wrote:
 >> Are we comfortable saddling authors with this ASCII-centric weirdness
 >> forever just because of an accident of encoding history?
 >>
 >> IMO, identifiers should either be case-sensitive for everyone (thus avoiding
 >> the issue, as per XML), or they should use simple (1:1) locale-independent
 >> Unicode case folding. Yes, it's not perfect - e.g. for the Turks and
 >> Lithuanians - but it's simple, predictable, and vastly better and more
 >> inclusive than the ASCII-case-insensitive anachronism.
 >
 > Full case-sensitivity is a non-starter - the fact that we're upgrading
 > some language-defined idents into being user-defined idents (counter
 > style names with @counter-style, property names with Vars, etc.) means
 > that we *must* have at least ASCII-ci, or else the behavior is just
 > plain bizarre.
 >
 > "Quick" unicode case-insensitivity is also full of gotchas.  Sure,
 > Håkon matches HÅKON, but it doesn't match Håkon (a + combining ring),
 > unless we do normalization first as well, which still hasn't been
 > definitively answered.

True; I almost brought up normalization, but was hesitant to open that 
worm-can at the same time. Normalization / canonical equivalence is an 
issue that needs to be addressed somehow regardless of the decision on case.
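
To make the three behaviors under discussion concrete, here is a minimal 
Python sketch. The function names are hypothetical and not taken from any 
spec. Note also that Python's str.casefold() implements full Unicode case 
folding rather than the simple (1:1) folding suggested above; the two 
agree for this example, but not in general (e.g. for "ß").

```python
import unicodedata


def ascii_ci_key(s):
    # ASCII case-insensitive: fold only A-Z, leave everything else untouched.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def unicode_ci_key(s):
    # Locale-independent Unicode case folding, with no normalization.
    # Assumption: full folding (str.casefold) stands in for simple 1:1 folding.
    return s.casefold()


def normalized_ci_key(s):
    # Case folding combined with canonical decomposition (NFD), so that
    # canonically equivalent spellings compare equal.
    return unicodedata.normalize("NFD", unicodedata.normalize("NFD", s).casefold())


precomposed = "H\u00e5kon"   # a-with-ring as a single code point
uppercase = "H\u00c5KON"     # uppercase, precomposed
decomposed = "Ha\u030akon"   # "a" followed by U+030A COMBINING RING ABOVE

for name, key in [("ASCII-ci", ascii_ci_key),
                  ("Unicode-ci", unicode_ci_key),
                  ("Unicode-ci + NFD", normalized_ci_key)]:
    print(name,
          key(precomposed) == key(uppercase),
          key(precomposed) == key(decomposed))
```

Under ASCII-ci neither pair matches; with case folding alone the uppercase 
form matches but the decomposed form does not; only folding plus 
normalization matches both.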

 > ASCII ci isn't great, but it matches the rest of the platform's
 > behavior, where it's case-sensitive everywhere but the ASCII range.

Yes, this is really an issue facing the platform as a whole; it's wider 
than just a CSS issue.

Are we happy to accept that the Web should embed this Anglo-centric 
weirdness, based on text encoding practices from the last century, into 
its core specifications; or do we want to press for a more inclusive 
platform that aims to treat all languages and writing systems on an 
equal footing for authors, as far as the Unicode encoding model permits?

Should we set a higher bar, understanding that much of the platform will 
not meet it, at least for some time; or do we resign ourselves to the 
current low bar as being the best we can ever hope for?

 > (I wish we could do full case-sensitivity and just make all
 > CSS-defined idents be lowercase, as God intended, but I've seen far
 > too many people write "Red" in their stylesheets to think that this is
 > anywhere near possible.)

Sad, but I expect it's true at this point.

Received on Wednesday, 3 October 2012 11:35:31 UTC