case sensitivity and the OM from John Daggett on 2012-12-18 (www-style@w3.org from December 2012)

From: John Daggett <jdaggett@mozilla.com>
Date: Mon, 17 Dec 2012 16:37:34 -0800 (PST)
To: www-style list <www-style@w3.org>
Message-ID: <5327382.6797238.1355791054905.JavaMail.root@mozilla.com>

From fantasai's comments regarding case sensitivity on the WG list:

> 6. If you discuss case-sensitivity, here are my positions:
> 
>  a. I am ok with ASCII-insensitivity if it is just
>     about matching.
> 
>  b. I object to ASCII-folding if this is used anywhere
>     in the OM output as a normalization of author input.
> 
>     In other words, the author must be able to pretend,
>     as long as unique idents in his mind are
>     case-insensitively unique, that CSS is
>     case-sensitive, and have that Just Work.

This already happens; it is why <mar[kelvin]>, where [kelvin] is
U+212A KELVIN SIGN, ends up in the DOM as <mark> [1].

It's an artifact of ASCII case insensitivity not being defined
precisely.  Huh, you say?  Seems like a no-brainer but ASCII case
insensitivity can be defined as either (1) lowercase the characters
[A-Z] in both strings and compare characters or (2) map all Unicode
characters that have lowercase mappings in the ASCII range to their
lowercase mappings in both strings and compare characters.

The latter definition leads to all sorts of funky (silly?) edge-case
behavior, like the kelvin sign or dotted capital I matching ASCII
characters in tagnames.  I think the original intent of ASCII case
insensitivity was more likely (1).  The latter algorithm (2) is
probably more an artifact of library usage or someone trying to be
"complete" in their string library implementation.
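
To make the difference between the two definitions concrete, here is a
minimal Python sketch. The function names are hypothetical, and Python's
str.lower() is only standing in for "a Unicode lowercase mapping"; other
string libraries may map individual characters differently.

```python
KELVIN = "\u212A"  # U+212A KELVIN SIGN, whose lowercase mapping is ASCII "k"


def fold_ascii_only(s: str) -> str:
    """Definition (1): lowercase only the characters A-Z."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def fold_unicode_to_ascii(s: str) -> str:
    """Definition (2): lowercase any character whose lowercase mapping
    is a single ASCII character; leave everything else untouched."""
    out = []
    for c in s:
        low = c.lower()
        # Assumption: only single-character ASCII results count as a match.
        out.append(low if len(low) == 1 and ord(low) < 128 else c)
    return "".join(out)


def matches_1(a: str, b: str) -> bool:
    return fold_ascii_only(a) == fold_ascii_only(b)


def matches_2(a: str, b: str) -> bool:
    return fold_unicode_to_ascii(a) == fold_unicode_to_ascii(b)


if __name__ == "__main__":
    tag = "mar" + KELVIN
    print(matches_1("MARK", "mark"))  # plain ASCII, both definitions agree
    print(matches_1(tag, "mark"))     # definition (1): kelvin sign is left alone
    print(matches_2(tag, "mark"))     # definition (2): kelvin sign folds to "k"
```

Under (1) the kelvin sign never compares equal to "k", so a tag name
containing it stays distinct from <mark>; under (2) the two collapse
into the same name.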

Cheers,

John Daggett

[1] http://people.mozilla.org/~jdaggett/tests/casesensitivity-tagnames.html

Received on Tuesday, 18 December 2012 00:38:06 UTC