Re: case sensitivity and the OM

On Dec 17, 2012, at 6:06 PM, L. David Baron wrote:

> On Monday 2012-12-17 16:37 -0800, John Daggett wrote:
>>> From fantasai's comments regarding case sensitivity on WG list:
>> 
>>> 6. If you discuss case-sensitivity, here are my positions:
>>> 
>>> 	a. I am ok with ASCII-insensitivity if it is just
>>> 	   about matching.
>>> 
>>> 	b. I object to ASCII-folding if this is used anywhere
>>> 	   in the OM output as a normalization of author input.
>>> 
>>> 	   In other words, the author must be able to pretend,
>>> 	   as long as unique idents in his mind are
>>> 	   case-insensitively unique, that CSS is
>>> 	   case-sensitive, and have that Just Work.
> 
> I disagree with fantasai's second objection.  I think there are many
> reasons to want author input to be normalized, such as:
> 
>  (1) it reduces the memory usage and performance requirements on
>  implementations; we can store an 8-bit integer for "font-family"
>  instead of the string "font-family".
> 
>  (2) it reduces the risk of code that's sensitive to things it
>  shouldn't be sensitive to.  For example, authors might write
>  javascript that looks like:
> 
>    for (var i = 0; i < declaration.length; ++i) {
>      var prop = declaration[i];
>      if (prop == "font-family") {
>        // do something
>      } else if (prop == "font-weight") {
>        // do something else
>      }
>    }
> 
>  Today, that code works even when the CSS contains "FONT-FAMILY:
>  SANS-SERIF"; changing this would probably break existing Web
>  content.

Right, but it fails if the author does something like:

declaration.setProperty("FONT-FAMILY", "serif", "");
declaration.setProperty("FONT-WEIGHT", "700", "");
for (var i = 0; i < declaration.length; ++i) {
  var prop = declaration[i];
  if (prop == "FONT-FAMILY") {
    // do something
  } else if (prop == "FONT-WEIGHT") {
    // do something else
  }
}

I think the best way to handle this is to add an API that lets script match identifiers using whatever the proper matching rule for that identifier is, something like:

declaration.setProperty("FONT-FAMILY", "serif", "");
declaration.setProperty("FONT-WEIGHT", "700", "");
for (var i = 0; i < declaration.length; ++i) {
  var prop = declaration[i];
  if (CompareCSSIdent(prop, "FONT-FAMILY")) {
    // do something
  } else if (CompareCSSIdent(prop, "FONT-WEIGHT")) {
    // do something else
  }
}

The comparison function can then do whatever the proper matching algorithm expects, be it case-sensitivity, ASCII case-insensitivity, or some form of Unicode case-insensitivity (possibly with normalization). We can have different functions for each type of matching. This also allows the browser to store only a normalized form.
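
As a rough sketch of what the ASCII case-insensitive variant might do, assuming only A-Z are folded and everything else is compared code unit by code unit (CompareCSSIdent is the hypothetical name from above, and asciiLowerCase is a helper invented for illustration; neither exists in any OM today):

```javascript
// Hypothetical helper: fold only ASCII A-Z to a-z.
// Non-ASCII characters are left alone, unlike String.prototype.toLowerCase.
function asciiLowerCase(s) {
  return s.replace(/[A-Z]/g, function (c) {
    return String.fromCharCode(c.charCodeAt(0) + 32);
  });
}

// Hypothetical API: true when the two identifiers match
// ASCII case-insensitively.
function CompareCSSIdent(a, b) {
  return asciiLowerCase(a) === asciiLowerCase(b);
}
```

The explicit A-Z fold is deliberate: toLowerCase() would also map characters such as U+212A KELVIN SIGN to "k", which is Unicode folding rather than ASCII-only matching.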

We should probably also add an API that normalizes the string appropriately, such as:

var ident = NormalizeCSSIdent("FONT-weight");
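
A minimal sketch of how that could behave and be used, assuming the normalized form is the ASCII-lowercased identifier (that choice, the function body, and the classify helper are all assumptions for illustration, not anything specified):

```javascript
// Hypothetical API: return the normalized form of a CSS identifier.
// Assumption: normalization means folding ASCII A-Z to a-z only.
function NormalizeCSSIdent(ident) {
  return ident.replace(/[A-Z]/g, function (c) {
    return String.fromCharCode(c.charCodeAt(0) + 32);
  });
}

// Illustrative helper: normalize once, then use plain string comparison.
function classify(prop) {
  switch (NormalizeCSSIdent(prop)) {
    case "font-family":
      return "family";
    case "font-weight":
      return "weight";
    default:
      return "other";
  }
}

var ident = NormalizeCSSIdent("FONT-weight"); // "font-weight"
```

Normalizing once up front lets script keep using ordinary == or switch comparisons, whichever case the author originally typed.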

Peter

Received on Tuesday, 18 December 2012 20:28:24 UTC