case sensitivity of user-defined idents [was: [CSSWG] Minutes Telecon 2013-01-09] from John Daggett on 2013-01-11 (www-style@w3.org from January 2013)

From: John Daggett <jdaggett@mozilla.com>
Date: Thu, 10 Jan 2013 18:40:09 -0800 (PST)
To: www-style@w3.org
Message-ID: <1594015727.10423763.1357872009491.JavaMail.root@mozilla.com>

[argh, forgot to change the subject, reposting, so sorry!!]

From the minutes of the 9 Jan 2013 CSS WG telcon:

> fantasai: the key question to me is: can an author working in a
>           language that isn't ASCII-only pretend that CSS is
>           case-sensitive and use the same case for his identifiers
>           all over - CSS, JS... - and have that work?
> fantasai: if we normalize things somewhere for matching that requires
>           them to know details about our case transformation then I
>           think this is problematic, esp. for ascii-folding
> fantasai: Don't want authors to have to know that these letters in my
>           ident will be lowercased, but others won't

This is an HTML/DOM issue, not a CSS issue, and I don't think it is
the main issue at all.  The main issue is whether there's a use case
for Unicode case-insensitive matching of user-defined identifiers in
CSS, and whether there's a precedent for Unicode case-insensitive
matching in other parts of the web platform.

The testing that I've done [1] makes me think that case-insensitive
matching of non-ASCII text is for the most part *not* done today
(attribute value selector matching is an oddball exception).  That
it's used at all is more a quirk of the HTML4 spec, one that doesn't
exist in the HTML5 spec, where case-sensitive matching is specified
except for a limited set of existing cases where ASCII
case-insensitive matching is used (e.g. HTML tag names).
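
For illustration, the difference between case-sensitive matching and
ASCII case-insensitive matching can be sketched in Python.  The helper
names below are made up for this example and are not taken from any
spec or implementation.

```python
def ascii_lower(s: str) -> str:
    """Lowercase only A-Z; every other code point is left untouched."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def match_case_sensitive(a: str, b: str) -> bool:
    # Plain code point comparison, no folding of any kind.
    return a == b


def match_ascii_ci(a: str, b: str) -> bool:
    # Hypothetical helper: fold A-Z only, then compare.
    return ascii_lower(a) == ascii_lower(b)


print(match_case_sensitive("DIV", "div"))             # False
print(match_ascii_ci("DIV", "div"))                   # True
# U+00C9 (capital E acute) is outside A-Z, so it is not folded.
print(match_ascii_ci("\u00c9T\u00c9", "\u00e9t\u00e9"))  # False
```

Under ASCII folding, only some of the letters in a non-ASCII
identifier are affected, which is the situation fantasai describes in
the minutes quoted above.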

Given that we're discussing *user-defined* identifiers here, where an
author can use whatever casing style they prefer, I have trouble
understanding what the use case is for Unicode case insensitive
matching of these identifiers.  CSS has historically mimicked the case
sensitivity rules of the underlying language, and since HTML5 is moving
towards case-sensitive matching, I think we should resolve to use
case-sensitive matching of user identifiers in CSS.

If there are those who feel strongly that Unicode case-insensitive
matching should be used for user identifiers, I think they need to lay
out the details of the exact matching algorithm used (e.g. C+F
mappings, no pre- or post-normalization) and the use case, so that we
can analyze whether the details of the matching algorithm fit the use
case or not.
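
As a rough sketch of what one such algorithm might look like: Python's
built-in str.casefold() applies Unicode full case folding, which to my
understanding corresponds to the C+F mappings in CaseFolding.txt.  The
function name unicode_ci_match is hypothetical, and no normalization is
applied before or after folding.

```python
def unicode_ci_match(a: str, b: str) -> bool:
    # Full case folding on both sides, then a code point comparison.
    # Deliberately no NFC/NFD normalization before or after folding.
    return a.casefold() == b.casefold()


# Common (C) mapping: U+00C9 folds to U+00E9.
print(unicode_ci_match("\u00c9t\u00e9", "\u00e9t\u00e9"))  # True
# Full (F) mapping: U+00DF (sharp s) folds to "ss".
print(unicode_ci_match("stra\u00dfe", "STRASSE"))          # True
# Without normalization, precomposed and decomposed forms differ.
print(unicode_ci_match("\u00e9", "e\u0301"))               # False
```

The last case is why the normalization question has to be spelled out:
two identifiers that look identical to an author can still fail to
match.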

I think we should wrap this up and decide the issue once and for all
during the telcon next week.

Cheers,

John Daggett

[1] http://lists.w3.org/Archives/Public/www-style/2013Jan/0097.html

Received on Friday, 11 January 2013 02:40:38 UTC