
Re: [CSS21] Case-insensitivity not defined

From: Mark Davis <mark.davis@icu-project.org>
Date: Thu, 15 Nov 2007 16:01:59 -0800
Message-ID: <30b660a20711151601x190a8d1cl9170d9acb17d0bfe@mail.gmail.com>
To: fantasai <fantasai.lists@inkedblade.net>
Cc: www-international@w3.org, www-style@w3.org
If CSS identifiers do not exclude compatibility characters (such as Kelvin
and the ff ligature), then lowercasing is the *least* of the security issues
to worry about.
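
As a rough illustration of the point about compatibility characters, here is a minimal Python sketch (not part of the original thread) that uses the standard `unicodedata` module to flag identifiers that change under NFKC normalization. The helper name `is_nfkc_stable` is hypothetical, and using NFKC as the filter is an assumption made for illustration, not something CSS 2.1 specifies.

```python
import unicodedata


def is_nfkc_stable(identifier: str) -> bool:
    """Return True if NFKC normalization leaves the identifier unchanged.

    Hypothetical helper: an identifier containing characters such as
    U+212A KELVIN SIGN or U+FB00 LATIN SMALL LIGATURE FF is not stable.
    """
    return unicodedata.normalize("NFKC", identifier) == identifier


kelvin = "\u212A"        # KELVIN SIGN, renders like "K"
ff_ligature = "\uFB00"   # LATIN SMALL LIGATURE FF

# Both characters normalize to plain ASCII letters under NFKC.
kelvin_nfkc = unicodedata.normalize("NFKC", kelvin)
ff_nfkc = unicodedata.normalize("NFKC", ff_ligature)
```

An identifier that fails this check looks like a plain ASCII name to a human reader but compares as a different string, which is the kind of confusion the documents cited below address.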

Cf.

UAX 31 Identifier and Pattern Syntax <http://www.unicode.org/reports/tr31/>
  See also Proposed Update: Unicode Identifier and Pattern Syntax <http://www.unicode.org/reports/tr31/tr31-8.html>

UTR 36 Unicode Security Considerations <http://www.unicode.org/reports/tr36/>
  See also Proposed Update <http://www.unicode.org/reports/tr36/tr36-6.html>

Mark

On Nov 15, 2007 3:45 PM, fantasai <fantasai.lists@inkedblade.net> wrote:

>
> Addison Phillips wrote:
> >
> >> I find that the basic Latin letters do match each other and nothing
> >> else, if you ignore the language-specific foldings, with one exception.
> >> U+212A KELVIN SIGN, which looks exactly like "K" and shouldn't exist
> >> anyhow (it's compatibility equivalent to a proper "K") is case-folded
> >> to "k".  I consider that to come under the heading of the Right Thing.
> >
> > Compatibility characters always present a problem of this sort. I think
> > this is also the Right Thing.
> >
> >> It's also true that some ligatures are case-folded to their spelled out
> >> equivalents:  for example, U+FB00 LATIN SMALL LIGATURE FF is case-folded
> >> to simple "ff".
> >
> > This is actually a Good Thing too.
>
> It's a Good Thing for natural-language matching and search results. It is
> imho not a Good Thing for defining case-insensitivity for keywords in a
> computer language. Since CSS keywords are all limited to the ASCII range,
> it should be possible to reliably match against CSS keywords with only
> ASCII case-insensitivity. Throwing in random other characters into the mix
> can cause confusion and possibly also result in security holes. I believe
> the potential problems in that respect outweigh the convenience of
> case-insensitivity for non-Latin user-defined identifiers.
>
> ~fantasai
>
>
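
The difference between the two matching strategies discussed in the quoted text can be sketched in Python as follows. This is an illustration only: `ascii_lower`, `matches_ascii`, and `matches_casefold` are hypothetical helper names, `str.casefold()` stands in for Unicode case folding, and the keyword strings are just sample inputs.

```python
def ascii_lower(text: str) -> str:
    """Lowercase only A-Z; every other character is left untouched."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch for ch in text
    )


def matches_ascii(identifier: str, keyword: str) -> bool:
    """ASCII case-insensitive comparison."""
    return ascii_lower(identifier) == ascii_lower(keyword)


def matches_casefold(identifier: str, keyword: str) -> bool:
    """Comparison using Python's Unicode case folding."""
    return identifier.casefold() == keyword.casefold()


# "blin" followed by U+212A KELVIN SIGN: visually close to "blinK".
kelvin_spelling = "blin\u212A"

ascii_result = matches_ascii(kelvin_spelling, "blink")
casefold_result = matches_casefold(kelvin_spelling, "blink")
ligature_result = matches_casefold("o\uFB00", "off")
```

Under ASCII-only matching the Kelvin spelling is simply not the keyword, while under full case folding it is accepted as one; the ff ligature behaves the same way.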


-- 
Mark
Received on Friday, 16 November 2007 00:02:17 GMT