
Re: [CSS21] Case-insensitivity not defined

From: Addison Phillips <addison@yahoo-inc.com>
Date: Sun, 18 Nov 2007 11:59:10 -0800
Message-ID: <4740998E.5080509@yahoo-inc.com>
To: Martin Duerst <duerst@it.aoyama.ac.jp>
CC: fantasai <fantasai.lists@inkedblade.net>, www-international@w3.org, www-style@w3.org

Martin Duerst wrote:
> I tend to go with what Fantasai says below, and what Anne also
> seems to have expressed: The case sensitivity needs in CSS are
> very limited. As far as I understand, we have three cases:

I agree. But if we describe it at all, we should describe it correctly. 
Either it is a special, ASCII-only case or, if we allow non-ASCII into 
the tent, we should describe it correctly. CSS and HTML currently don't 
define "case-insensitivity" other than by example. You have to read the 
documents extremely closely to ascertain if all cases where 
case-insensitivity is applied are limited to ASCII. This should not be 
the case. The CI definitions should spell this out. Otherwise 
implementers might inadvertently call locale-sensitive "strcasecmp" type 
functions and not understand why their code fails in certain locales.
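
To illustrate the kind of explicit definition meant here, a minimal
sketch in Python follows. It is not taken from any specification; the
names ascii_lower and ascii_ci_equal are hypothetical. It folds only
A-Z onto a-z and never consults the locale, so the result is the same
everywhere.

```python
def ascii_lower(s: str) -> str:
    """Map only U+0041..U+005A (A-Z) to a-z; leave everything else alone."""
    return "".join(
        chr(ord(ch) + 0x20) if "A" <= ch <= "Z" else ch
        for ch in s
    )


def ascii_ci_equal(a: str, b: str) -> bool:
    """ASCII case-insensitive comparison, independent of any locale."""
    return ascii_lower(a) == ascii_lower(b)


# ASCII letters match regardless of case.
assert ascii_ci_equal("COLOR", "color")
# Dotless i (U+0131) is not an ASCII letter, so it never matches "I" or "i".
assert not ascii_ci_equal("TITLE", "t\u0131tle")
# KELVIN SIGN (U+212A) is likewise left untouched and does not match "k".
assert not ascii_ci_equal("\u212A", "k")
```

By contrast, locale-sensitive lowercasing under a Turkish locale can
map "I" to dotless "ı", which is exactly the failure described above.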

> 
> - CSS keywords: These are US-ASCII only, and therefore the
>   simplest case sensitivity is okay. 

I agree with this.

> 
> - Identifiers in the markup languages that CSS works with
>   (e.g. HTML and XML element and attribute names): Here
>   CSS says that case sensitivity depends on the language
>   involved. For traditional HTML, element names are case-
>   insensitive, therefore CSS treats them as case-insensitive.

Yes, but the element names are pre-defined. As noted, the 
case-insensitive attributes do not include 'class', 'id' and so forth.

>   For XML, that's the other way round. The only thing that
>   CSS can do here reasonably is to follow whatever the
>   target language specifies, both for the basic question of
>   case-sensitive or not as well as for the details regarding
>   non-ASCII characters, if applicable. 

Unfortunately, HTML's definition of case-insensitive---the entire 
definition---is:

--
The value is case-insensitive (i.e., user agents interpret "a" and "A" 
as the same).
--

(see: http://www.w3.org/TR/html4/types.html#h-6.1)

... which is insufficient to know what qualifies as a conforming 
implementation.

> 
> - Identifiers within CSS. These include cases such as
>   namespace prefixes and counter names inside CSS.
>   Ideally, these should just work case-sensitive; I don't
>   think it's asking too much from stylesheet writers to
>   use the same case for all occurrences of a specific
>   counter name. If that's not possible for legacy reasons
>   (e.g. stylesheets that indeed use counter names and
>   friends with haphazard casing), then something like
>   'case-insensitive for US-ASCII, case sensitive for
>   the rest', even though it sounds terribly ugly, may
>   be the best solution.

I agree. But since CSS currently does NOT make these case-sensitive, we 
need to specify what does happen.

The problem here is that many implementers are likely to call extant 
case-insensitive string comparison functions (or perform a 
locale-sensitive operation, such as tolower) rather than implementing a 
specific comparison or taking care to avoid the locale problem.

So I guess, in summary, what I'm suggesting is either:

1. Change the case-sensitivity rules to remove the need for any 
non-ASCII case-insensitive comparisons, and then specify ASCII 
case-insensitivity for CSS keywords and the like.

2. Change the case-sensitivity rules to reference Unicode case folding 
(section 3.13 of the Unicode Standard, IIRC).

If we specify ASCII-only case-insensitivity, it should be abundantly 
clear in the text that this is not an internationalization oversight but 
a deliberate design decision.
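
The practical difference between the two options can be sketched in
Python. This is an illustration under assumptions, not proposed spec
text: equal_ascii_ci and equal_unicode_ci are hypothetical names, and
Python's built-in str.casefold() stands in for Unicode case folding
(it applies no normalization).

```python
import string

# Translation table that folds only A-Z onto a-z (option 1).
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def equal_ascii_ci(a: str, b: str) -> bool:
    """Option 1: case-insensitive for ASCII, case-sensitive for the rest."""
    return a.translate(_ASCII_FOLD) == b.translate(_ASCII_FOLD)


def equal_unicode_ci(a: str, b: str) -> bool:
    """Option 2: compare after Unicode case folding (no normalization)."""
    return a.casefold() == b.casefold()


# Both options agree on pure-ASCII identifiers.
assert equal_ascii_ci("Counter", "COUNTER")
assert equal_unicode_ci("Counter", "COUNTER")

# They diverge as soon as non-ASCII letters are involved.
assert not equal_ascii_ci("\u00c9t\u00e9", "\u00e9t\u00e9")   # "Été" vs "été"
assert equal_unicode_ci("\u00c9t\u00e9", "\u00e9t\u00e9")
assert not equal_ascii_ci("Stra\u00dfe", "STRASSE")           # "Straße"
assert equal_unicode_ci("Stra\u00dfe", "STRASSE")
```

Neither function depends on the process locale, which is the property
that matters for either option.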


> - Case insensitivity is a user convenience mostly in cases where
>   case conventions are not well established, and where users are
>   often guessing identifiers, or have to remember them for repeated
>   use. The examples we are really dealing with, such as counter
>   names, are very local, and aren't used on a regular basis by
>   plain end users. For such cases, the 'convenience' issue is of
>   much lower importance.

Attribute values were my main concern. But these turn out mostly to be a 
non-issue.

Regards,

Addison

-- 
Addison Phillips
Globalization Architect -- Yahoo! Inc.
Chair -- W3C Internationalization Core WG

Internationalization is an architecture.
It is not a feature.
Received on Sunday, 18 November 2007 19:59:45 GMT