- From: Martin Duerst <duerst@it.aoyama.ac.jp>
- Date: Fri, 16 Nov 2007 14:09:51 +0900
- To: fantasai <fantasai.lists@inkedblade.net>, www-international@w3.org
- Cc: www-style@w3.org, "'WWW International'" <www-international@w3.org>
I tend to go with what Fantasai says below, and what Anne also
seems to have expressed: The case sensitivity needs in CSS are
very limited. As far as I understand, we have three cases:

- CSS keywords: These are US-ASCII only, and therefore the simplest
  case sensitivity is okay. Actually, 99% or more of the stylesheets
  that I have seen use only lower case, so it wouldn't have been
  a great problem if these had been defined case-sensitive originally,
  but of course it's impossible to go back and change things now.

- Identifiers in the markup languages that CSS works with (e.g.
  HTML and XML element and attribute names): Here CSS says that
  case sensitivity depends on the language involved. For traditional
  HTML, element names are case-insensitive, therefore CSS treats them
  as case-insensitive. For XML, that's the other way round.
  The only thing that CSS can do here reasonably is to follow
  whatever the target language specifies, both for the basic
  question of case-sensitive or not as well as for the details
  regarding non-ASCII characters, if applicable. I guess it would
  be good that the CSS spec explicitly points out that such
  details may vary depending on the target language.
  Note that the question of case sensitivity isn't simply a
  per-language thing; it's easily possible that there are variations
  within a language. CSS2 already describes this at
  http://www.w3.org/TR/REC-CSS2/syndata.html#q4, explaining that
  ids, classes, and font names are case-sensitive even in
  traditional HTML.

- Identifiers within CSS. These include cases such as namespace
  prefixes and counter names inside CSS. Ideally, these should
  just work case-sensitive; I don't think it's asking too much
  from stylesheet writers to use the same case for all occurrences
  of a specific counter name. If that's not possible for legacy
  reasons (e.g. stylesheets that indeed use counter names and
  friends with haphazard casing), then something like
  'case-insensitive for US-ASCII, case sensitive for the rest',
  even though it sounds terribly ugly, may be the best solution.

Regards,    Martin.

At 08:45 07/11/16, fantasai wrote:
>
>Addison Phillips wrote:
>>
>>> I find that the basic Latin letters do match each other and nothing
>>> else, if you ignore the language-specific foldings, with one exception.
>>> U+212A KELVIN SIGN, which looks exactly like "K" and shouldn't exist
>>> anyhow (it's compatibility equivalent to a proper "K") is case-folded
>>> to "k". I consider that to come under the heading of the Right Thing.
>> Compatibility characters always present a problem of this sort.
>> I think this is also the Right Thing.

Compatibility characters should not be honored by trying to match them
to others. The best thing here is to isolate and quarantine them so
that they die out :-(.

>>> It's also true that some ligatures are case-folded to their spelled out
>>> equivalents: for example, U+FB00 LATIN SMALL LIGATURE FF is case-folded
>>> to simple "ff".
>> This is actually a Good Thing too.

No, for CSS it definitely would be overkill.

>It's a Good Thing for natural-language matching and search results. It is
>imho not a Good Thing for defining case-insensitivity for keywords in a
>computer language. Since CSS keywords are all limited to the ASCII range,
>it should be possible to reliably match against CSS keywords with only
>ASCII case-insensitivity. Throwing in random other characters into the mix
>can cause confusion and possibly also result in security holes. I believe
>the potential problems in that respect outweigh the convenience of
>case-insensitivity for non-Latin user-defined identifiers.

Two little remarks here:

- There are not too many non-Latin scripts that have cases.
  These are usually simpler than Latin itself, because they don't
  have issues such as the Turkish/Azeri I/i. So this is a
  non-ASCII, but very much Latin script, issue.

- Case insensitivity is a user convenience mostly in cases where
  case conventions are not well established, and where users
  are often guessing identifiers, or have to remember them for
  repeated use. The examples we are really dealing with, such as
  counter names, are very local, and aren't used on a regular
  basis by plain end users. For such cases, the 'convenience'
  issue is of much lower importance.

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
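
To make the difference between the two matching rules discussed above
concrete, here is a minimal sketch in Python. It is an illustration
only, not text from any CSS specification: the function names
ascii_lower, ascii_ci_equal and unicode_ci_equal are hypothetical, and
Python's str.casefold() is assumed as a stand-in for full Unicode case
folding.

```python
def ascii_lower(s: str) -> str:
    """Lower-case only the letters A-Z; leave every other character as-is."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def ascii_ci_equal(a: str, b: str) -> bool:
    """'Case-insensitive for US-ASCII, case-sensitive for the rest'."""
    return ascii_lower(a) == ascii_lower(b)


def unicode_ci_equal(a: str, b: str) -> bool:
    """Comparison under full Unicode case folding (assumed: str.casefold)."""
    return a.casefold() == b.casefold()


# ASCII-only identifiers behave the same under both rules.
print(ascii_ci_equal("Chapter", "CHAPTER"))    # True
print(unicode_ci_equal("Chapter", "CHAPTER"))  # True

# U+212A KELVIN SIGN folds to "k" under Unicode folding only.
print(ascii_ci_equal("\u212A", "k"))           # False
print(unicode_ci_equal("\u212A", "k"))         # True

# U+FB00 LATIN SMALL LIGATURE FF folds to "ff" under Unicode folding only.
print(ascii_ci_equal("\uFB00", "ff"))          # False
print(unicode_ci_equal("\uFB00", "ff"))        # True
```

Under the ASCII-only rule, a keyword or identifier made of ASCII letters
can only ever match ASCII strings, which is the property fantasai's
argument relies on.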
Received on Friday, 16 November 2007 05:39:50 UTC