Re: [CSS21] response to issue 115 (and 44)

On Feb 21, 2004, at 00:26, Bert Bos wrote:

>  4) If all else fails, assume UTF-8.

Why not windows-1252 (with the few undefined bytes mapped to 
*something* so that all byte streams can be converted to some 
"characters")?

From a pragmatic point of view it would make sense to acknowledge that 
people who aren't clued about character encodings are more likely to 
serve style sheets that work if treated as windows-1252 than to serve 
UTF-8. Also, for HTML, browsers tend to default to windows-1252 
regardless of the specs.

This approach would allow non-ASCII comments to be safely ignored--and 
usually the non-ASCII characters in style sheets occur in comments, 
because generated content isn't used much and few people are 
adventurous enough to use non-ASCII for identifiers. Anyway, it's just 
plain stupid to use non-ASCII outside comments in a style sheet that 
doesn't have a character encoding label and doesn't have a BOM, so in 
the relatively rare cases where this heuristic fails, the author would 
have only him/herself to blame.
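
As a rough sketch of the heuristic (Python; the function names are made 
up for illustration and are not from the CSS 2.1 draft, and mapping the 
undefined bytes to U+FFFD is just one possible choice of "something"):

```python
import codecs
import re


def decode_unlabeled_css(data: bytes) -> str:
    """Hypothetical last-resort decoder for a style sheet with no label."""
    # A UTF-8 BOM is an explicit signal, so honour it before guessing.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8", errors="replace")
    # Otherwise assume windows-1252. Python's cp1252 codec leaves five
    # bytes undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D); errors="replace"
    # maps them to U+FFFD, so every byte stream decodes to something.
    return data.decode("cp1252", errors="replace")


def strip_comments(css: str) -> str:
    """Remove /* ... */ comments, as a CSS tokenizer would skip them."""
    return re.sub(r"/\*.*?\*/", "", css, flags=re.S)


# The same sheet saved as UTF-8 and as windows-1252, neither labeled.
source = "/* v\u00e4ri */ p { color: red }"
as_utf8 = decode_unlabeled_css(source.encode("utf-8"))
as_1252 = decode_unlabeled_css(source.encode("cp1252"))
```

The UTF-8 copy decodes to mojibake inside the comment, but the bytes of 
a UTF-8 multibyte sequence are all 0x80 or above, so they can never form 
a stray "*/". After the comments are dropped, both copies yield the same 
rules.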

Using this heuristic also in case 3 instead of looking at the linking 
document would improve the cacheability of parsed style sheets with 
negligible actual breakage.

-- 
Henri Sivonen
hsivonen@iki.fi
http://iki.fi/hsivonen/

Received on Monday, 23 February 2004 08:59:51 UTC