- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Fri, 20 Feb 2004 14:57:41 -0500
- To: Mikko Rantalainen <mira@cc.jyu.fi>
- Cc: WWW Style <www-style@w3.org>
> However, it might make some sense to interpret invalid byte
> sequences in UTF-8 file as invalid characters and proceed instead of
> halting

There are some serious security implications involved in doing this, so I can understand UAs being unwilling to do it.... If done, it needs to be done _very_ carefully.

> I REALLY think we shouldn't specify new stupid rules simply because
> there already exist some stupid tutorials. Fix the stupid tutorials
> and specify good rules.

The tutorials are not the problem, really. The massive installed base of stylesheets that would break if we suddenly started treating them as UTF-8 is a problem. Unfortunately, quirks mode does not help much, since a number of pages out there are in standards mode in modern.... and do not have sheets properly labeled. A good fraction of these are not in UTF-8.

Put another way, how many sheets out there _do_ have the charset labeled? Other than testcases written to test algorithms such as the ones discussed here, the only ones I've seen have been those written by people _very_ knowledgeable about CSS (people who regularly post on this list). And even then, they tend to be unlabeled [1].

In other words, the first UA to stop guessing would suddenly fail to render every single site that triggered standards mode.

Boris

[1] A brief survey of randomly selected sites that would have charsets set properly if any would:

1) http://ln.hixie.ch/ -- no HTTP header, no @charset rule, no attribute on linking element
2) http://dbaron.org/log/ -- charset set in HTTP header. Bravo!
3) http://bjoern.hoehrmann.de/ -- links to http://www.w3.org/StyleSheets/Core/Swiss which has no HTTP header and no @charset rule. Doesn't have a charset attribute on the linking element.
4) http://tantek.com/log/ -- no HTTP header, no @charset rule, no attribute on linking element
5) http://weblogs.mozillazine.org/bz/ -- no HTTP header, no @charset rule, no attribute on linking element
   http://web.mit.edu/bzbarsky/www/ -- same (OK, so using myself as an example is a little questionable).
6) http://www.w3.org/ -- no HTTP header, no @charset rule, no attribute on linking element
7) http://www.cc.jyu.fi/~mira/ -- no HTTP header, no @charset rule, no attribute on linking element

Needless to say, every single one of these pages triggers standards mode (in Mozilla, at least, but I suspect they would in most browsers). The only one that would use the right charset is David's. Now I did not look at the actual content of the sheets, so it may be possible they are all ASCII and hence would actually work as UTF-8....

-- 
We are all agreed that your theory is crazy. The question which divides us is whether it is crazy enough to have a chance of being correct. My own feeling is that it is not crazy enough.
-- Niels Bohr
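For anyone who wants to repeat the survey in [1], the three checks (HTTP header, @charset rule, attribute on the linking element) plus the "is it all ASCII or otherwise valid UTF-8" question could be sketched roughly as below. This is only an illustration, not how any UA does it: the function names are made up for this example, the lookup order is an assumption of the sketch, and the @charset match is deliberately strict (it only recognizes the rule at the very start of the sheet).

```python
import re
from typing import Optional


def charset_from_content_type(header: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any."""
    if not header:
        return None
    match = re.search(r'charset\s*=\s*"?([^\s";]+)', header, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_at_rule(sheet: bytes) -> Optional[str]:
    """Return the encoding named by an @charset rule at the start of the sheet."""
    match = re.match(rb'@charset "([^"]*)";', sheet)
    return match.group(1).decode("ascii", "replace").lower() if match else None


def declared_charset(
    content_type: Optional[str],
    sheet: bytes,
    link_charset: Optional[str],
) -> Optional[str]:
    """First label found; the order used here is an assumption of this sketch."""
    return (
        charset_from_content_type(content_type)
        or charset_from_at_rule(sheet)
        or (link_charset.lower() if link_charset else None)
    )


def decodes_as_utf8(sheet: bytes) -> bool:
    """True if the bytes are valid UTF-8 (pure ASCII always is)."""
    try:
        sheet.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

A sheet for which declared_charset() returns None is unlabeled in the sense used above; if decodes_as_utf8() is also False for it, it is one of the sheets that would break if a UA stopped guessing and assumed UTF-8.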
Received on Friday, 20 February 2004 14:57:47 UTC