- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Fri, 20 Feb 2004 14:57:41 -0500
- To: Mikko Rantalainen <mira@cc.jyu.fi>
- Cc: WWW Style <www-style@w3.org>
> However, it might make some sense to interpret invalid byte
> sequences in UTF-8 file as invalid characters and proceed instead of
> halting
There are some serious security implications involved in doing this, so I can
understand UAs being unwilling to do it.... If done, it needs to be done
_very_ carefully.
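
To make the two options concrete, here is a minimal Python sketch of "halt" versus "proceed" on an invalid byte. The sample bytes and the function names are invented for illustration. The third variant, which silently drops bad bytes, shows one commonly cited hazard (text that reads differently after decoding than it did to anything inspecting the raw bytes); it is not necessarily the specific concern meant above.

```python
# Illustrative only: three ways a decoder might treat an invalid UTF-8 byte.
# 0xFF can never appear in well-formed UTF-8.
data = b"color: r\xffed"

def decode_strict(raw: bytes):
    """Halt on the first invalid sequence; return None on failure."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None

def decode_lenient(raw: bytes) -> str:
    """Substitute U+FFFD for each invalid sequence and proceed."""
    return raw.decode("utf-8", errors="replace")

def decode_dropping(raw: bytes) -> str:
    """Silently drop invalid bytes. The bytes b'r\\xffed' turn into 'red',
    which a filter looking at the raw bytes would not have matched."""
    return raw.decode("utf-8", errors="ignore")
```

Substituting a visible replacement character keeps the damage detectable, whereas dropping bytes can splice together a token that was never in the input.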
> I REALLY think we shouldn't specify new stupid rules simply because
> there already exist some stupid tutorials. Fix the stupid tutorials
> and specify good rules.
The tutorials are not the problem, really. The massive installed base of
stylesheets that would break if we suddenly started treating them as UTF-8 is
the problem. Unfortunately, quirks mode does not help much, since a number of
pages out there are in standards mode in modern.... and do not have sheets
properly labeled. A good fraction of these are not in UTF-8.
Put another way, how many sheets out there _do_ have the charset labeled?
Other than testcases written to test algorithms such as the ones discussed
here, the only ones I've seen have been those written by people _very_
knowledgeable about CSS (people who regularly post on this list). And even
then, they tend to be unlabeled [1].
In other words, the first UA to stop guessing would suddenly fail to render
every single site that triggered standards mode.
Boris
[1] A brief survey of randomly selected sites that would have charsets set
properly if any would:
1) http://ln.hixie.ch/ -- no HTTP header, no @charset rule, no attribute on
linking element
2) http://dbaron.org/log/ -- charset set in HTTP header. Bravo!
3) http://bjoern.hoehrmann.de/ -- links to
http://www.w3.org/StyleSheets/Core/Swiss which has no HTTP header and no
@charset rule. Doesn't have a charset attribute on the linking element.
4) http://tantek.com/log/ -- no HTTP header, no @charset rule, no attribute on
linking element
5) http://weblogs.mozillazine.org/bz/ -- no HTTP header, no @charset rule, no
attribute on linking element
http://web.mit.edu/bzbarsky/www/ -- same
(OK, so using myself as an example is a little questionable).
6) http://www.w3.org/ -- no HTTP header, no @charset rule, no
attribute on linking element
7) http://www.cc.jyu.fi/~mira/ -- no HTTP header, no @charset rule, no
attribute on linking element
Needless to say, every single one of these pages triggers standards mode (in
Mozilla, at least, but I suspect they would in most browsers). The only one
that would use the right charset is David's.
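
The three checks used in the survey above (HTTP header, @charset rule, charset attribute on the linking element) can be sketched in Python roughly as follows. The function names, the order in which the sources are consulted, and the simplified @charset pattern are assumptions made for illustration, not a description of what any UA does.

```python
import re
from email.message import Message
from typing import Optional, Tuple

def charset_from_content_type(value: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    if not value:
        return None
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()  # lowercased, or None if absent

# Simplified: only matches an ASCII-compatible @charset rule at byte 0.
_AT_CHARSET = re.compile(rb'@charset "([A-Za-z0-9._:+-]+)";')

def charset_from_at_rule(sheet: bytes) -> Optional[str]:
    """Return the name in a leading @charset rule, if there is one."""
    m = _AT_CHARSET.match(sheet)
    return m.group(1).decode("ascii").lower() if m else None

def labeled_charset(content_type: Optional[str],
                    sheet: bytes,
                    link_charset_attr: Optional[str]) -> Optional[Tuple[str, str]]:
    """Return (source, charset) for the first label found, else None.

    None corresponds to "no HTTP header, no @charset rule, no attribute
    on linking element", i.e. the UA is left to guess.
    """
    cs = charset_from_content_type(content_type)
    if cs:
        return ("http", cs)
    cs = charset_from_at_rule(sheet)
    if cs:
        return ("@charset", cs)
    if link_charset_attr and link_charset_attr.strip():
        return ("link", link_charset_attr.strip().lower())
    return None
```

For example, `labeled_charset("text/css", b"body{}", None)` returns `None`, which is the situation most of the surveyed sheets are in.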
Now I did not look at the actual content of the sheets, so it may be possible
they are all ASCII and hence would actually work as UTF-8....
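
The ASCII caveat is easy to check mechanically. A small Python sketch (the sample sheets and helper names are made up for illustration): a pure-ASCII sheet decodes to the same text whether it is treated as UTF-8 or as ISO-8859-1, while a sheet with a single Latin-1 byte is not even well-formed UTF-8.

```python
def is_pure_ascii(sheet: bytes) -> bool:
    """True if every byte is below 0x80 (requires Python 3.7+)."""
    return sheet.isascii()

def decodes_as_utf8(sheet: bytes) -> bool:
    """True if the bytes are well-formed UTF-8."""
    try:
        sheet.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# Hypothetical sample sheets.
ascii_sheet = b"body { color: red }"
latin1_sheet = 'q { quotes: "\u00ab" "\u00bb" }'.encode("iso-8859-1")

# ASCII is a subset of both encodings, so the label does not matter here.
same_text = ascii_sheet.decode("utf-8") == ascii_sheet.decode("iso-8859-1")
```

Only sheets for which `is_pure_ascii` is false are at risk from a switch to a UTF-8 default.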
--
We are all agreed that your theory is crazy. The
question which divides us is whether it is crazy enough
to have a chance of being correct. My own feeling is
that it is not crazy enough.
-- Niels Bohr
Received on Friday, 20 February 2004 14:57:47 UTC