- From: Chris Lilley <chris@w3.org>
- Date: Thu, 19 Feb 2004 00:29:09 +0100
- To: Boris Zbarsky <bzbarsky@MIT.EDU>
- Cc: Bert Bos <bert@w3.org>, Tex Texin <tex@i18nguy.com>, www-style@w3.org
On Thursday, February 19, 2004, 12:17:56 AM, Boris wrote:

>> Because most stylesheets out there are in what? Most are in US-ASCII,
>> I would guess, since the entire syntax of CSS uses US-ASCII. The only
>> opportunities to have anything else are replaced content in :before and
>> :after, which is not too common in practice since it doesn't work in
>> MSIE/Win.

BZ> You forgot these niggling things little developers tend to put in code
BZ> (including stylesheets) to make it comprehensible -- comments. Lots and lots
BZ> of sheets have comments. Copyright notices, especially. With people's names.
BZ> Which tend to NOT be ascii, often enough, except in the US.

Good point. I had not considered comments. On the other hand

BZ> In the wild, most stylesheets that are not associated with US websites are
BZ> either ISO-8859-1 or Shift_JIS, from what I've seen. I would be hard pressed
BZ> to estimate relative frequencies of those two as compared to us-ascii.

Figures would be handy, but the point is well made.

>> So, if most stylesheets are US-ASCII then a default of UTF-8 would
>> work pretty well.

BZ> Yeah, as long as you stick to US sites....

No, I was not making that assumption, nor would I consider that
limitation to be at all suitable.

BZ> since treating ISO-8859-1 or
BZ> Shift_JIS as UTF-8 will at best lead to recoverable decoding errors (and at
BZ> worst to irrecoverable ones, depending on what your decoder looks like). Note
BZ> that attempting recovery from decoding errors has security issues, so I can
BZ> perfectly well understand people not trying to do that.

Could such security issues not also be triggered by referencing such a
stylesheet from a page whose encoding, if applied to the stylesheet,
would trigger the error?
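To make the decoding point concrete, here is a minimal Python sketch. It is only an illustration: the sample stylesheets, the names in the comments, and the helper `try_utf8` are all hypothetical, and Python's built-in codecs merely stand in for whatever decoder a user agent actually uses. It shows that a US-ASCII sheet decodes cleanly as UTF-8, while ISO-8859-1 or Shift_JIS sheets with non-ASCII comments fail a strict UTF-8 decode.

```python
# Hypothetical sample stylesheets: pure-ASCII CSS syntax, with or without
# a non-ASCII comment. None of these come from a real site.
ASCII_SHEET = "body { color: black }\n"
LATIN1_SHEET = "/* Copyright José Müller */\nbody { color: black }\n"
SJIS_SHEET = "/* スタイルシート */\nbody { color: black }\n"


def try_utf8(data: bytes) -> str:
    """Strictly decode bytes as UTF-8; report 'ok' or where it failed."""
    try:
        data.decode("utf-8")  # errors="strict" is the default
        return "ok"
    except UnicodeDecodeError as exc:
        return f"error at byte {exc.start}"


results = {
    "us-ascii": try_utf8(ASCII_SHEET.encode("ascii")),
    "iso-8859-1": try_utf8(LATIN1_SHEET.encode("iso-8859-1")),
    "shift_jis": try_utf8(SJIS_SHEET.encode("shift_jis")),
}

# "Recoverable" decoding: bad bytes become U+FFFD, so the comment text is
# silently damaged while the ASCII rule below it survives.
recovered = LATIN1_SHEET.encode("iso-8859-1").decode("utf-8", errors="replace")

for label, outcome in results.items():
    print(f"{label}: {outcome}")
```

In this sketch only the US-ASCII sheet reports "ok". The `errors="replace"` line corresponds to the "recoverable" case: decoding continues, but the non-ASCII characters are lost.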
To clarify: the situation I would like to see is that all stylesheets
declare what encoding they are in, preferably using an @charset rule, so
that authoring tools, which know this information, can reliably pass it
on in the stylesheets they write. If there are multiple sources of
information, then they should all say the same thing. The limitations of
text/* media types mean that application/css would be a better bet in
terms of consistent decoding without guesswork.

-- 
Chris Lilley                    mailto:chris@w3.org
Chair, W3C SVG Working Group
Member, W3C Technical Architecture Group
Received on Wednesday, 18 February 2004 18:29:11 UTC