Re: Handling unrecognized or unsupported charset

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Thu, 15 Jul 2004 13:45:53 -0500
Message-ID: <40F6D0E1.4070302@mit.edu>
To: Mark Moore <mark.moore@notlimited.com>
Cc: www-style@w3.org

Mark Moore wrote:
> A UA that doesn't understand the Greek charset (ISO-8859-7) will find the
> style sheet perfectly syntactically correct.  It will be able to parse the
> sheet

No, it will not.  It will not even be able to tokenize the sheet.  The step 
right before tokenization is to convert the sheet to Unicode and then work with 
the Unicode character stream, not the byte stream.  If the conversion to Unicode 
cannot be performed, tokenization cannot even start.
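
To make the ordering concrete, here is a rough Python sketch, not any real
UA's code.  The SUPPORTED set, UnsupportedCharset, to_unicode, tokenize and
load_sheet are all made-up names for illustration, and the "tokenizer" is a
whitespace-splitting stand-in.  The point is only that tokenize() takes
characters, so it is never reached when the bytes cannot be decoded.

```python
# Hypothetical set of charsets this imaginary UA can decode.
# ISO-8859-7 is deliberately left out to model the case under discussion.
SUPPORTED = {"utf-8", "iso-8859-1"}


class UnsupportedCharset(Exception):
    """Raised when the UA has no decoder for the declared charset."""


def to_unicode(raw: bytes, charset: str) -> str:
    # Byte stream -> Unicode character stream.
    name = charset.lower()
    if name not in SUPPORTED:
        raise UnsupportedCharset(charset)
    return raw.decode(name)


def tokenize(text: str):
    # Stand-in for a real CSS tokenizer: it only ever sees characters,
    # never bytes.
    return text.split()


def load_sheet(raw: bytes, charset: str):
    text = to_unicode(raw, charset)  # must succeed before anything else
    return tokenize(text)
```

With these assumptions, load_sheet(raw, "ISO-8859-7") raises in to_unicode()
and tokenize() is never called.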

The only way to deal with this, short of discarding the sheet, is to assume
some other charset and use that: say, take the charset from the next step of
the charset selection algorithm.
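
A minimal sketch of that fallback, again in Python with hypothetical names
(SUPPORTED, decode_with_fallback).  The candidates list stands for whatever
the charset selection algorithm yields, in order; its contents here are
illustrative, not taken from the spec.

```python
# Hypothetical set of charsets this imaginary UA can decode.
SUPPORTED = {"utf-8", "iso-8859-1"}


def decode_with_fallback(raw: bytes, candidates):
    """Decode with the first candidate charset the UA supports.

    Returns (text, charset_used), or None if no candidate is supported,
    in which case the caller would have to discard the sheet.
    """
    for charset in candidates:
        name = charset.lower()
        if name not in SUPPORTED:
            continue  # unsupported: move to the next step of the algorithm
        return raw.decode(name), name
    return None
```

For example, decode_with_fallback(raw, ["ISO-8859-7", "UTF-8"]) skips the
unsupported first entry and decodes with the second.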

> In this case, the @charset rule should be considered invalid, and the UA
> should continue parsing immediately after the terminating semicolon (or
> block) as described in section 4.1.5. [2]

This is not specified anywhere in the spec.  Are you suggesting that it be 
specified?

-Boris
Received on Thursday, 15 July 2004 14:46:21 GMT