Re: Handling unrecognized or unsupported charset

Mark Moore wrote:
> A UA that doesn't understand the Greek charset (ISO-8859-7) will find the
> style sheet perfectly syntactically correct.  It will be able to parse the
> sheet

No, it will not.  It will not even be able to tokenize the sheet.  The step 
right before tokenization is to convert the sheet to Unicode and then work with 
the Unicode character stream, not the byte stream.  If the conversion to Unicode 
cannot be performed, tokenization cannot even start.

The only way to deal with this, short of discarding the sheet, is to assume some 
other charset and use that.  Say, take the charset from the next step of the 
charset selection algorithm.
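
To make that concrete, here is a rough sketch in Python, not anything a real UA 
does.  The name decode_sheet and the candidate list are made up for illustration; 
the caller is assumed to pass the charsets in whatever order the selection 
algorithm produces them.  Python's codec registry stands in for the set of 
charsets the UA supports.

```python
import codecs

def decode_sheet(raw, candidates):
    """Decode raw bytes using the first candidate charset that is supported.

    Returns (text, charset), or (None, None) if no candidate is usable,
    in which case there is no character stream to tokenize at all.
    """
    for name in candidates:
        if not name:
            continue  # this step of the algorithm supplied no charset
        try:
            codecs.lookup(name)  # raises LookupError for an unknown charset
        except LookupError:
            continue  # unrecognized or unsupported: fall through to the next step
        # Undecodable bytes become U+FFFD here; that is a choice made for this sketch.
        return raw.decode(name, errors="replace"), name
    return None, None

# A decoder that lacks the first charset falls back to the next candidate.
raw = "α".encode("iso-8859-7")
text, used = decode_sheet(raw, ["x-no-such-charset", "iso-8859-7"])
```

Note that the tokenizer only ever sees the returned text.  If every candidate is 
unusable, it never runs.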

> In this case, the @charset rule should be considered invalid, and the UA
> should continue parsing immediately after the terminating semicolon (or
> block) as described in section 4.1.5. [2]

This is not specified anywhere in the spec.  Are you suggesting that it be 
specified?

-Boris

Received on Thursday, 15 July 2004 14:46:21 UTC