RE: [CSS21] BOM & @charset (issues 44 & 115)

> [Original Message]
> From: Bert Bos <bert@w3.org>
>
> As I wrote in the response to issues 115 and 44, CSS 2.1 will allow a
> BOM (Byte Order Mark) to occur in an external style sheet and UAs can
> use it (1) to determine the byte ordering if they know the encoding but
> not the byte order, and (2) determine the encoding itself, if there is
> no more authoritative source for it (in particular HTTP headers).
>
> So the list of places to look for encoding info was as follows:
>
>    1. HTTP header
>    2. @charset
>    3. BOM
>    4. etc.
>
> But some people pointed out that the BOM, if present, comes before the
> @charset, so in fact you always have to check it first. It seems
> therefore, that the order of (2) and (3) in the list doesn't matter.
> And thus, we want to change it to:
>
>    1. HTTP header
>    2. BOM
>    3. @charset
>    4. etc.
>
> But this is complicated material, so: does anybody see a problem with
> this? There doesn't seem to be an encoding in which the "@" looks like
> the BOM of some other encoding. Did we overlook anything?

What about CESU-8 (from UTR #26)?  It shares the same BOM as UTF-8,
so only the HTTP header or the @charset rule can distinguish them.
(UTR #26 explicitly bars attempting to determine that the encoding is
CESU-8 by auto detection.)
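
To make the concern concrete, here is a rough Python sketch of the
proposed lookup order (HTTP header, then BOM, then @charset) with one
extra check for the CESU-8 case. The function name sniff_encoding, the
BOM table, and the simplified @charset pattern are my own assumptions
for illustration, not anything taken from the CSS 2.1 draft, and the
CESU-8 branch shows one possible handling, not agreed behaviour.

```python
import codecs
import re

# BOM signatures. UTF-32LE is listed before UTF-16LE because its BOM
# begins with the same two bytes (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]

# Simplified pattern: only matches an @charset rule written in an
# ASCII-compatible encoding, at the very start of the sheet.
_CHARSET_RULE = re.compile(rb'@charset "([^"]*)";')


def sniff_encoding(data, http_charset=None):
    """Return an encoding label for a style sheet, or None if unknown."""
    # 1. The HTTP header is the most authoritative source.
    if http_charset:
        return http_charset

    # 2. Look for a BOM, which always precedes any @charset rule.
    bom_encoding = None
    body = data
    for bom, name in _BOMS:
        if data.startswith(bom):
            bom_encoding = name
            body = data[len(bom):]
            break

    # 3. Look for an @charset rule right after the BOM (if any).
    match = _CHARSET_RULE.match(body)
    declared = match.group(1).decode("ascii", "replace") if match else None

    if bom_encoding:
        # The UTF-8 BOM is also the CESU-8 BOM, so the BOM alone cannot
        # tell them apart; here an explicit label is allowed to pick CESU-8.
        if bom_encoding == "UTF-8" and declared and declared.upper() == "CESU-8":
            return "CESU-8"
        return bom_encoding

    # 4. No BOM: fall back to @charset; further fallbacks are omitted.
    return declared
```

With this sketch, a sheet that starts with EF BB BF and then
@charset "CESU-8"; is reported as CESU-8, while the same BOM without
the rule is reported as UTF-8. If the BOM were strictly allowed to
override @charset, the first case could never be labelled CESU-8
except through the HTTP header.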

Received on Tuesday, 17 February 2004 21:41:26 UTC