Re: UTF-8 signature / BOM in CSS

François Yergeau wrote to <mailto:www-international@w3.org>, 
<mailto:w3c-css-wg@w3.org>, <mailto:w3c-i18n-ig@w3.org>, and 
<mailto:www-style@w3.org> on 6 December 2003 in "Re: UTF-8 signature / 
BOM in CSS" (<mid:3FD23453.6000009@yergeau.com>):

> [...] another way is to consider [the BOM] a character and to bring it 
> squarely in the grammar of a language, like I proposed recently for 
> CSS:
>
>  EncodingDecl = [BOM][@charset=<foobar>]
>
> with the additional constraint that EncodingDecl must occur at the 
> start of the stylesheet.

Is the BOM to be considered an identifier character? That's possible. 
Then an identifier consisting solely of one U+FEFF would be allowed at 
the beginning of a style sheet. But the codepoint U+FEFF could just as 
well be tokenized as its own type and grouped with "S" (space tokens) 
and comments as a separator of other tokens. The latter approach is 
not backwards compatible in a formal sense, because the current grammar 
already admits U+FEFF as a non-ASCII identifier character, but how many 
existing Cascading Style Sheets make use of U+FEFF in identifiers? About 
zero, I'd guess.
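
To make the second option concrete, here is a rough sketch in Python of 
a scanner that gives U+FEFF its own token type and then discards it 
along with "S" and comments. The token names BOM and DELIM, the 
SEPARATORS set, and the tokenize() function are my own inventions for 
illustration, and the patterns are a heavily simplified stand-in for 
the real CSS lexical grammar (no escapes, strings, numbers, or 
at-keywords).

```python
import re

# Non-ASCII identifier characters, with U+FEFF deliberately carved out.
# This range is an assumption made for the sketch, not the CSS grammar.
NONASCII = r"\u0080-\ufefe\uff00-\U0010ffff"

TOKEN_RE = re.compile(
    r"(?P<BOM>\ufeff)"                    # U+FEFF as its own token type
    r"|(?P<S>[ \t\r\n\f]+)"               # white space
    r"|(?P<COMMENT>/\*.*?\*/)"            # /* ... */
    r"|(?P<IDENT>[A-Za-z_" + NONASCII + r"][-A-Za-z0-9_" + NONASCII + r"]*)"
    r"|(?P<DELIM>.)",                     # any other single character
    re.DOTALL,
)

# Token types that only separate other tokens and are otherwise ignored.
SEPARATORS = {"BOM", "S", "COMMENT"}


def tokenize(css):
    """Yield (type, text) pairs, dropping separator tokens."""
    for match in TOKEN_RE.finditer(css):
        kind = match.lastgroup
        if kind not in SEPARATORS:
            yield kind, match.group()


if __name__ == "__main__":
    # A leading BOM and a stray U+FEFF between two identifiers.
    for token in tokenize("\ufeffa\ufeffb { }"):
        print(token)
```

Under this scheme a U+FEFF anywhere in the style sheet, not only at the 
start, splits the surrounding text the way white space would, so 
"a<U+FEFF>b" scans as two identifiers rather than one.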

-- 
Etan Wexler.

Received on Saturday, 6 December 2003 22:07:34 UTC