Re: UTF-8 signature / BOM in CSS

On Saturday 2003-12-06 19:08 -0800, Etan Wexler wrote:
> To: François Yergeau <francois@yergeau.com>,
>         Chris Lilley <chris@w3.org>, David Baron <dbaron@dbaron.org>,
>         www-international@w3.org, w3c-css-wg@w3.org, w3c-i18n-ig@w3.org,
>         www-style@w3.org

Any chance of trimming the list of recipients a bit?  Once the
w3c-css-wg message gets through moderation (or whatever's holding it
up), I'll have received 4 copies of this message.

> François Yergeau wrote to <mailto:www-international@w3.org>, 
> <mailto:w3c-css-wg@w3.org>, <mailto:w3c-i18n-ig@w3.org>, and 
> <mailto:www-style@w3.org> on 6 December 2003 in "Re: UTF-8 signature / 
> BOM in CSS" (<mid:3FD23453.6000009@yergeau.com>):
> 
> >[...] another way is to consider [the BOM] a character and to bring it 
> >squarely in the grammar of a language, like I proposed recently for 
> >CSS:
> >
> > EncodingDecl = [BOM][@charset=<foobar>]
> >
> >with the additional constraint that EncodingDecl must occur at the 
> >start of the stylesheet.

I think the main advantage of such a change would be clarity.  (Or is
there some other advantage you were thinking of?)  I agree that it makes
it clearer that the BOM is allowed, but it might make it less clear that
the processing of the encoding declaration is an entirely separate
process from the tokenization and parsing of the stylesheet.  Then
again, the latter is probably the easier of the two to emphasize in other
ways.
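
To illustrate the separation I mean, here is a rough Python sketch, not
anything taken from the spec.  The names sniff_encoding and
decode_stylesheet are hypothetical, the UTF-8 fallback is an assumption,
and only the UTF-8 signature and a literal @charset rule are handled.
The encoding is worked out from the raw bytes first, and only the decoded
text would then go to a tokenizer.

```python
import codecs
import re

# Simplified assumption: the rule must appear byte-for-byte in this form.
_CHARSET_RE = re.compile(rb'@charset "([^"]*)";')


def sniff_encoding(data, default="utf-8"):
    """Return (encoding, bom_length) using only the raw bytes."""
    if data.startswith(codecs.BOM_UTF8):
        # A UTF-8 signature settles the encoding; skip its 3 bytes.
        return "utf-8", len(codecs.BOM_UTF8)
    match = _CHARSET_RE.match(data)
    if match:
        return match.group(1).decode("ascii", "replace"), 0
    return default, 0


def decode_stylesheet(data):
    """Decode bytes to text; the BOM never reaches the tokenizer."""
    encoding, bom_length = sniff_encoding(data)
    return data[bom_length:].decode(encoding)
```

In this sketch the tokenizer never sees the BOM at all, so the grammar
would not need to mention it.  Any @charset rule is left in the decoded
text for the parser to deal with.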

> Is the BOM to be considered an identifier character? That's possible. 

I don't think it matters much, since the formal grammar would not be too
complicated either way.  It might be better to make the
definition of CSS identifiers use character classes as that of XML
identifiers does, but I think that's really an orthogonal question.
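
As a rough illustration of the difference, in Python: both predicates
below are hypothetical simplifications and not the actual CSS or XML
productions.  The first accepts any character at or above a code point
threshold (the 0xA0 cutoff is an assumption), the second accepts
characters by Unicode general category.

```python
import unicodedata

BOM = "\ufeff"


def is_name_char_by_range(ch):
    """Range-style rule: ASCII name characters, or any high code point."""
    if ch in "_-":
        return True
    if ord(ch) < 0x80:
        return ch.isalnum()
    return ord(ch) >= 0xA0  # assumed cutoff, for illustration only


def is_name_char_by_class(ch):
    """Class-style rule: letters, digits, and combining marks only."""
    if ch in "_-":
        return True
    category = unicodedata.category(ch)
    return category[0] in ("L", "N") or category in ("Mn", "Mc")
```

Under the range-style rule U+FEFF counts as a name character, while
under the class-style rule it does not, since its general category is
Cf (format).  An ordinary letter such as U+00E9 passes both.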

-David

-- 
L. David Baron                                <URL: http://dbaron.org/ >

Received on Saturday, 6 December 2003 22:40:29 UTC