
Re: [css-syntax] ISSUE-329: @charset has no effect on stylesheet??

From: Henri Sivonen <hsivonen@hsivonen.fi>
Date: Mon, 27 Jan 2014 10:20:35 +0200
Message-ID: <CANXqsR+0xdXT7GGsRT-1eqtB-PQNazCBRPqmjfGQYdX=Dm0EyA@mail.gmail.com>
To: "Phillips, Addison" <addison@lab126.com>
Cc: Anne van Kesteren <annevk@annevk.nl>, Richard Ishida <ishida@w3.org>, "Tab Atkins Jr." <jackalmage@gmail.com>, Zack Weinberg <zackw@panix.com>, www-style list <www-style@w3.org>, www International <www-international@w3.org>
On Wed, Jan 22, 2014 at 10:25 PM, Phillips, Addison <addison@lab126.com> wrote:
> The issue here is that CSS's normal syntax allows different whitespace and quote
> formation within the actual CSS of the file.

Note that the byte-level discovery of the <meta>-based internal
encoding declaration in the HTML parser doesn't obey the same rules as
the main character-level parsing phase, either.

> Since the byte munging involved is not remarkably difficult to describe or
> implement and since it will improve the likelihood that people "get it right"
> (let alone not breaking existing stylesheets that somehow get it wrong), why
> not specify @charset consistently with the rest of CSS? I'm fine with saying
> it has to come first, etc. for the reasons you cite.

It's a terribly bad idea to define an internal character encoding
declaration syntax in such a way that the syntax definition doesn't
guarantee the syntax to fit within a string of bytes shorter than N
bytes with a small value for N. For this reason, it's a bad idea to
allow an arbitrary number of whitespace characters between '@charset'
and the quote. Unfortunately, CSS still fails at making the length of
the declaration bounded, because "get an encoding" trims white space.
Gecko imposes a bound on the length anyway.
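
To illustrate the point about bounded length, here is a minimal sketch of a
byte-level check. It assumes a strict byte pattern (`@charset "`, then the
label, then `";`) and a 1024-byte sniffing window. The function name, the
constant names, and the 1024 limit are illustrative assumptions; this is not
Gecko's code, and the real bound Gecko uses is not stated in this message.

```python
# Sketch only: strict, bounded, byte-level sniffing of an @charset declaration.
# PREFIX, MAX_SNIFF and sniff_charset_label are hypothetical names.

PREFIX = b'@charset "'      # exactly one space, double quote only
MAX_SNIFF = 1024            # assumed bound on how many bytes are inspected
ASCII_WHITESPACE = "\t\n\x0c\r "


def sniff_charset_label(data: bytes, limit: int = MAX_SNIFF):
    """Return the normalized label from a leading @charset, or None."""
    head = data[:limit]                 # never look past the bound
    if not head.startswith(PREFIX):
        return None                     # extra spaces, single quotes, etc.
    end = head.find(b'"', len(PREFIX))  # closing quote must be inside the window
    if end == -1 or head[end + 1:end + 2] != b";":
        return None
    raw = head[len(PREFIX):end]
    try:
        label = raw.decode("ascii")
    except UnicodeDecodeError:
        return None
    # Trimming whitespace here is what makes the declaration length
    # unbounded unless the caller imposes a limit, as above.
    return label.strip(ASCII_WHITESPACE).lower()
```

With this sketch, `@charset "utf-8";` at the very start of the stream yields
`utf-8`, while a declaration with two spaces, single quotes, or a label padded
past the window yields nothing, so a different source of encoding information
has to be used.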

(Yes, the previous paragraph means that the HTML and XML internal
encoding declaration syntaxes are terribly bad ideas, too.)

Anyway, at this point, absent data indicating that compatibility with
existing content would improve, we shouldn't change this aspect of CSS
for theoretical purity.

On Thu, Jan 23, 2014 at 8:18 PM, Tab Atkins Jr. <jackalmage@gmail.com> wrote:
> There is no reason to create a new stylesheet in any encoding other
> than utf-8.  We need to get out of the trap of thinking that encodings
> are in any way valuable.  They're a legacy pain, and we've fixed the
> situation in practice by standardizing on a single encoding.

YES! I really hope the i18n group keeps this in mind when writing its guides.

> 1. Nobody should be using @charset in the first place. We only retain
> it for legacy purposes, and new stylesheets should just be done in
> utf-8.
> 2. There is a realistic concern that we're already under legacy
> constraints to not loosen the syntax.
> 3. CSS parsing allows for *far* more variation than just "more spaces
> and either type of quote".
> 4. UAs are very unlikely to implement the full flexibility of CSS
> parsing just for encoding detection.
> 5. If we specify only a subset of allowed variation, the original goal
> of making encoding detection aligned with valid @charset rules is
> still not satisfied.
>
> For all these reasons, I strongly reject any proposal to change the
> current specification regarding the strictness of the encoding
> declaration syntax.

Agreed.

P.S. At least we fixed
http://w3cmemes.tumblr.com/post/35332222321/css-2-1-syndata-is-awesome
. Where were i18n comments when *that* gem was the spec?

-- 
Henri Sivonen
hsivonen@hsivonen.fi
https://hsivonen.fi/
Received on Monday, 27 January 2014 08:21:06 UTC
