# Re: CSS validation of double semicolons

From: Sampo Syreeni <decoy@iki.fi>
Date: Mon, 18 Jul 2022 13:35:16 +0300 (EEST)
To: David Hill <DHill@StudentLoan.org>

Message-ID: <alpine.DEB.2.21.2207181244590.14778@lakka.kapsi.fi>

On 2022-07-15, David Hill wrote:

> Some build tools will fail to process css with double semicolons such as this
>
> .foo {
>     margin-top: 20px;;
> }

Those tools are then stupid. The double semicolon is by the book; the
CSS 2.1 grammar makes the declaration between semicolons optional:
https://www.w3.org/TR/CSS21/grammar.html

However, people do build their parsers rather into the book, and then
into the other book. Or codebase. They tend to factor their parsers into
Bison and Flex, or whatnot.

One of the things which cannot be handled right here is \epsilon. That
thingy which takes just space, between them semi-colons. It's a
grammatical null, and productive at that. It's the thing happening
besides the two semicolons, here.

Were it to have a semantic meaning...for heaven's sake please do not...it
could be parsed as LALR(1) in text. And it could be parsed well enough
in parsing expressions, efficiently, no matter how many white spaces
there were between.
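
A minimal sketch of what tolerating that empty slot looks like, assuming
the CSS 2.1 shape `declaration? [ ';' S* declaration? ]*` for the inside
of a rule block. The function name `parse_declarations` is hypothetical
and not taken from any real tool, and splitting on `;` is a deliberate
simplification: it ignores strings, `url(...)` and comments, which a real
tokenizer would have to handle.

```python
def parse_declarations(block):
    """Parse the text between `{` and `}` of one rule.

    Every slot between semicolons may be empty (the epsilon alternative),
    so `margin-top: 20px;;` yields exactly one declaration and no error.
    Simplification: semicolons inside strings or url() are not handled.
    """
    declarations = []
    for slot in block.split(";"):
        slot = slot.strip()
        if not slot:
            # Empty declaration: nothing but optional whitespace
            # between two semicolons. Valid, contributes nothing.
            continue
        prop, colon, value = slot.partition(":")
        prop, value = prop.strip(), value.strip()
        if not colon or not prop or not value:
            raise ValueError("malformed declaration: %r" % slot)
        declarations.append((prop, value))
    return declarations


if __name__ == "__main__":
    print(parse_declarations("margin-top: 20px;;"))
```

However much white space sits between the two semicolons, the empty slot
is skipped in one pass, with no backtracking.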

> On several occasions over the years all local tests pass and we deploy
> to QA to find that the minified CSS is missing.

So do develop a formal grammar of CSS. A parser for it. I for one once
did that towards PICS, and found a *singular* mistake in the vocabulary.

> We plug the CSS into the validator and it comes back all green, no
> warnings, no errors, perfect CSS, so we go looking for other issues
> and it takes much longer than it might to find that the CSS is in fact
> the issue.

language/CSS.

Because once you do that, your formal language *does* bite you in the
ass where it should. And I'll tell you how it does.

Once, when I really did formalize PICS as a format, I bumped into the
fact that the language wasn't parsable as LALR(n). Seriously, not even
as LALR(1) as C or most languages are, but not even as LALR(n). And that
was because of *one* *singular* *stupid* mistake, in the *most*
vestigial part of the language. Basically, in its exception; as a data
language. Some piece of shit nobody had put in *just* the right
productions so as to make the language LALR(18) instead of LALR(1). In
one simple, stupid production.

How did I come privy to that? Well, because I had to code over the Bison
warnings, one at a time. Eventually I had to goddamn-fucken put a lot of
state back into the stack, simply because of backtracking. The stack
didn't exactly help me do what I wanted to do. If it wasn't a LALR(1)
stack, but a true Earley or CYK-stack, why not. But since it never is;;
fuck this shit.