Re: Limiting the size of the @charset byte sequence from Simon Sapin on 2014-01-28 (www-international@w3.org from January to March 2014)

From: Simon Sapin <simon.sapin@exyr.org>
Date: Tue, 28 Jan 2014 11:26:38 -0800
To: Bjoern Hoehrmann <derhoermi@gmx.net>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
CC: Henri Sivonen <hsivonen@hsivonen.fi>, www-style list <www-style@w3.org>, www International <www-international@w3.org>
Message-ID: <52E8046E.7040408@exyr.org>

On 28/01/2014 10:31, Bjoern Hoehrmann wrote:
> Arbitrary limits are bad design and often harder to implement correctly
> than something without arbitrary limits. The obvious parsing device for
> something like the `@charset` rule is a DFA, and a DFA that halts after
> N input bytes is a lot more complex than one that does not.

I disagree.

A DFA may be the first theoretical construct that comes to mind, and 
limiting the input length may be difficult to express strictly as a DFA. 
But there is no such constraint when writing actual code.

Python code, limiting the length explicitly:

https://github.com/SimonSapin/tinycss2/blob/41808c78ccee52c373db941067744c0d9fc4f0bb/tinycss2/bytes.py#L34
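
For those who would rather not follow the link, the idea can be sketched in a few lines. This is not the tinycss2 code itself: the name `charset_label` and the 1024-byte default are assumptions made up for illustration, and the sketch only looks for the literal `@charset "…";` byte pattern at the very start of the input.

```python
PREFIX = b'@charset "'
SUFFIX = b'";'
MAX_LENGTH = 1024  # assumed limit, chosen only for this illustration


def charset_label(css_bytes, max_length=MAX_LENGTH):
    """Return the label of a leading @charset rule, or None.

    Hypothetical helper: bytes past max_length are never examined.
    """
    # The explicit limit: slice once, then forget about the rest.
    head = css_bytes[:max_length]
    if not head.startswith(PREFIX):
        return None
    # The label runs up to the first double quote after the prefix.
    end = head.find(b'"', len(PREFIX))
    # That quote must exist within the limit and be followed by ';'.
    if end == -1 or head[end:end + 2] != SUFFIX:
        return None
    # latin1 maps every byte to a code point, so decoding cannot fail.
    return head[len(PREFIX):end].decode('latin1')
```

The limit costs a single slice. A rule whose closing `";` falls beyond the limit is simply treated as absent, with no extra state to track.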

C++ code, not overflowing the given buffer whose size is limited by the 
caller:

https://github.com/mozilla/gecko-dev/blob/7cdb98db06a0079793327801d91d0f5fd6697024/layout/style/Loader.cpp#L615

-- 
Simon Sapin

Received on Tuesday, 28 January 2014 19:27:14 UTC