Re: Two new encoding related articles for review

On Fri, Mar 14, 2014 at 10:42 PM, Glenn Adams <glenn@skynav.com> wrote:
> On Fri, Mar 14, 2014 at 10:13 AM, Zack Weinberg <zackw@panix.com> wrote:
...
>> Furthermore, UTF-32, UTF-16, JIS_C6226-1983, JIS_X0212-1990,
>> HZ-GB-2312, JOHAB (Windows code page 1361), CESU-8, UTF-7, BOCU-1,
>> SCSU, ISO-2022 (all varieties), and EBCDIC (all varieties) MUST NOT be
>> used.  These encodings are *ASCII-incompatible* -- that is, in these
>> encodings, octets with values 00 through 7F (hexadecimal) are not
>> always interpreted as Unicode code points U+0000 through U+007F.  This
>> has historically been a source of security vulnerabilities.
>
> It seems strange for a guideline to say "MUST NOT". I would suggest SHOULD
> NOT is more appropriate. In any case, we shouldn't be in the business of
> telling content authors what they can or can't do. If they want to use an
> encoding that isn't well supported, then the risk is theirs.

You can tell I'm used to writing normative specs, huh?  How's this instead?

"UTF-32, UTF-16, (etcetera) are especially unlikely to work: HTML5 and
the Encoding Standard forbid Web clients from accepting most of them.
(These encodings are *ASCII-incompatible* -- octets with values 00
through 7F (hexadecimal) do not always encode U+0000 through U+007F --
which has historically been a source of security vulnerabilities.)"
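
To make the parenthetical concrete, here is a minimal Python sketch of
what "ASCII-incompatible" means in practice. It is only an illustration:
the helper name ascii_transparent is made up for this example, and it
checks a single sample of octets rather than proving anything about an
encoding in general. The sample uses only octets in the range 00 through
7F, yet a UTF-7 decoder turns it into markup.

```python
def ascii_transparent(octets: bytes, encoding: str) -> bool:
    """Return True if `octets` (all in 00..7F) decode under `encoding`
    to the same code points that plain ASCII would give.

    Hypothetical helper for illustration; it tests one sample only.
    """
    expected = octets.decode("ascii")
    try:
        return octets.decode(encoding) == expected
    except UnicodeDecodeError:
        return False


# Every octet here is in the ASCII range, so an ASCII-based filter
# looking for "<" (octet 3C) would find nothing to object to.
sample = b"+ADw-script+AD4-"

for enc in ("utf-8", "windows-1252", "utf-7", "utf-16-le"):
    print(enc, ascii_transparent(sample, enc))

# Under UTF-7 the same octets decode to an opening script tag.
print(sample.decode("utf-7"))

# Under UTF-16LE, the two octets 41 42 form one code unit, U+4241,
# instead of the two characters "A" and "B".
print(hex(ord(b"AB".decode("utf-16-le"))))
```

A filter that inspects the raw octets as ASCII and a client that decodes
them as UTF-7 or UTF-16 disagree about what the document contains, which
is the kind of mismatch behind the vulnerabilities mentioned above.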

Received on Saturday, 15 March 2014 03:16:02 UTC