- From: Zack Weinberg <zackw@panix.com>
- Date: Fri, 14 Mar 2014 23:15:36 -0400
- To: Glenn Adams <glenn@skynav.com>
- Cc: Richard Ishida <ishida@w3.org>, www International <www-international@w3.org>, W3C Style <www-style@w3.org>, "HTML WG (public-html@w3.org)" <public-html@w3.org>
On Fri, Mar 14, 2014 at 10:42 PM, Glenn Adams <glenn@skynav.com> wrote:
> On Fri, Mar 14, 2014 at 10:13 AM, Zack Weinberg <zackw@panix.com> wrote:
...
>> Furthermore, UTF-32, UTF-16, JIS_C6226-1983, JIS_X0212-1990,
>> HZ-GB-2312, JOHAB (Windows code page 1361), CESU-8, UTF-7, BOCU-1,
>> SCSU, ISO-2022 (all varieties), and EBCDIC (all varieties) MUST NOT be
>> used. These encodings are *ASCII-incompatible* -- that is, in these
>> encodings, octets with values 00 through 7F (hexadecimal) are not
>> always interpreted as Unicode code points U+0000 through U+007F. This
>> has historically been a source of security vulnerabilities.
>
> It seems strange for a guideline to say "MUST NOT". I would suggest SHOULD
> NOT is more appropriate. In any case, we shouldn't be in the business of
> telling content authors what they can or can't do. If they want to use an
> encoding that isn't well supported, then the risk is theirs.

You can tell I'm used to writing normative specs, huh? How's this instead?

"UTF-32, UTF-16, (etcetera) are especially unlikely to work: HTML5 and
the Encoding Standard forbid Web clients from accepting most of them.
(These encodings are *ASCII-incompatible* -- octets with values 00
through 7F (hexadecimal) do not always encode U+0000 through U+007F --
which has historically been a source of security vulnerabilities.)"
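
To make "ASCII-incompatible" concrete, here is a minimal sketch using Python's built-in codecs. The helper name decodes_like_ascii and the sample payload are illustrative assumptions, not part of the proposed guideline text. It shows a run of octets, all in the 00 through 7F range, that a UTF-7 decoder turns into markup that an ASCII-only filter would never see.

```python
def decodes_like_ascii(data: bytes, encoding: str) -> bool:
    """Return True if `data` (octets 00-7F only) decodes under `encoding`
    to the same text that an ASCII decoder would produce.

    Hypothetical helper for illustration only.
    """
    as_ascii = data.decode("ascii")  # raises if data has octets above 7F
    try:
        return data.decode(encoding) == as_ascii
    except UnicodeDecodeError:
        return False


# Every octet here is in the 00-7F range, and none is a literal "<" or ">".
payload = b"+ADw-script+AD4-"

print(decodes_like_ascii(payload, "utf-8"))    # True: UTF-8 is ASCII-compatible
print(decodes_like_ascii(payload, "cp1252"))   # True: so is windows-1252
print(decodes_like_ascii(payload, "utf-7"))    # False
print(payload.decode("utf-7"))                 # <script>

# UTF-16 reads two octets per code unit, so ASCII octets mean something else.
print(decodes_like_ascii(b"hi", "utf-16-le"))  # False
```

A filter that scans the raw octets for "<" would pass this payload untouched, yet a client that decodes it as UTF-7 sees a script tag. That mismatch is the kind of vulnerability the parenthetical refers to.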
Received on Saturday, 15 March 2014 03:16:05 UTC