- From: Ian Hickson via cvs-syncmail <cvsmail@w3.org>
- Date: Fri, 23 Oct 2009 03:00:32 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/spec
In directory hutz:/tmp/cvs-serv7773

Modified Files:
    Overview.html
Log Message:
discourage use of HZ-GB-2312; explain why. (whatwg r4282)

Index: Overview.html
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.3422
retrieving revision 1.3423
diff -u -d -r1.3422 -r1.3423
--- Overview.html 23 Oct 2009 02:21:19 -0000 1.3422
+++ Overview.html 23 Oct 2009 03:00:29 -0000 1.3423
@@ -10425,12 +10425,13 @@
 <a href="#attr-meta-http-equiv-content-type" title="attr-meta-http-equiv-content-type">Encoding declaration state</a>, then the
 character encoding used must be an <a href="#ascii-compatible-character-encoding">ASCII-compatible character
 encoding</a>.<p>Authors should not use JIS-X-0208 <!-- x-JIS0208 -->
- (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), encodings based on
- ISO-2022<!-- http://krijnhoetmer.nl/irc-logs/whatwg/20090628#l-422
- -->, and encodings based on EBCDIC. Authors should not use
- UTF-32. Authors must not use the CESU-8, UTF-7, BOCU-1 and SCSU
- encodings.
+ (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), HZ-GB-2312<!-- has
+ crazy handling of ASCII "~" -->, encodings based on ISO-2022<!--
+ http://krijnhoetmer.nl/irc-logs/whatwg/20090628#l-422 -->, and
+ encodings based on EBCDIC. Authors should not use UTF-32.
+ Authors must not use the CESU-8, UTF-7, BOCU-1 and SCSU encodings.
 <a href="#refsRFC1345">[RFC1345]</a><!-- for the JIS types -->
+ <a href="#refsRFC1842">[RFC1842]</a><!-- HZ-GB-2312 -->
 <a href="#refsRFC1468">[RFC1468]</a><!-- ISO-2022-JP -->
 <a href="#refsRFC2237">[RFC2237]</a><!-- ISO-2022-JP-1 -->
 <a href="#refsRFC1554">[RFC1554]</a><!-- ISO-2022-JP-2 -->
@@ -10442,8 +10443,16 @@
 <a href="#refsBOCU1">[BOCU1]</a>
 <a href="#refsSCSU">[SCSU]</a>
 <!-- no idea what to reference for EBCDIC, so... -->
- <p>Authors are encouraged to use UTF-8. Conformance checkers may
- advise against authors using legacy encodings.<div class="impl">
+ <p class="note">Most of these encodings are discouraged because of
+ security concerns. If a hostile user can contribute text to a site
+ using these encodings, bugs in the site's whitelisting filter or in
+ a user agent can easily lead to the filter interpreting the
+ contribution as "safe" while the user agent interprets the same
+ contribution as containing a <code><a href="#script">script</a></code> element. This would
+ enable cross-site scripting attacks. By avoiding these encodings,
+ and always providing a <a href="#character-encoding-declaration">character encoding declaration</a>,
+ an author is less likely to run into this kind of problem.<p>Authors are encouraged to use UTF-8. Conformance checkers may
+ advise authors against using legacy encodings.<div class="impl">

 <p>Authoring tools should default to using UTF-8 for newly-created
 documents.</p>
@@ -71071,6 +71080,13 @@
 Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park.
 IETF, December 1993.</dd>

+ <dt id="refsRFC1842">[RFC1842]</dt>
+
+ <dd><cite><a href="http://www.ietf.org/rfc/rfc1842.txt">ASCII
+ Printable Characters-Based Chinese Character Encoding for Internet
+ Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang.
+ IETF, August 1995.</dd>
+
 <dt id="refsRFC1922">[RFC1922]</dt>

 <dd><cite><a href="http://www.ietf.org/rfc/rfc1922.txt">Chinese Character Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao,
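
The note added in this commit describes a filter and a user agent disagreeing about the same bytes. A minimal sketch of how the "~" handling in HZ-GB-2312 can produce that disagreement, using Python's built-in "hz" codec as a stand-in for the user agent's decoder. The filter function `naive_filter_allows` and the payload are hypothetical illustrations, not taken from the spec or from any real filter:

```python
# Sketch only: naive_filter_allows is a hypothetical whitelisting filter that
# inspects the raw bytes as if they were plain ASCII.
def naive_filter_allows(raw: bytes) -> bool:
    """Reject input that contains a literal '<script' byte sequence."""
    return b"<script" not in raw.lower()


# In HZ, "~" followed by a newline is a line continuation that the decoder
# drops, and "~~" decodes to a single "~".
payload = b"<scr~\nipt>alert(1)</scr~\nipt>"

allowed = naive_filter_allows(payload)  # the filter sees no "<script"
decoded = payload.decode("hz")          # what an HZ-aware consumer sees

print(allowed)                 # True
print(decoded)                 # <script>alert(1)</script>
print(b"a~~b".decode("hz"))    # a~b
```

The filter and the decoder are both behaving as designed here; the mismatch comes from the encoding giving an ASCII byte a second meaning, which is the kind of problem the note says authors avoid by not using these encodings and by always declaring the character encoding.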
Received on Friday, 23 October 2009 03:00:35 UTC