Yet more tinkering of the ASCII-compatible definition. Also, discourage ISO-2022-* due to the potential for XSS. (whatwg r3335) from poot on 2009-06-29 (public-html-diffs@w3.org from June 2009)

From: poot <cvsmail@w3.org>
Date: Mon, 29 Jun 2009 09:36:53 +0900 (JST)
To: public-html-diffs@w3.org
Message-Id: <20090629003653.640482BBF7@toro.w3.mag.keio.ac.jp>
Yet more tinkering of the ASCII-compatible definition. Also, discourage
ISO-2022-* due to the potential for XSS. (whatwg r3335)

http://dev.w3.org/cvsweb/html5/spec/Overview.html?r1=1.2474&r2=1.2475&f=h
http://html5.org/tools/web-apps-tracker?from=3334&to=3335

===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.2474
retrieving revision 1.2475
diff -u -d -r1.2474 -r1.2475
--- Overview.html 28 Jun 2009 11:28:59 -0000 1.2474
+++ Overview.html 29 Jun 2009 00:35:26 -0000 1.2475
@@ -167,7 +167,7 @@
    <h2 class="no-num no-toc" id="a-vocabulary-and-associated-apis-for-html-and-xhtml">A vocabulary and associated APIs for HTML and XHTML</h2>
    <!--ZZZ:-->
    <!--<h2 class="no-num no-toc">W3C Working Draft 23 April 2009</h2>-->
-   <h2 class="no-num no-toc" id="editor-s-draft-date-1-january-1970">Editor's Draft 28 June 2009</h2>
+   <h2 class="no-num no-toc" id="editor-s-draft-date-1-january-1970">Editor's Draft 29 June 2009</h2>
    <!--:ZZZ-->
    <dl><!-- ZZZ: update the month/day (twice), (un)comment out
     <dt>This Version:</dt>
@@ -260,7 +260,7 @@
   track.
   <!--ZZZ:-->
   <!--This specification is the 23 April 2009 Working Draft.-->
-  This specification is the 28 June 2009 Editor's Draft.
+  This specification is the 29 June 2009 Editor's Draft.
   <!--:ZZZ-->
   </p><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- relationship to other work (required) --><p>This specification is also being produced by the <a href="http://www.whatwg.org/">WHATWG</a>. The two specifications are
   identical from the table of contents onwards.</p><!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- context and rationale (required) --><p>This specification is intended to replace (be a new version of)
@@ -1535,17 +1535,18 @@
   interacting with external content intended for <a href="#plugin" title="plugin">plugins</a>. When third-party software is run with
   the same privileges as the user agent itself, vulnerabilities in the
   third-party software become as dangerous as those in the user
-  agent.<h4 id="character-encodings"><span class="secno">2.1.5 </span>Character encodings</h4><p>An <dfn id="ascii-compatible-character-encoding">ASCII-compatible character encoding</dfn> is one in which
-  bytes 0x09, 0x0A, 0x0C, 0x0D, 0x20 - 0x22, 0x26, 0x27, 0x2C - 0x3F,
-  0x41 - 0x5A, and 0x61 - 0x7A<!-- is that list ok? do any character
-  sets we want to support do things outside that range?  -->, ignoring
-  bytes that are the second and later bytes of multibyte sequences,
-  map to the same Unicode characters as those bytes in ANSI_X3.4-1968
-  (US-ASCII). <a href="#references">[RFC1345]</a><p class="note">This includes such exotic encodings as Shift_JIS and
+  agent.<h4 id="character-encodings"><span class="secno">2.1.5 </span>Character encodings</h4><p>An <dfn id="ascii-compatible-character-encoding">ASCII-compatible character encoding</dfn> is a
+  single-byte or variable-length encoding in which the bytes 0x09,
+  0x0A, 0x0C, 0x0D, 0x20 - 0x22, 0x26, 0x27, 0x2C - 0x3F, 0x41 - 0x5A,
+  and 0x61 - 0x7A<!-- is that list ok? do any character sets we want
+  to support do things outside that range?  -->, ignoring bytes that
+  are the second and later bytes of multibyte sequences, all
+  correspond to single-byte sequences that map to the same Unicode
+  characters as those bytes in ANSI_X3.4-1968 (US-ASCII). <a href="#references">[RFC1345]</a><p class="note">This includes such exotic encodings as Shift_JIS and
   variants of ISO-2022, even though it is possible for bytes like 0x70
   to be part of longer sequences that are unrelated to their
   interpretation as ASCII. It excludes such encodings as UTF-7,
-  UTF-16, HZ-GB-2312, GSM03.38, and EBCDIC variants.</p><!--
+  UTF-8+names, UTF-16, HZ-GB-2312, GSM03.38, and EBCDIC variants.</p><!--
    We'll have to change that if anyone comes up with a way to have a
    document that is valid as two different encodings at once, with
    different <meta charset> elements applying in each case.
@@ -9232,10 +9233,24 @@
   <code><a href="#meta">meta</a></code> element in the <a href="#attr-meta-http-equiv-content-type" title="attr-meta-http-equiv-content-type">Encoding declaration
   state</a>, then the character encoding used must be an
   <a href="#ascii-compatible-character-encoding">ASCII-compatible character encoding</a>.<p>Authors should not use JIS-X-0208 <!-- x-JIS0208 -->
-  (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), and encodings based
-  on EBCDIC. Authors should not use UTF-32. Authors must not use the
-  CESU-8, UTF-7, BOCU-1 and SCSU encodings. <a href="#references">[RFC1345]</a><!-- for the JIS types --> <a href="#references">[UTF32]</a> <a href="#references">[CESU8]</a> <a href="#references">[UTF7]</a> <a href="#references">[BOCU1]</a> <a href="#references">[SCSU]</a></p><!-- no idea what to reference for
-  EBCDIC, so... --><p>Authors are encouraged to use UTF-8. Conformance checkers may
+  (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), encodings based on
+  ISO-2022<!-- http://krijnhoetmer.nl/irc-logs/whatwg/20090628#l-422
+  -->, and encodings based on EBCDIC. Authors should not use
+  UTF-32. Authors must not use the CESU-8, UTF-7, BOCU-1 and SCSU
+  encodings.
+  <a href="#references">[RFC1345]</a><!-- for the JIS types -->
+  <a href="#references">[RFC1468]</a><!-- ISO-2022-JP -->
+  <a href="#references">[RFC2237]</a><!-- ISO-2022-JP-1 -->
+  <a href="#references">[RFC1554]</a><!-- ISO-2022-JP-2 -->
+  <a href="#references">[RFC1922]</a><!-- ISO-2022-CN and ISO-2022-CN-EXT -->
+  <a href="#references">[RFC1557]</a><!-- ISO-2022-KR -->
+  <a href="#references">[UTF32]</a>
+  <a href="#references">[CESU8]</a>
+  <a href="#references">[UTF7]</a>
+  <a href="#references">[BOCU1]</a>
+  <a href="#references">[SCSU]</a>
+  <!-- no idea what to reference for EBCDIC, so... -->
+  <p>Authors are encouraged to use UTF-8. Conformance checkers may
   advise against authors using legacy encodings.<p>Authoring tools should default to using UTF-8 for newly-created
   documents.<p>In XHTML, the XML declaration should be used for inline character
   encoding information, if necessary.<h4 id="the-style-element"><span class="secno">4.2.6 </span>The <dfn><code>style</code></dfn> element</h4><dl class="element"><dt>Categories</dt>
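
The revised "ASCII-compatible character encoding" wording in the 2.1.5 hunk above can be approximated with a small probe. The following is only a rough Python sketch, not anything taken from the spec or the commit: the helper name passes_single_byte_probe and the use of Python's built-in codecs are assumptions made for illustration. It decodes each listed byte on its own, in the decoder's initial state, and compares the result with the US-ASCII character. That makes it a necessary condition at best; it does not model the "ignoring bytes that are the second and later bytes of multibyte sequences" clause, so it may not reproduce the note's exclusions for stateful encodings such as UTF-7 or HZ-GB-2312.

```python
# Hypothetical helper for illustration; not part of the spec or of any library.
# The byte list is copied from the revised definition in the diff above.
PROBE_BYTES = (
    [0x09, 0x0A, 0x0C, 0x0D, 0x26, 0x27]
    + list(range(0x20, 0x23))   # 0x20 - 0x22
    + list(range(0x2C, 0x40))   # 0x2C - 0x3F
    + list(range(0x41, 0x5B))   # 0x41 - 0x5A
    + list(range(0x61, 0x7B))   # 0x61 - 0x7A
)


def passes_single_byte_probe(codec_name):
    """Return True if every probe byte, decoded alone, yields its US-ASCII character.

    Assumption: codec_name is a codec known to Python; an unknown name
    raises LookupError rather than returning False.
    """
    for b in PROBE_BYTES:
        try:
            decoded = bytes([b]).decode(codec_name)
        except UnicodeDecodeError:
            # The byte is not a complete sequence by itself (e.g. UTF-16).
            return False
        if decoded != chr(b):
            # The byte maps to some other character (e.g. EBCDIC code pages).
            return False
    return True
```

With Python's bundled codecs, "utf-8" and "shift_jis" pass this probe, while "utf-16" and the EBCDIC code page "cp037" do not, which is consistent with the note in the hunk for those four cases.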
Received on Monday, 29 June 2009 00:37:30 UTC