html5/pubnotes Overview.html,1.211,1.212 Overview.src.html,1.205,1.206 from Michael Smith via cvs-syncmail on 2008-05-26 (public-html-commits@w3.org from May 2008)

From: Michael Smith via cvs-syncmail <cvsmail@w3.org>
Date: Mon, 26 May 2008 07:04:41 +0000
To: public-html-commits@w3.org
Message-Id: <E1K0Wl7-0000X4-16@lionel-hutz.w3.org>
Update of /sources/public/html5/pubnotes
In directory hutz:/tmp/cvs-serv1753

Modified Files:
	Overview.html Overview.src.html 
Log Message:
r1701 - r1.889: Shun UTF-32. Make it slightly clearer what 'UTF-16' means.


Index: Overview.html
===================================================================
RCS file: /sources/public/html5/pubnotes/Overview.html,v
retrieving revision 1.211
retrieving revision 1.212
diff -u -d -r1.211 -r1.212
--- Overview.html	26 May 2008 04:24:49 -0000	1.211
+++ Overview.html	26 May 2008 07:04:38 -0000	1.212
@@ -579,12 +579,27 @@
             <p>In this section, the following changes were
             made:</p>
             <ul>
-              <li>a statement was added that <q>The
+              <li>A statement was added that <q>The
               <code class="domattribute">charset</code> attribute
               specifies the character encoding used by the document.
               This is called a character encoding
               declaration</q>.</li>
-              <li>The value <code>dns</code> was removed from the
+              <li>The text of the “Specifying the document’s
+              character encoding” subsection was refined, with the
+              following statements added:
+              <blockquote>
+                <p><q>If the document contains a meta element with
+                  a charset attribute or a meta element in the
+                  Encoding declaration state, then the character
+                  encoding used must be an ASCII-compatible
+                  character encoding.</q></p>
+                <p><q>An ASCII-compatible character encoding is one
+                  that is a superset of US-ASCII (specifically,
+                  ANSI_X3.4-1968) for bytes in the range 0x09 -
+                  0x0D, 0x20, 0x21, 0x22, 0x26, 0x27, 0x2C - 0x3F,
+                  0x41 - 0x5A, and 0x61 - 0x7A.</q></p>
+              </blockquote>
+              </li><li>The value <code>dns</code> was removed from the
               list of pre-defined values for the 
               <code class="domattribute">name</code>
               attribute.</li>
@@ -2416,6 +2431,9 @@
             finding the “sniffed type of a resource”, as well as
             to the “Content-Type sniffing: feed or HTML” and
             “Content-Type metadata” subsections.</li>
+            <li>References to UTF-32 were <strong>removed</strong>
+            from the table in the “Content-Type sniffing: text or
+            binary” subsection.</li>
             <li>An item for the <code>
               image/vnd.microsoft.icon</code> type was added in two
             tables that list byte sequences used in the
@@ -2817,13 +2835,32 @@
           checkers in parsing <code>text/html</code> content. In
           this section, the following changes were made:</p>
           <ul>
-            <li>In the parts of the “The input stream” subsection
-            that deal with preprocessing the input stream,
-            character encoding requirements, and determining the
-            character encoding of the input stream, a number of
-            changes were made, including the addition of a
-            clarification related to the <strong>source browsing
-              context</strong>.</li>
+            <li>In the “The input stream” subsection, the
+            following changes were made:
+              <ul>
+                <li>In the parts that deal with preprocessing
+                the input stream, character encoding requirements,
+                and determining the character encoding of the input
+                stream, a number of refinements were made, including
+                the addition of a clarification related to the
+                <strong>source browsing context</strong>.</li>
+                <li>The following note was added:
+                <blockquote>
+                  <p><q>This specification does not make any
+                    attempt to support UTF-32 in its algorithms;
+                    support and use of UTF-32 can thus lead to
+                    unexpected behavior in implementations of this
+                    specification.</q></p>
+                </blockquote></li>
+                <li>In the “Changing the encoding while parsing”
+                subsection, the first step in the algorithm for
+                changing the encoding, which had read, “If the new
+                encoding is UTF-16, change it to UTF-8”, was updated
+                to now read (changed text highlighted), <q>If the
+                  new encoding is <em class="highlight">a UTF-16
+                    encoding</em>, change it to UTF-8.</q></li>
+              </ul>
+            </li>
             <li>Significant revisions were made to the “Character
             encoding requirements” subsection, including the
             addition of a “Character encoding overrides” table,

Index: Overview.src.html
===================================================================
RCS file: /sources/public/html5/pubnotes/Overview.src.html,v
retrieving revision 1.205
retrieving revision 1.206
diff -u -d -r1.205 -r1.206
--- Overview.src.html	26 May 2008 04:24:49 -0000	1.205
+++ Overview.src.html	26 May 2008 07:04:38 -0000	1.206
@@ -564,11 +564,26 @@
             <p>In this section, the following changes were
             made:</p>
             <ul>
-              <li>a statement was added that <q>The
+              <li>A statement was added that <q>The
               <code class=domattribute>charset</code> attribute
               specifies the character encoding used by the document.
               This is called a character encoding
               declaration</q>.</li>
+              <li>The text of the “Specifying the document’s
+              character encoding” subsection was refined, with the
+              following statements added:
+              <blockquote>
+                <p><q>If the document contains a meta element with
+                  a charset attribute or a meta element in the
+                  Encoding declaration state, then the character
+                  encoding used must be an ASCII-compatible
+                  character encoding.</q></p>
+                <p><q>An ASCII-compatible character encoding is one
+                  that is a superset of US-ASCII (specifically,
+                  ANSI_X3.4-1968) for bytes in the range 0x09 -
+                  0x0D, 0x20, 0x21, 0x22, 0x26, 0x27, 0x2C - 0x3F,
+                  0x41 - 0x5A, and 0x61 - 0x7A.</q></p>
+              </blockquote>
               <li>The value <code>dns</code> was removed from the
               list of pre-defined values for the 
               <code class=domattribute>name</code>
@@ -2440,6 +2455,9 @@
             finding the “sniffed type of a resource”, as well as
             to the “Content-Type sniffing: feed or HTML” and
             “Content-Type metadata” subsections.</li>
+            <li>References to UTF-32 were <strong>removed</strong>
+            from the table in the “Content-Type sniffing: text or
+            binary” subsection.</li>
             <li>An item for the <code>
               image/vnd.microsoft.icon</code> type was added in two
             tables that list byte sequences used in the
@@ -2857,13 +2875,32 @@
           checkers in parsing <code>text/html</code> content. In
           this section, the following changes were made:</p>
           <ul>
-            <li>In the parts of the “The input stream” subsection
-            that deal with preprocessing the input stream,
-            character encoding requirements, and determining the
-            character encoding of the input stream, a number of
-            changes were made, including the addition of a
-            clarification related to the <strong>source browsing
-              context</strong>.</li>
+            <li>In the “The input stream” subsection, the
+            following changes were made:
+              <ul>
+                <li>In the parts that deal with preprocessing
+                the input stream, character encoding requirements,
+                and determining the character encoding of the input
+                stream, a number of refinements were made, including
+                the addition of a clarification related to the
+                <strong>source browsing context</strong>.</li>
+                <li>The following note was added:
+                <blockquote>
+                  <p><q>This specification does not make any
+                    attempt to support UTF-32 in its algorithms;
+                    support and use of UTF-32 can thus lead to
+                    unexpected behavior in implementations of this
+                    specification.</q></p>
+                </blockquote></li>
+                <li>In the “Changing the encoding while parsing”
+                subsection, the first step in the algorithm for
+                changing the encoding, which had read, “If the new
+                encoding is UTF-16, change it to UTF-8”, was updated
+                to now read (changed text highlighted), <q>If the
+                  new encoding is <em class=highlight>a UTF-16
+                    encoding</em>, change it to UTF-8.</q></li>
+              </ul>
+            </li>
             <li>Significant revisions were made to the “Character
             encoding requirements” subsection, including the
             addition of a “Character encoding overrides” table,
Received on Monday, 26 May 2008 07:05:17 UTC