- From: Michael Smith via cvs-syncmail <cvsmail@w3.org>
- Date: Mon, 26 May 2008 07:04:41 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/pubnotes
In directory hutz:/tmp/cvs-serv1753
Modified Files:
Overview.html Overview.src.html
Log Message:
r1701 - r1.889: Shun UTF-32. Make it slightly clearer what 'UTF-16' means.
Index: Overview.html
===================================================================
RCS file: /sources/public/html5/pubnotes/Overview.html,v
retrieving revision 1.211
retrieving revision 1.212
diff -u -d -r1.211 -r1.212
--- Overview.html 26 May 2008 04:24:49 -0000 1.211
+++ Overview.html 26 May 2008 07:04:38 -0000 1.212
@@ -579,12 +579,27 @@
<p>In this section, the following changes were
made:</p>
<ul>
- <li>a statement was added that <q>The
+ <li>A statement was added that <q>The
<code class="domattribute">charset</code> attribute
specifies the character encoding used by the document.
This is called a character encoding
declaration</q>.</li>
- <li>The value <code>dns</code> was removed from the
+ <li>The text of the “Specifying the document’s
+ character encoding” subsection was refined, with the
+ following statements added:
+ <blockquote>
+ <p><q>If the document contains a meta element with
+ a charset attribute or a meta element in the
+ Encoding declaration state, then the character
+ encoding used must be an ASCII-compatible
+ character encoding.</q></p>
+ <p><q>An ASCII-compatible character encoding is one
+ that is a superset of US-ASCII (specifically,
+ ANSI_X3.4-1968) for bytes in the range 0x09 -
+ 0x0D, 0x20, 0x21, 0x22, 0x26, 0x27, 0x2C - 0x3F,
+ 0x41 - 0x5A, and 0x61 - 0x7A.</q></p>
+ </blockquote>
+ </li><li>The value <code>dns</code> was removed from the
list of pre-defined values for the
<code class="domattribute">name</code>
attribute.</li>
@@ -2416,6 +2431,9 @@
finding the “sniffed type of a resource”, as well as
to the “Content-Type sniffing: feed or HTML” and
“Content-Type metadata” subsections.</li>
+ <li>References to UTF-32 were <strong>removed</strong>
+ from the table in the “Content-Type sniffing: text or
+ binary” subsection.</li>
<li>An item for the <code>
image/vnd.microsoft.icon</code> type was added in two
tables that list byte sequences used in the
@@ -2817,13 +2835,32 @@
checkers in parsing <code>text/html</code> content. In
this section, the following changes were made:</p>
<ul>
- <li>In the parts of the “The input stream” subsection
- that deal with preprocessing the input stream,
- character encoding requirements, and determining the
- character encoding of the input stream, a number of
- changes were made, including the addition of a
- clarification related to the <strong>source browsing
- context</strong>.</li>
+ <li>In the “The input stream” subsection, the
+ following changes were made:
+ <ul>
+ <li>In the parts of the subsection that deal with preprocessing
+ the input stream, character encoding requirements,
+ and determining the character encoding of the input
+ stream, a number of refinements were made, including
+ the addition of a clarification related to the
+ <strong>source browsing context</strong>.</li>
+ <li>The following note was added:
+ <blockquote>
+ <p><q>This specification does not make any
+ attempt to support UTF-32 in its algorithms;
+ support and use of UTF-32 can thus lead to
+ unexpected behavior in implementations of this
+ specification.</q></p>
+ </blockquote></li>
+ <li>In the “Changing the encoding while parsing”
+ subsection, the first step in the algorithm for
+ changing the encoding, which had read, “If the new
+ encoding is UTF-16, change it to UTF-8”, was updated
+ to now read (changed text highlighted), <q>If the
+ new encoding is <em class="highlight">a UTF-16
+ encoding</em>, change it to UTF-8.</q></li>
+ </ul>
+ </li>
<li>Significant revisions were made to the “Character
encoding requirements” subsection, including the
addition of a “Character encoding overrides” table,
Index: Overview.src.html
===================================================================
RCS file: /sources/public/html5/pubnotes/Overview.src.html,v
retrieving revision 1.205
retrieving revision 1.206
diff -u -d -r1.205 -r1.206
--- Overview.src.html 26 May 2008 04:24:49 -0000 1.205
+++ Overview.src.html 26 May 2008 07:04:38 -0000 1.206
@@ -564,11 +564,26 @@
<p>In this section, the following changes were
made:</p>
<ul>
- <li>a statement was added that <q>The
+ <li>A statement was added that <q>The
<code class=domattribute>charset</code> attribute
specifies the character encoding used by the document.
This is called a character encoding
declaration</q>.</li>
+ <li>The text of the “Specifying the document’s
+ character encoding” subsection was refined, with the
+ following statements added:
+ <blockquote>
+ <p><q>If the document contains a meta element with
+ a charset attribute or a meta element in the
+ Encoding declaration state, then the character
+ encoding used must be an ASCII-compatible
+ character encoding.</q></p>
+ <p><q>An ASCII-compatible character encoding is one
+ that is a superset of US-ASCII (specifically,
+ ANSI_X3.4-1968) for bytes in the range 0x09 -
+ 0x0D, 0x20, 0x21, 0x22, 0x26, 0x27, 0x2C - 0x3F,
+ 0x41 - 0x5A, and 0x61 - 0x7A.</q></p>
+ </blockquote>
<li>The value <code>dns</code> was removed from the
list of pre-defined values for the
<code class=domattribute>name</code>
@@ -2440,6 +2455,9 @@
finding the “sniffed type of a resource”, as well as
to the “Content-Type sniffing: feed or HTML” and
“Content-Type metadata” subsections.</li>
+ <li>References to UTF-32 were <strong>removed</strong>
+ from the table in the “Content-Type sniffing: text or
+ binary” subsection.</li>
<li>An item for the <code>
image/vnd.microsoft.icon</code> type was added in two
tables that list byte sequences used in the
@@ -2857,13 +2875,32 @@
checkers in parsing <code>text/html</code> content. In
this section, the following changes were made:</p>
<ul>
- <li>In the parts of the “The input stream” subsection
- that deal with preprocessing the input stream,
- character encoding requirements, and determining the
- character encoding of the input stream, a number of
- changes were made, including the addition of a
- clarification related to the <strong>source browsing
- context</strong>.</li>
+ <li>In the “The input stream” subsection, the
+ following changes were made:
+ <ul>
+ <li>In the parts of the subsection that deal with preprocessing
+ the input stream, character encoding requirements,
+ and determining the character encoding of the input
+ stream, a number of refinements were made, including
+ the addition of a clarification related to the
+ <strong>source browsing context</strong>.</li>
+ <li>The following note was added:
+ <blockquote>
+ <p><q>This specification does not make any
+ attempt to support UTF-32 in its algorithms;
+ support and use of UTF-32 can thus lead to
+ unexpected behavior in implementations of this
+ specification.</q></p>
+ </blockquote></li>
+ <li>In the “Changing the encoding while parsing”
+ subsection, the first step in the algorithm for
+ changing the encoding, which had read, “If the new
+ encoding is UTF-16, change it to UTF-8”, was updated
+ to now read (changed text highlighted), <q>If the
+ new encoding is <em class=highlight>a UTF-16
+ encoding</em>, change it to UTF-8.</q></li>
+ </ul>
+ </li>
<li>Significant revisions were made to the “Character
encoding requirements” subsection, including the
addition of a “Character encoding overrides” table,
Received on Monday, 26 May 2008 07:05:17 UTC
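
For readers who want to see the two quoted rules in concrete form, here is a rough Python sketch. It is not taken from the specification or from this commit. The function names `is_ascii_compatible` and `first_step_of_encoding_change` are hypothetical, and the check is only an approximation: it decodes each listed byte on its own with Python's built-in codecs and compares the result to the ASCII character, so it says nothing about stateful or multi-byte behavior beyond that.

```python
import codecs

# Inclusive byte ranges from the quoted "ASCII-compatible" definition:
# 0x09-0x0D, 0x20, 0x21, 0x22, 0x26, 0x27, 0x2C-0x3F, 0x41-0x5A, 0x61-0x7A.
ASCII_COMPAT_RANGES = [
    (0x09, 0x0D), (0x20, 0x22), (0x26, 0x27),
    (0x2C, 0x3F), (0x41, 0x5A), (0x61, 0x7A),
]


def is_ascii_compatible(encoding: str) -> bool:
    """Approximate check (an assumption, not the spec's algorithm):
    every listed byte, decoded alone, must yield the same ASCII character."""
    for low, high in ASCII_COMPAT_RANGES:
        for byte in range(low, high + 1):
            try:
                decoded = bytes([byte]).decode(encoding)
            except UnicodeDecodeError:
                # e.g. UTF-16 cannot decode a lone byte
                return False
            if decoded != chr(byte):
                return False
    return True


def first_step_of_encoding_change(new_encoding: str) -> str:
    """Sketch of the quoted step: 'If the new encoding is a UTF-16
    encoding, change it to UTF-8.' Other names are returned unchanged."""
    canonical = codecs.lookup(new_encoding).name  # e.g. 'utf-16-le'
    if canonical.startswith("utf-16"):
        return "utf-8"
    return new_encoding
```

Under these assumptions, `is_ascii_compatible("utf-8")` is true while `is_ascii_compatible("utf-16-le")` and `is_ascii_compatible("cp037")` (an EBCDIC code page) are false, and `first_step_of_encoding_change("UTF-16LE")` yields `"utf-8"`.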