- From: Michael Smith via cvs-syncmail <cvsmail@w3.org>
- Date: Mon, 26 May 2008 07:04:41 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/pubnotes
In directory hutz:/tmp/cvs-serv1753

Modified Files:
 Overview.html Overview.src.html
Log Message:
r1701 - r1.889: Shun UTF-32. Make it slightly clearer what 'UTF-16' means.

Index: Overview.html
===================================================================
RCS file: /sources/public/html5/pubnotes/Overview.html,v
retrieving revision 1.211
retrieving revision 1.212
diff -u -d -r1.211 -r1.212
--- Overview.html 26 May 2008 04:24:49 -0000 1.211
+++ Overview.html 26 May 2008 07:04:38 -0000 1.212
@@ -579,12 +579,27 @@
 <p>In this section, the following changes were
 made:</p>
 <ul>
- <li>a statement was added that <q>The
+ <li>A statement was added that <q>The
 <code class="domattribute">charset</code>
 attribute specifies the character encoding used by
 the document. This is called a character encoding
 declaration</q>.</li>
- <li>The value <code>dns</code> was removed from the
+ <li>The text of the “Specifying the document’s
+ character encoding” subsection was refined, with the
+ following statements added:
+ <blockquote>
+ <p><q>If the document contains a meta element with
+ a charset attribute or a meta element in the
+ Encoding declaration state, then the character
+ encoding used must be an ASCII-compatible
+ character encoding.</q></p>
+ <p><q>An ASCII-compatible character encoding is one
+ that is a superset of US-ASCII (specifically,
+ ANSI_X3.4-1968) for bytes in the range 0x09 -
+ 0x0D, 0x20, 0x21, 0x22, 0x26, 0x27, 0x2C - 0x3F,
+ 0x41 - 0x5A, and 0x61 - 0x7A.</q></p>
+ </blockquote>
+ </li><li>The value <code>dns</code> was removed from the
 list of pre-defined values for the
 <code class="domattribute">name</code>
 attribute.</li>
@@ -2416,6 +2431,9 @@
 finding the “sniffed type of a resource”, as well as
 to the “Content-Type sniffing: feed or HTML” and
 “Content-Type metadata” subsections.</li>
+ <li>References to UTF-32 were <strong>removed</strong>
+ from the table in the “Content-Type sniffing: text or
+ binary” subsection.</li>
 <li>An item for the <code>
 image/vnd.microsoft.icon</code> type was added in
 two tables that list byte sequences used in the
@@ -2817,13 +2835,32 @@
 checkers in parsing <code>text/html</code> content.
 In this section, the following changes were made:</p>
 <ul>
- <li>In the parts of the “The input stream” subsection
- that deal with preprocessing the input stream,
- character encoding requirements, and determining the
- character encoding of the input stream, a number of
- changes were made, including the addition of a
- clarification related to the <strong>source browsing
- context</strong>.</li>
+ <li>In the “The input stream” subsection, the
+ following changes were made:
+ <ul>
+ <li>In the parts of the that deal with preprocessing
+ the input stream, character encoding requirements,
+ and determining the character encoding of the input
+ stream, a number of refinements were made, including
+ the addition of a clarification related to the
+ <strong>source browsing context</strong>.</li>
+ <li>The following note was added:
+ <blockquote>
+ <p><q>This specification does not make any
+ attempt to support UTF-32 in its algorithms;
+ support and use of UTF-32 can thus lead to
+ unexpected behavior in implementations of this
+ specification.</q></p>
+ </blockquote></li>
+ <li>In the “Changing the encoding while parsing”
+ subsection, the first step in the algorithm for
+ changing the encoding, which had read, “If the new
+ encoding is UTF-16, change it to UTF-8”, was updated
+ to now read (changed text highlighted), <q>If the
+ new encoding is <em class="highlight">a UTF-16
+ encoding</em>, change it to UTF-8.</q></li>
+ </ul>
+ </li>
 <li>Significant revisions were made to the “Character
 encoding requirements” subsection, including the
 addition of a “Character encoding overrides” table,

Index: Overview.src.html
===================================================================
RCS file: /sources/public/html5/pubnotes/Overview.src.html,v
retrieving revision 1.205
retrieving revision 1.206
diff -u -d -r1.205 -r1.206
--- Overview.src.html 26 May 2008 04:24:49 -0000 1.205
+++ Overview.src.html 26 May 2008 07:04:38 -0000 1.206
@@ -564,11 +564,26 @@
 <p>In this section, the following changes were
 made:</p>
 <ul>
- <li>a statement was added that <q>The
+ <li>A statement was added that <q>The
 <code class=domattribute>charset</code>
 attribute specifies the character encoding used by
 the document. This is called a character encoding
 declaration</q>.</li>
+ <li>The text of the “Specifying the document’s
+ character encoding” subsection was refined, with the
+ following statements added:
+ <blockquote>
+ <p><q>If the document contains a meta element with
+ a charset attribute or a meta element in the
+ Encoding declaration state, then the character
+ encoding used must be an ASCII-compatible
+ character encoding.</q></p>
+ <p><q>An ASCII-compatible character encoding is one
+ that is a superset of US-ASCII (specifically,
+ ANSI_X3.4-1968) for bytes in the range 0x09 -
+ 0x0D, 0x20, 0x21, 0x22, 0x26, 0x27, 0x2C - 0x3F,
+ 0x41 - 0x5A, and 0x61 - 0x7A.</q></p>
+ </blockquote>
 <li>The value <code>dns</code> was removed from the
 list of pre-defined values for the
 <code class=domattribute>name</code>
@@ -2440,6 +2455,9 @@
 finding the “sniffed type of a resource”, as well as
 to the “Content-Type sniffing: feed or HTML” and
 “Content-Type metadata” subsections.</li>
+ <li>References to UTF-32 were <strong>removed</strong>
+ from the table in the “Content-Type sniffing: text or
+ binary” subsection.</li>
 <li>An item for the <code>
 image/vnd.microsoft.icon</code> type was added in
 two tables that list byte sequences used in the
@@ -2857,13 +2875,32 @@
 checkers in parsing <code>text/html</code> content.
 In this section, the following changes were made:</p>
 <ul>
- <li>In the parts of the “The input stream” subsection
- that deal with preprocessing the input stream,
- character encoding requirements, and determining the
- character encoding of the input stream, a number of
- changes were made, including the addition of a
- clarification related to the <strong>source browsing
- context</strong>.</li>
+ <li>In the “The input stream” subsection, the
+ following changes were made:
+ <ul>
+ <li>In the parts of the that deal with preprocessing
+ the input stream, character encoding requirements,
+ and determining the character encoding of the input
+ stream, a number of refinements were made, including
+ the addition of a clarification related to the
+ <strong>source browsing context</strong>.</li>
+ <li>The following note was added:
+ <blockquote>
+ <p><q>This specification does not make any
+ attempt to support UTF-32 in its algorithms;
+ support and use of UTF-32 can thus lead to
+ unexpected behavior in implementations of this
+ specification.</q></p>
+ </blockquote></li>
+ <li>In the “Changing the encoding while parsing”
+ subsection, the first step in the algorithm for
+ changing the encoding, which had read, “If the new
+ encoding is UTF-16, change it to UTF-8”, was updated
+ to now read (changed text highlighted), <q>If the
+ new encoding is <em class=highlight>a UTF-16
+ encoding</em>, change it to UTF-8.</q></li>
+ </ul>
+ </li>
 <li>Significant revisions were made to the “Character
 encoding requirements” subsection, including the
 addition of a “Character encoding overrides” table,
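
The definition of an ASCII-compatible character encoding quoted in the diff above lists specific byte ranges, so it can be checked mechanically. The following is a rough sketch only, not taken from the specification or from this commit: the helper name `is_ascii_compatible` and the constant `ASCII_COMPAT_BYTES` are hypothetical, and decoding each byte in isolation is a simplifying assumption that may misjudge stateful or multi-byte encodings.

```python
# Byte values listed in the quoted definition:
# 0x09 - 0x0D, 0x20, 0x21, 0x22, 0x26, 0x27, 0x2C - 0x3F,
# 0x41 - 0x5A, and 0x61 - 0x7A.
ASCII_COMPAT_BYTES = (
    list(range(0x09, 0x0D + 1))
    + [0x20, 0x21, 0x22, 0x26, 0x27]
    + list(range(0x2C, 0x3F + 1))
    + list(range(0x41, 0x5A + 1))
    + list(range(0x61, 0x7A + 1))
)


def is_ascii_compatible(encoding: str) -> bool:
    """Hypothetical helper: approximate the quoted definition by decoding
    each listed byte on its own and comparing it with its US-ASCII meaning.

    Assumption: a single byte that fails to decode, or an encoding name
    that Python does not know, is treated as "not ASCII-compatible".
    """
    for byte in ASCII_COMPAT_BYTES:
        try:
            decoded = bytes([byte]).decode(encoding)
        except (UnicodeDecodeError, LookupError):
            return False
        if decoded != chr(byte):
            return False
    return True


if __name__ == "__main__":
    for name in ("utf-8", "latin-1", "utf-16", "utf-32", "cp037"):
        print(name, is_ascii_compatible(name))
```

Under this approximation UTF-16 and UTF-32 fail the check because a lone byte is not a complete code unit in either encoding, which is in keeping with the log message's intent to shun UTF-32 and to treat UTF-16 specially.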
Received on Monday, 26 May 2008 07:05:17 UTC