
hixie: Move a section so that the character encoding requirements are closer together. (whatwg r6992)

From: poot <cvsmail@w3.org>
Date: Mon, 13 Feb 2012 17:50:28 -0500
To: public-html-diffs@w3.org
Message-Id: <E1Rx4j6-0001sU-KD@jay.w3.org>

http://dev.w3.org/cvsweb/html5/spec/Overview.html?r1=1.5584&r2=1.5585&f=h
http://html5.org/tools/web-apps-tracker?from=6991&to=6992

===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.5584
retrieving revision 1.5585
diff -u -d -r1.5584 -r1.5585
--- Overview.html	13 Feb 2012 22:48:18 -0000	1.5584
+++ Overview.html	13 Feb 2012 22:50:16 -0000	1.5585
@@ -1157,8 +1157,8 @@
       <ol>
        <li><a href="#determining-the-character-encoding"><span class="secno">8.2.2.1 </span>Determining the character encoding</a></li>
        <li><a href="#character-encodings-0"><span class="secno">8.2.2.2 </span>Character encodings</a></li>
-       <li><a href="#preprocessing-the-input-stream"><span class="secno">8.2.2.3 </span>Preprocessing the input stream</a></li>
-       <li><a href="#changing-the-encoding-while-parsing"><span class="secno">8.2.2.4 </span>Changing the encoding while parsing</a></ol></li>
+       <li><a href="#changing-the-encoding-while-parsing"><span class="secno">8.2.2.3 </span>Changing the encoding while parsing</a></li>
+       <li><a href="#preprocessing-the-input-stream"><span class="secno">8.2.2.4 </span>Preprocessing the input stream</a></ol></li>
      <li><a href="#parse-state"><span class="secno">8.2.3 </span>Parse state</a>
       <ol>
        <li><a href="#the-insertion-mode"><span class="secno">8.2.3.1 </span>The insertion mode</a></li>
@@ -58895,7 +58895,59 @@
 
 
 
-  <h5 id="preprocessing-the-input-stream"><span class="secno">8.2.2.3 </span>Preprocessing the input stream</h5>
+  <h5 id="changing-the-encoding-while-parsing"><span class="secno">8.2.2.3 </span>Changing the encoding while parsing</h5>
+
+  <p>When the parser requires the user agent to <dfn id="change-the-encoding">change the
+  encoding</dfn>, it must run the following steps. This might happen
+  if the <a href="#encoding-sniffing-algorithm">encoding sniffing algorithm</a> described above
+  failed to find an encoding, or if it found an encoding that was not
+  the actual encoding of the file.</p>
+
+  <ol><li>If the encoding that is already being used to interpret the
+   input stream is <a href="#a-utf-16-encoding">a UTF-16 encoding</a>, then set the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to
+   <i>certain</i> and abort these steps. The new encoding is ignored;
+   if it was anything but the same encoding, then it would be clearly
+   incorrect.</li>
+
+   <li>If the new encoding is <a href="#a-utf-16-encoding">a UTF-16 encoding</a>, change
+   it to UTF-8.</li>
+
+   <li>If the new encoding is identical or equivalent to the encoding
+   that is already being used to interpret the input stream, then set
+   the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to
+   <i>certain</i> and abort these steps. This happens when the
+   encoding information found in the file matches what the
+   <a href="#encoding-sniffing-algorithm">encoding sniffing algorithm</a> determined to be the
+   encoding, and in the second pass through the parser if the first
+   pass found that the encoding sniffing algorithm described in the
+   earlier section failed to find the right encoding.</li>
+
+   <li>If all the bytes up to the last byte converted by the current
+   decoder have the same Unicode interpretations in both the current
+   encoding and the new encoding, and if the user agent supports
+   changing the converter on the fly, then the user agent may change
+   to the new converter for the encoding on the fly. Set the
+   <a href="#document-s-character-encoding">document's character encoding</a> and the encoding used to
+   convert the input stream to the new encoding, set the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to
+   <i>certain</i>, and abort these steps.</li>
+
+   <li>Otherwise, <a href="#navigate">navigate</a> to the
+   document again, with <a href="#replacement-enabled">replacement enabled</a>, and using
+   the same <a href="#source-browsing-context">source browsing context</a>, but this time skip
+   the <a href="#encoding-sniffing-algorithm">encoding sniffing algorithm</a> and instead just set
+   the encoding to the new encoding and the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to
+   <i>certain</i>. Whenever possible, this should be done without
+   actually contacting the network layer (the bytes should be
+   re-parsed from memory), even if, e.g., the document is marked as
+   not being cacheable. If this is not possible and contacting the
+   network layer would involve repeating a request that uses a method
+   other than HTTP GET (<a href="#concept-http-equivalent-get" title="concept-http-equivalent-get">or
+   equivalent</a> for non-HTTP URLs), then instead set the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to
+   <i>certain</i> and ignore the new encoding. The resource will be
+   misinterpreted. User agents may notify the user of the situation,
+   to aid in application development.</li>
+
+  </ol><h5 id="preprocessing-the-input-stream"><span class="secno">8.2.2.4 </span>Preprocessing the input stream</h5>
 
   <p>The <dfn id="input-stream">input stream</dfn> consists of the characters pushed
   into it as the <a href="#the-input-byte-stream">input byte stream</a> is decoded or from the
@@ -58952,60 +59004,7 @@
   consumed. Otherwise, the "EOF" character is not a real character in
   the stream, but rather the lack of any further characters.</p>
 
-
-  <h5 id="changing-the-encoding-while-parsing"><span class="secno">8.2.2.4 </span>Changing the encoding while parsing</h5>
-
-  <p>When the parser requires the user agent to <dfn id="change-the-encoding">change the
-  encoding</dfn>, it must run the following steps. This might happen
-  if the <a href="#encoding-sniffing-algorithm">encoding sniffing algorithm</a> described above
-  failed to find an encoding, or if it found an encoding that was not
-  the actual encoding of the file.</p>
-
-  <ol><li>If the encoding that is already being used to interpret the
-   input stream is <a href="#a-utf-16-encoding">a UTF-16 encoding</a>, then set the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to
-   <i>certain</i> and abort these steps. The new encoding is ignored;
-   if it was anything but the same encoding, then it would be clearly
-   incorrect.</li>
-
-   <li>If the new encoding is <a href="#a-utf-16-encoding">a UTF-16 encoding</a>, change
-   it to UTF-8.</li>
-
-   <li>If the new encoding is identical or equivalent to the encoding
-   that is already being used to interpret the input stream, then set
-   the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to
-   <i>certain</i> and abort these steps. This happens when the
-   encoding information found in the file matches what the
-   <a href="#encoding-sniffing-algorithm">encoding sniffing algorithm</a> determined to be the
-   encoding, and in the second pass through the parser if the first
-   pass found that the encoding sniffing algorithm described in the
-   earlier section failed to find the right encoding.</li>
-
-   <li>If all the bytes up to the last byte converted by the current
-   decoder have the same Unicode interpretations in both the current
-   encoding and the new encoding, and if the user agent supports
-   changing the converter on the fly, then the user agent may change
-   to the new converter for the encoding on the fly. Set the
-   <a href="#document-s-character-encoding">document's character encoding</a> and the encoding used to
-   convert the input stream to the new encoding, set the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to
-   <i>certain</i>, and abort these steps.</li>
-
-   <li>Otherwise, <a href="#navigate">navigate</a> to the
-   document again, with <a href="#replacement-enabled">replacement enabled</a>, and using
-   the same <a href="#source-browsing-context">source browsing context</a>, but this time skip
-   the <a href="#encoding-sniffing-algorithm">encoding sniffing algorithm</a> and instead just set
-   the encoding to the new encoding and the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to
-   <i>certain</i>. Whenever possible, this should be done without
-   actually contacting the network layer (the bytes should be
-   re-parsed from memory), even if, e.g., the document is marked as
-   not being cacheable. If this is not possible and contacting the
-   network layer would involve repeating a request that uses a method
-   other than HTTP GET (<a href="#concept-http-equivalent-get" title="concept-http-equivalent-get">or
-   equivalent</a> for non-HTTP URLs), then instead set the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a> to
-   <i>certain</i> and ignore the new encoding. The resource will be
-   misinterpreted. User agents may notify the user of the situation,
-   to aid in application development.</li>
-
-  </ol></div><div class="impl">
+  </div><div class="impl">
 
   <h4 id="parse-state"><span class="secno">8.2.3 </span>Parse state</h4>
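
Illustration (not part of the diff): the five "change the encoding" steps that this commit relocates can be sketched roughly as below. This is a minimal sketch under stated assumptions, not how any user agent implements it. ParserState, its fields, the helper names and the returned strings are all hypothetical; "identical or equivalent" is approximated by comparing Python codec names; and the "or equivalent for non-HTTP URLs" clause is reduced to a plain check for GET.

```python
import codecs
from dataclasses import dataclass


@dataclass
class ParserState:
    """Hypothetical stand-in for the state a user agent would track."""
    encoding: str                       # encoding currently used to decode the stream
    confidence: str = "tentative"       # "tentative" or "certain"
    consumed: bytes = b""               # bytes already converted by the current decoder
    can_switch_on_the_fly: bool = True  # whether the UA supports swapping decoders
    bytes_in_memory: bool = True        # whether a re-parse can avoid the network
    request_method: str = "GET"         # method of the request that fetched the document


def _canonical(name: str) -> str:
    # Assumption: Python's codec registry approximates "identical or equivalent".
    return codecs.lookup(name).name


def _is_utf16(name: str) -> bool:
    return _canonical(name) in {"utf-16", "utf-16-le", "utf-16-be"}


def _same_interpretation(data: bytes, a: str, b: str) -> bool:
    # True when the bytes decode to the same characters under both encodings.
    try:
        return data.decode(a) == data.decode(b)
    except UnicodeDecodeError:
        return False


def change_encoding(state: ParserState, new_encoding: str) -> str:
    """Apply the five steps; return a label describing what happened."""
    # Step 1: a UTF-16 stream keeps its encoding; the new encoding is ignored.
    if _is_utf16(state.encoding):
        state.confidence = "certain"
        return "ignored"

    # Step 2: a UTF-16 declaration is treated as UTF-8.
    if _is_utf16(new_encoding):
        new_encoding = "utf-8"

    # Step 3: nothing to do if the encodings already match.
    if _canonical(new_encoding) == _canonical(state.encoding):
        state.confidence = "certain"
        return "unchanged"

    # Step 4: optionally swap decoders if everything decoded so far is unaffected.
    if state.can_switch_on_the_fly and _same_interpretation(
        state.consumed, state.encoding, new_encoding
    ):
        state.encoding = new_encoding
        state.confidence = "certain"
        return "switched"

    # Step 5: re-navigate with the new encoding, preferably from memory.
    # If that would repeat a non-GET request, keep the old encoding instead.
    if not state.bytes_in_memory and state.request_method != "GET":
        state.confidence = "certain"
        return "misinterpreted"
    state.encoding = new_encoding
    state.confidence = "certain"
    return "reparse"
```

For example, a ParserState decoding as windows-1252 that has consumed only ASCII bytes can switch to UTF-8 on the fly (step 4), while one that has already consumed a non-ASCII byte such as 0xE9 falls through to step 5. An encoding label unknown to Python raises LookupError here; the sketch does not handle that case.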
Received on Monday, 13 February 2012 22:50:30 GMT
