hixie: Change the limit for where charsets should be given to the first 1024 bytes. (whatwg r5860)

http://dev.w3.org/cvsweb/html5/spec/Overview.html?r1=1.4702&r2=1.4703&f=h
http://html5.org/tools/web-apps-tracker?from=5859&to=5860

===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.4702
retrieving revision 1.4703
diff -u -d -r1.4702 -r1.4703
--- Overview.html 8 Feb 2011 22:54:47 -0000 1.4702
+++ Overview.html 9 Feb 2011 00:02:11 -0000 1.4703
@@ -343,7 +343,7 @@
 
    <h1>HTML5</h1>
    <h2 class="no-num no-toc" id="a-vocabulary-and-associated-apis-for-html-and-xhtml">A vocabulary and associated APIs for HTML and XHTML</h2>
-   <h2 class="no-num no-toc" id="editor-s-draft-8-february-2011">Editor's Draft 8 February 2011</h2>
+   <h2 class="no-num no-toc" id="editor-s-draft-9-february-2011">Editor's Draft 9 February 2011</h2>
    <dl><dt>Latest Published Version:</dt>
     <dd><a href="http://www.w3.org/TR/html5/">http://www.w3.org/TR/html5/</a></dd>
     <dt>Latest Editor's Draft:</dt>
@@ -478,7 +478,7 @@
   Group</a> is the W3C working group responsible for this
   specification's progress along the W3C Recommendation
   track.
-  This specification is the 8 February 2011 Editor's Draft.
+  This specification is the 9 February 2011 Editor's Draft.
   </p><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- relationship to other work (required) --><p>Work on this specification is also done at the <a href="http://www.whatwg.org/">WHATWG</a>. The W3C HTML working group
   actively pursues convergence with the WHATWG, as required by the <a href="http://www.w3.org/2007/03/HTML-WG-charter">W3C HTML working
   group charter</a>.</p><!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- required patent boilerplate --><p>This document was produced by a group operating under the <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5
@@ -12329,9 +12329,10 @@
    the use of <a href="#syntax-charref" title="syntax-charref">character references</a>
    or character escapes of any kind.</li>
 
-   <li id="charset512">The element containing the character encoding
-   declaration must be serialized completely within the first 512
-   bytes of the document.</li>
+   <li id="charset1024"><span id="charset512" title="">The element
+   containing the character encoding declaration must be serialized
+   completely within the first 1024 bytes of the document.</span></li>
+   <!-- span is for historical reasons, to keep an old ID alive -->
 
    <li>There can only be one character encoding declaration in the
    document.</li> <!-- conformance criteria for this one are given in
@@ -54740,16 +54741,26 @@
    supported, return that encoding with the <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a>
    <i>certain</i>, and abort these steps.</li>
 
-   <li><p>The user agent may wait for more bytes of the resource to be
-   available, either in this step or at any later step in this
-   algorithm. For instance, a user agent might wait 500ms or 512
-   bytes, whichever came first. In general preparsing the source to
-   find the encoding improves performance, as it reduces the need to
-   throw away the data structures used when parsing upon finding the
-   encoding information. However, if the user agent delays too long to
-   obtain data to determine the encoding, then the cost of the delay
-   could outweigh any performance improvements from the
-   preparse.</li>
+   <li>
+
+    <p>The user agent may wait for more bytes of the resource to be
+    available, either in this step or at any later step in this
+    algorithm. For instance, a user agent might wait 500ms or 1024
+    bytes, whichever came first. In general preparsing the source to
+    find the encoding improves performance, as it reduces the need to
+    throw away the data structures used when parsing upon finding the
+    encoding information. However, if the user agent delays too long
+    to obtain data to determine the encoding, then the cost of the
+    delay could outweigh any performance improvements from the
+    preparse.</p>
+
+    <p class="note">The authoring conformance requirements for
+    character encoding declarations limit them to only appearing <a href="#charset1024">in the first 1024 bytes</a>. User agents are
+    therefore encouraged to use the preparse algorithm below (part of
+    these steps) on the first 1024 bytes, but not to stall beyond
+    that.</p>
+
+   </li>
 
    <li><p>For each of the rows in the following table, starting with
    the first one and going down, if there are as many or more bytes
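
As an illustration of the authoring requirement changed above (the element containing the character encoding declaration must be serialized completely within the first 1024 bytes), here is a minimal Python sketch. It is not the spec's preparse algorithm: the names META_CHARSET, LIMIT and declaration_within_limit are hypothetical, and the regular expression is a deliberate simplification that, for example, does not skip comments or validate attribute syntax.

```python
import re

# Hypothetical, simplified pattern: any <meta ...> tag that mentions "charset".
# Assumption: this stands in for real detection; it is NOT the spec's preparse.
META_CHARSET = re.compile(rb"<meta\b[^>]*\bcharset\b[^>]*>", re.IGNORECASE)

# Byte limit stated in this revision (the previous revision said 512).
LIMIT = 1024


def declaration_within_limit(document: bytes, limit: int = LIMIT) -> bool:
    """Return True if the first charset-declaring <meta> tag ends
    at or before byte offset `limit` of the raw document."""
    match = META_CHARSET.search(document)
    if match is None:
        # No declaration found at all, so nothing satisfies the requirement.
        return False
    # match.end() is the offset just past the closing ">", i.e. the number
    # of bytes needed to serialize everything up to and including the tag.
    return match.end() <= limit
```

A document whose declaration sits near the top passes, while one that pushes the meta element past the limit (say, behind a long leading comment) does not. Passing limit=512 would approximate the requirement as it stood before this change.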

Received on Wednesday, 9 February 2011 00:03:27 UTC