- From: Ian Hickson via cvs-syncmail <cvsmail@w3.org>
- Date: Wed, 14 Apr 2010 03:07:02 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/spec
In directory hutz:/tmp/cvs-serv7455
Modified Files:
Overview.html
Log Message:
Move the Content-Type encoding parsing hack of an algorithm back into HTML5 from MIMESNIFF. (whatwg r5042)
Index: Overview.html
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.4056
retrieving revision 1.4057
diff -u -d -r1.4056 -r1.4057
--- Overview.html 13 Apr 2010 22:57:06 -0000 1.4056
+++ Overview.html 14 Apr 2010 03:06:58 -0000 1.4057
@@ -285,7 +285,7 @@
<h1>HTML5</h1>
<h2 class="no-num no-toc" id="a-vocabulary-and-associated-apis-for-html-and-xhtml">A vocabulary and associated APIs for HTML and XHTML</h2>
- <h2 class="no-num no-toc" id="editor-s-draft-13-april-2010">Editor's Draft 13 April 2010</h2>
+ <h2 class="no-num no-toc" id="editor-s-draft-14-april-2010">Editor's Draft 14 April 2010</h2>
<dl><dt>Latest Published Version:</dt>
<dd><a href="http://www.w3.org/TR/html5/">http://www.w3.org/TR/html5/</a></dd>
<dt>Latest Editor's Draft:</dt>
@@ -392,7 +392,7 @@
specification's progress along the W3C Recommendation
track.
- This specification is the 13 April 2010 Editor's Draft.
+ This specification is the 14 April 2010 Editor's Draft.
</p><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- relationship to other work (required) --><p>The contents of this specification are also part of <a href="http://www.whatwg.org/specs/web-apps/current-work/multipage/">a
specification</a> published by the <a href="http://www.whatwg.org/">WHATWG</a>, which is available under a
license that permits reuse of the specification text.</p><!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- required patent boilerplate --><p>This document was produced by a group operating under the <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5
@@ -5545,12 +5545,6 @@
with the requirements of the Content-Type Processing Model
specification. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
- <p>The <dfn id="algorithm-for-extracting-an-encoding-from-a-content-type">algorithm for extracting an encoding from a
- Content-Type</dfn>, given a string <var title="">s</var>, is given
- in the Content-Type Processing Model specification. It either
- returns an encoding or nothing. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
- <p class="XXX">The above is out of date now that the relevant section has been removed from MIMESNIFF. Stay tuned; I'll bring it back here soon.</p>
-
<p>The <dfn id="content-type-sniffing-0" title="Content-Type sniffing">sniffed type of a
resource</dfn> must be found in a manner consistent with the
requirements given in the Content-Type Processing Model
@@ -5571,6 +5565,50 @@
occur. For more details, see the Content-Type Processing Model
specification. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
+ <p>The <dfn id="algorithm-for-extracting-an-encoding-from-a-content-type">algorithm for extracting an encoding from a
+ Content-Type</dfn>, given a string <var title="">s</var>, is as
+ follows. It either returns an encoding or nothing.</p>
+
+ <ol><li><p>Find the first seven characters in <var title="">s</var>
+ that are an <a href="#ascii-case-insensitive">ASCII case-insensitive</a> match for the word
+ "<code title="">charset</code>". If no such match is found, return
+ nothing.</li>
+
+ <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+ characters that immediately follow the word "<code title="">charset</code>" (there might not be any).</li>
+
+ <li><p>If the next character is not a U+003D EQUALS SIGN ('='),
+ return nothing and abort these steps.</li>
+
+ <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+ characters that immediately follow the equals sign (there might not
+ be any).</li>
+
+ <li>
+
+ <p>Process the next character as follows:</p>
+
+ <dl class="switch"><dt>If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in <var title="">s</var></dt>
+ <dt>If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in <var title="">s</var></dt>
+ <dd>Return the encoding corresponding to the string between this character and the next earliest occurrence of this character.</dd>
+
+ <dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
+ <dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
+ <dt>If there is no next character</dt>
+ <dd>Return nothing.</dd>
+
+ <dt>Otherwise</dt>
+ <dd>Return the encoding corresponding to the string from this
+ character to the first U+0009, U+000A, U+000C, U+000D, U+0020, or
+ U+003B character or the end of <var title="">s</var>, whichever
+ comes first.</dd>
+
+ </dl></li>
+
+ </ol><p class="note">This requirement is a <a href="#willful-violation">willful violation</a>
+ of the HTTP specification, motivated by the need for backwards
+ compatibility with legacy content. <a href="#refsHTTP">[HTTP]</a></p>
+
</div><h3 id="common-dom-interfaces"><span class="secno">2.7 </span>Common DOM interfaces</h3><p class="XXX annotation"><b>Status: </b><i>Last call for comments</i><h4 id="reflecting-content-attributes-in-idl-attributes"><span class="secno">2.7.1 </span>Reflecting content attributes in IDL attributes</h4><p class="XXX annotation"><b>Status: </b><i>Last call for comments</i><p>Some IDL attributes are defined to <dfn id="reflect">reflect</dfn> a
particular content attribute. This means that on getting, the IDL
attribute returns the current value of the content attribute, and on
Received on Wednesday, 14 April 2010 03:07:03 UTC
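
For readers following the diff, the "algorithm for extracting an encoding from a Content-Type" added in the last hunk above can be sketched in Python as below. This is an illustrative sketch only, not part of the commit. The function name extract_charset_label and the WHITESPACE constant are hypothetical names chosen for this example. It also assumes the caller maps the returned label to an actual encoding separately: the spec text says "return the encoding corresponding to the string", whereas this sketch stops at returning the string itself, with None standing in for "nothing".

```python
# Characters skipped in steps 2 and 4: U+0009, U+000A, U+000C, U+000D, U+0020.
WHITESPACE = "\t\n\x0c\r "


def extract_charset_label(s):
    """Return the charset label found in s, or None for "nothing".

    Hypothetical helper: resolving the label to an encoding is left
    to the caller and is not shown here.
    """
    # Step 1: find the first ASCII case-insensitive match for "charset".
    # Only A-Z are lowered, so the string length and indices are preserved.
    lowered = "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)
    start = lowered.find("charset")
    if start == -1:
        return None
    pos = start + len("charset")

    # Step 2: skip whitespace immediately after the word "charset".
    while pos < len(s) and s[pos] in WHITESPACE:
        pos += 1

    # Step 3: the next character must be an equals sign.
    if pos >= len(s) or s[pos] != "=":
        return None
    pos += 1

    # Step 4: skip whitespace immediately after the equals sign.
    while pos < len(s) and s[pos] in WHITESPACE:
        pos += 1

    # Step 5: process the next character.
    if pos >= len(s):
        return None  # there is no next character
    ch = s[pos]
    if ch in "\"'":
        end = s.find(ch, pos + 1)
        if end == -1:
            return None  # unmatched quote
        return s[pos + 1:end]  # text between the matching quotes

    # Otherwise: read up to the first whitespace, semicolon, or end of s.
    end = pos
    while end < len(s) and s[end] not in WHITESPACE + ";":
        end += 1
    return s[pos:end]
```

For example, extract_charset_label("text/html; charset=utf-8") yields "utf-8", while a value with an unmatched quote such as "text/html; charset='utf-8" yields None. Note that, as written in this revision, a "charset" that is not followed by an equals sign ends the search rather than continuing to a later occurrence.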