- From: poot <cvsmail@w3.org>
- Date: Wed, 14 Apr 2010 12:07:20 +0900 (JST)
- To: public-html-diffs@w3.org
hixie: Move the Content-Type encoding parsing hack of an algorithm back into HTML5 from MIMESNIFF. (whatwg r5042)

http://dev.w3.org/cvsweb/html5/spec/Overview.html?r1=1.4056&r2=1.4057&f=h
http://html5.org/tools/web-apps-tracker?from=5041&to=5042

===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.4056
retrieving revision 1.4057
diff -u -d -r1.4056 -r1.4057
--- Overview.html 13 Apr 2010 22:57:06 -0000 1.4056
+++ Overview.html 14 Apr 2010 03:06:58 -0000 1.4057
@@ -285,7 +285,7 @@
 <h1>HTML5</h1>
 <h2 class="no-num no-toc" id="a-vocabulary-and-associated-apis-for-html-and-xhtml">A vocabulary and associated APIs for HTML and XHTML</h2>
- <h2 class="no-num no-toc" id="editor-s-draft-13-april-2010">Editor's Draft 13 April 2010</h2>
+ <h2 class="no-num no-toc" id="editor-s-draft-14-april-2010">Editor's Draft 14 April 2010</h2>
 <dl><dt>Latest Published Version:</dt>
 <dd><a href="http://www.w3.org/TR/html5/">http://www.w3.org/TR/html5/</a></dd>
 <dt>Latest Editor's Draft:</dt>
@@ -392,7 +392,7 @@
 specification's progress along the W3C Recommendation track.
- This specification is the 13 April 2010 Editor's Draft.
+ This specification is the 14 April 2010 Editor's Draft.
 </p><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- relationship to other work (required) --><p>The contents of this specification are also part of <a href="http://www.whatwg.org/specs/web-apps/current-work/multipage/">a specification</a> published by the <a href="http://www.whatwg.org/">WHATWG</a>, which is available under a license that permits reuse of the specification text.</p><!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- required patent boilerplate --><p>This document was produced by a group operating under the <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5
@@ -5545,12 +5545,6 @@
 with the requirements of the Content-Type Processing Model
 specification. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
- <p>The <dfn id="algorithm-for-extracting-an-encoding-from-a-content-type">algorithm for extracting an encoding from a
- Content-Type</dfn>, given a string <var title="">s</var>, is given
- in the Content-Type Processing Model specification. It either
- returns an encoding or nothing. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
- <p class="XXX">The above is out of date now that the relevant section has been removed from MIMESNIFF. Stay tuned; I'll bring it back here soon.</p>
-
 <p>The <dfn id="content-type-sniffing-0" title="Content-Type sniffing">sniffed type of a resource</dfn> must
 be found in a manner consistent with the requirements given in the
 Content-Type Processing Model
@@ -5571,6 +5565,50 @@
 occur. For more details, see the Content-Type Processing Model
 specification. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
+ <p>The <dfn id="algorithm-for-extracting-an-encoding-from-a-content-type">algorithm for extracting an encoding from a
+ Content-Type</dfn>, given a string <var title="">s</var>, is as
+ follows. It either returns an encoding or nothing.</p>
+
+ <ol><li><p>Find the first seven characters in <var title="">s</var>
+ that are an <a href="#ascii-case-insensitive">ASCII case-insensitive</a> match for the word
+ "<code title="">charset</code>". If no such match is found, return
+ nothing.</li>
+
+ <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+ characters that immediately follow the word "<code title="">charset</code>" (there might not be any).</li>
+
+ <li><p>If the next character is not a U+003D EQUALS SIGN ('='),
+ return nothing and abort these steps.</li>
+
+ <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+ characters that immediately follow the equals sign (there might not
+ be any).</li>
+
+ <li>
+
+ <p>Process the next character as follows:</p>
+
+ <dl class="switch"><dt>If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in <var title="">s</var></dt>
+ <dt>If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in <var title="">s</var></dt>
+ <dd>Return the encoding corresponding to the string between this character and the next earliest occurrence of this character.</dd>
+
+ <dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
+ <dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
+ <dt>If there is no next character</dt>
+ <dd>Return nothing.</dd>
+
+ <dt>Otherwise</dt>
+ <dd>Return the encoding corresponding to the string from this
+ character to the first U+0009, U+000A, U+000C, U+000D, U+0020, or
+ U+003B character or the end of <var title="">s</var>, whichever
+ comes first.</dd>
+
+ </dl></li>
+
+ </ol><p class="note">This requirement is a <a href="#willful-violation">willful violation</a>
+ of the HTTP specification, motivated by the need for backwards
+ compatibility with legacy content. <a href="#refsHTTP">[HTTP]</a></p>
+
+ </div><h3 id="common-dom-interfaces"><span class="secno">2.7 </span>Common DOM interfaces</h3><p class="XXX annotation"><b>Status: </b><i>Last call for comments</i><h4 id="reflecting-content-attributes-in-idl-attributes"><span class="secno">2.7.1 </span>Reflecting content attributes in IDL attributes</h4><p class="XXX annotation"><b>Status: </b><i>Last call for comments</i><p>Some IDL attributes are defined to <dfn id="reflect">reflect</dfn> a particular content attribute.
 This means that on getting, the IDL attribute returns the current value of the content attribute, and on
Received on Wednesday, 14 April 2010 03:07:48 UTC
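
For readers following the diff, the five steps of the "algorithm for extracting an encoding from a Content-Type" added in this revision can be sketched roughly as below. This is only an illustration, not text from the commit: the function name extract_charset_label and the helper constants are made up for this note, and the sketch returns the raw label string rather than "the encoding corresponding to the string", since mapping a label to an encoding is defined elsewhere in the spec.

```python
import string
from typing import Optional

# Hypothetical helpers for this sketch (names are not from the spec).
# ASCII-only lowercasing keeps string indices aligned with the input.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)
# U+0009, U+000A, U+000C, U+000D, U+0020
_SPACE = "\t\n\f\r "


def extract_charset_label(s: str) -> Optional[str]:
    """Return the charset label found in s, or None for "nothing"."""
    # Step 1: first ASCII case-insensitive match for "charset".
    pos = s.translate(_ASCII_LOWER).find("charset")
    if pos == -1:
        return None
    pos += len("charset")

    # Step 2: skip whitespace after the word.
    while pos < len(s) and s[pos] in _SPACE:
        pos += 1

    # Step 3: the next character must be "=".
    if pos >= len(s) or s[pos] != "=":
        return None
    pos += 1

    # Step 4: skip whitespace after the equals sign.
    while pos < len(s) and s[pos] in _SPACE:
        pos += 1

    # Step 5: quoted value, unmatched quote / end of input, or bare value.
    if pos >= len(s):
        return None
    ch = s[pos]
    if ch == '"' or ch == "'":
        end = s.find(ch, pos + 1)
        if end == -1:
            return None  # unmatched quote
        return s[pos + 1:end]
    end = pos
    while end < len(s) and s[end] not in _SPACE and s[end] != ";":
        end += 1
    return s[pos:end]
```

Note that, as in the spec text, only the first occurrence of "charset" is examined: if it is not followed by an equals sign, the sketch returns nothing instead of searching further.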