- From: Ian Hickson via cvs-syncmail <cvsmail@w3.org>
- Date: Mon, 12 Apr 2010 06:43:54 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/spec
In directory hutz:/tmp/cvs-serv17485

Modified Files:
    Overview.html
Log Message:
Change how character encodings are sniffed to require an http-equiv attribute, and to only process one character encoding per <meta> element, even if attributes are duplicated. (whatwg r4993)

Index: Overview.html
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.4010
retrieving revision 1.4011
diff -u -d -r1.4010 -r1.4011
--- Overview.html 12 Apr 2010 05:48:38 -0000 1.4010
+++ Overview.html 12 Apr 2010 06:43:50 -0000 1.4011
@@ -52520,36 +52520,71 @@
  0x2F byte (the one in sequence of characters matched above).</li>

- <li><p><a href="#concept-get-attributes-when-sniffing" title="concept-get-attributes-when-sniffing">Get
- an attribute</a> and its value. If no attribute was
- sniffed, then skip this inner set of steps, and jump to the
- second step in the overall "two step" algorithm.</li>
+ <li><p>Let <var title="">attribute list</var> be an empty
+ list of strings.</li> <!-- so long as we only care about
+ http-equiv, content, and charset, this can be a 3-bit
+ bitfield -->

- <li><p>If the attribute's name is neither "<code title="">charset</code>" nor "<code title="">content</code>",
- then return to step 2 in these inner steps.</li>
+ <li><p>Let <var title="">got pragma</var> be false.</li>

- <li><p>If the attribute's name is "<code title="">charset</code>", let <var title="">charset</var> be
- the attribute's value, interpreted as a character
- encoding.</li>
+ <li><p>Let <var title="">mode</var> be null.</li>

- <li><p>Otherwise, the attribute's name is "<code title="">content</code>": apply the <a href="#algorithm-for-extracting-an-encoding-from-a-content-type">algorithm for
- extracting an encoding from a Content-Type</a>, giving the
- attribute's value as the string to parse. If an encoding is
- returned, let <var title="">charset</var> be that
- encoding. Otherwise, return to step 2 in these inner
- steps.</li>
+ <li><p>Let <var title="">charset</var> be the null value
+ (which, for the purposes of this algorithm, is distinct from
+ an unrecognised encoding or the empty string).</li>
+
+ <li><p><i>Attributes</i>: <a href="#concept-get-attributes-when-sniffing" title="concept-get-attributes-when-sniffing">Get an
+ attribute</a> and its value. If no attribute was sniffed,
+ then jump to the <i>processing</i> step below.</li>
+
+ <li><p>If the attribute's name is already in <var title="">attribute list</var>, then return to the step
+ labeled <i>attributes</i>.</p>
+
+ <li>
+
+ <p>Run the appropriate step from the following list, if one
+ applies:</p>
+
+ <dl class="switch"><dt>If the attribute's name is "<code title="">http-equiv</code>"</dt>
+
+ <dd><p>If the attribute's value is "<code title="">content-type</code>", then set <var title="">got
+ pragma</var> to true.</dd>
+
+ <dt>If the attribute's name is "<code title="">charset</code>"</dt>
+
+ <dd><p>If <var title="">charset</var> is still set to null,
+ let <var title="">charset</var> be the encoding
+ corresponding to the attribute's value, and set <var title="">mode</var> to "charset".</dd>
+
+ <dt>If the attribute's name is "<code title="">content</code>"</dt>
+
+ <dd><p>Apply the <a href="#algorithm-for-extracting-an-encoding-from-a-content-type">algorithm for extracting an encoding
+ from a Content-Type</a>, giving the attribute's value as
+ the string to parse. If an encoding is returned, and if
+ <var title="">charset</var> is still set to null, let <var title="">charset</var> be the encoding returned, and set
+ <var title="">mode</var> to "pragma".</dd>
+
+ </dl></li>
+
+ <li><p>Return to the step labeled <i>attributes</i>.</li>
+
+ <li><p><i>Processing</i>: If <var title="">mode</var> is
+ null, then jump to the second step of the overall "two step"
+ algorithm.</li>
+
+ <li><p>If <var title="">mode</var> is "pragma" but <var title="">got pragma</var> is false, then jump to the second
+ step of the overall "two step" algorithm.</li>

  <li><p>If <var title="">charset</var> is a UTF-16 encoding,
  change the value of <var title="">charset</var> to UTF-8.</li>

- <li><p>If <var title="">charset</var> is a supported
- character encoding, then return the given encoding, with
- <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a>
- <i>tentative</i>, and abort all these steps.</li>
+ <li><p>If <var title="">charset</var> is not a supported
+ character encoding, then jump to the second step of the
+ overall "two step" algorithm.</li>

- <li><p>Otherwise, return to step 2 in these inner
- steps.</li>
+ <li><p>Return the encoding given by <var title="">charset</var>, with <a href="#concept-encoding-confidence" title="concept-encoding-confidence">confidence</a>
+ <i>tentative</i>, and abort all these steps.</li>

  </ol></dd>
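For readers who want to trace the new steps, the following is a minimal Python sketch of the revised per-<meta> logic, not the spec's reference implementation. Several things in it are assumptions rather than part of the commit: the function names `sniff_meta_charset` and `extract_encoding_from_content_type` are hypothetical; attributes are assumed to arrive as (name, value) pairs already lowercased by the "get an attribute" step; the Content-Type extraction is a simplified regex stand-in for the spec's algorithm; Python's `codecs.lookup` stands in for "is a supported character encoding"; and the sketch records each attribute name in the attribute list, which the hunk above does not spell out. Returning `None` corresponds to jumping to the second step of the overall "two step" algorithm.

```python
import codecs
import re


def extract_encoding_from_content_type(value):
    """Simplified stand-in (assumption) for the spec's algorithm for
    extracting an encoding from a Content-Type string."""
    match = re.search(r'charset\s*=\s*["\']?([^"\'\s;]+)', value)
    return match.group(1) if match else None


def sniff_meta_charset(attributes):
    """Process one <meta> element's attributes.

    `attributes` is a list of (name, value) pairs, assumed already
    lowercased. Returns (encoding, "tentative"), or None when the
    caller should continue with the next step of the outer algorithm.
    """
    attribute_list = []   # names already seen on this element
    got_pragma = False
    mode = None
    charset = None        # None is distinct from "" or an unknown label

    for name, value in attributes:
        if name in attribute_list:
            continue      # duplicated attribute: ignore it
        attribute_list.append(name)  # assumption: not stated in the hunk

        if name == "http-equiv":
            if value == "content-type":
                got_pragma = True
        elif name == "charset":
            if charset is None:
                charset = value
                mode = "charset"
        elif name == "content":
            encoding = extract_encoding_from_content_type(value)
            if encoding is not None and charset is None:
                charset = encoding
                mode = "pragma"

    # Processing
    if mode is None:
        return None
    if mode == "pragma" and not got_pragma:
        return None

    try:
        canonical = codecs.lookup(charset).name
    except LookupError:
        return None       # not a supported encoding (by this stand-in)

    # UTF-16 labels are treated as UTF-8; every UTF-16 variant is known
    # to codecs.lookup, so doing this after the lookup is equivalent.
    if canonical.startswith("utf-16"):
        canonical = "utf-8"
    return (canonical, "tentative")
```

Under these assumptions, a bare `content="text/html; charset=utf-8"` with no `http-equiv="content-type"` yields no encoding, while a `charset` attribute on its own still does, and only the first charset-bearing attribute on an element is considered.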
Received on Monday, 12 April 2010 06:43:58 UTC