- From: poot <cvsmail@w3.org>
- Date: Wed, 29 Sep 2010 10:05:24 +0900 (JST)
- To: public-html-diffs@w3.org
hixie: Match Gecko for character encoding processing for <script> (whatwg r5545)

http://dev.w3.org/cvsweb/html5/spec/Overview.html?r1=1.4431&r2=1.4432&f=h
http://html5.org/tools/web-apps-tracker?from=5544&to=5545

===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.4431
retrieving revision 1.4432
diff -u -d -r1.4431 -r1.4432
--- Overview.html 29 Sep 2010 00:07:54 -0000 1.4431
+++ Overview.html 29 Sep 2010 01:05:06 -0000 1.4432
@@ -12390,10 +12390,12 @@
  <code><a href="#document">Document</a></code> objects can also have this flag
  set; it's propagated to the <code><a href="#document">Document</a></code> when the script runs.</p>

- <p>The fifth and sixth pieces of state are <dfn id="the-script-block-s-type"><var>the script
- block's type</var></dfn> and <dfn id="the-script-block-s-character-encoding"><var>the script block's character
- encoding</var></dfn>. They are determined when the script is run,
- based on the attributes on the element at that time.</p>
+ <p>The last few pieces of state are <dfn id="the-script-block-s-type"><var>the script block's
+ type</var></dfn>, <dfn id="the-script-block-s-character-encoding"><var>the script block's character
+ encoding</var></dfn>, and <dfn id="the-script-block-s-fallback-character-encoding"><var>the script block's fallback
+ character encoding</var></dfn>. They are determined when the script
+ is run, based on the attributes on the element at that time, and the
+ <code><a href="#document">Document</a></code> of the <code><a href="#script">script</a></code> element.</p>

  <p>When a <code><a href="#script">script</a></code> element that is not marked as
  being <a href="#parser-inserted">"parser-inserted"</a> experiences one of the events listed
@@ -12551,9 +12553,12 @@
  <var><a href="#the-script-block-s-character-encoding">the script block's character encoding</a></var> for this
  <code><a href="#script">script</a></code> element be the encoding given by the <code title="attr-script-charset"><a href="#attr-script-charset">charset</a></code> attribute.</p>

- <p>Otherwise, let <var><a href="#the-script-block-s-character-encoding">the script block's character encoding</a></var>
- for this <code><a href="#script">script</a></code> element be the same as <a href="#document-s-character-encoding" title="document's character encoding">the encoding of the document
- itself</a>.</p>
+ <p>Otherwise, let <var><a href="#the-script-block-s-fallback-character-encoding">the script block's fallback character
+ encoding</a></var> for this <code><a href="#script">script</a></code> element be the same as
+ <a href="#document-s-character-encoding" title="document's character encoding">the encoding of the
+ document itself</a>.</p>
+
+ <p class="note">Only one of these two pieces of state is set.</p>

  </li>

@@ -12580,13 +12585,6 @@
  user agent must act as if it had received an empty HTTP 400
  response.</p>

- <p>Once the resource's <a href="#content-type" title="Content-Type">Content Type
- metadata</a> is available, if it ever is, apply the
- <a href="#algorithm-for-extracting-an-encoding-from-a-content-type">algorithm for extracting an encoding from a
- Content-Type</a> to it. If this returns an encoding, and the
- user agent supports that encoding, then let <var><a href="#the-script-block-s-character-encoding">the script
- block's character encoding</a></var> be that encoding.</p>
-
  <p>For performance reasons, user agents may start fetching the
  script as soon as the attribute is set, instead, in the hope that
  the element will be inserted into the document. Either way, once
@@ -12733,43 +12731,63 @@
  <p>The contents of that file, interpreted as string of Unicode
  characters, are the script source.</p>

- <p>For each of the rows in the following table, starting with
- the first one and going down, if the file has as many or more
- bytes available than the number of bytes in the first column,
- and the first bytes of the file match the bytes given in the
- first column, then set <var><a href="#the-script-block-s-character-encoding">the script block's character
- encoding</a></var> to the encoding given in the cell in the second
- column of that row, irrespective of any previous value:</p>
+ <p>To obtain the string of Unicode characters, the user agent
+ run the following steps:</p>

- <!-- this table is present in several forms in this file; keep them in sync -->
- <table id="table-script-bom"><thead><tr><th>Bytes in Hexadecimal
- <th>Encoding
- <tbody><!-- nobody uses this
- <tr>
- <td>00 00 FE FF
- <td>UTF-32BE
- <tr>
- <td>FF FE 00 00
- <td>UTF-32LE
---><tr><td>FE FF
- <td>Big-endian UTF-16
- <tr><td>FF FE
- <td>Little-endian UTF-16
- <tr><td>EF BB BF
- <td>UTF-8
-<!-- nobody uses this
- <tr>
- <td>DD 73 66 73
- <td>UTF-EBCDIC
--->
- </table><p class="note">This step looks for Unicode Byte Order Marks
- (BOMs).</p>
+ <ol><li><p>If the resource's <a href="#content-type" title="Content-Type">Content
+ Type metadata</a>, if any, specifies a character encoding,
+ and the user agent supports that encoding, then let <var title="">character encoding</var> be that encoding, and jump
+ to the bottom step in this series of steps.</li>

- <p>The file must then be converted to Unicode using the
- character encoding given by <var><a href="#the-script-block-s-character-encoding">the script block's character
- encoding</a></var>.</p>
+ <li><p>If the algorithm above set <var><a href="#the-script-block-s-character-encoding">the script block's
+ character encoding</a></var>, then let <var title="">character
+ encoding</var> be that encoding, and jump to the bottom step
+ in this series of steps.</li>

- </dd>
+ <li><p>For each of the rows in the following table, starting
+ with the first one and going down, if the file has as many or
+ more bytes available than the number of bytes in the first
+ column, and the first bytes of the file match the bytes given
+ in the first column, then set <var title="">character
+ encoding</var> to the encoding given in the cell in the
+ second column of that row, and jump to the bottom step in
+ this series of steps:</p>
+
+ <!-- this table is present in several forms in this file; keep them in sync -->
+ <table id="table-script-bom"><thead><tr><th>Bytes in Hexadecimal
+ <th>Encoding
+ <tbody><!-- nobody uses this
+ <tr>
+ <td>00 00 FE FF
+ <td>UTF-32BE
+ <tr>
+ <td>FF FE 00 00
+ <td>UTF-32LE
+ --><tr><td>FE FF
+ <td>Big-endian UTF-16
+ <tr><td>FF FE
+ <td>Little-endian UTF-16
+ <tr><td>EF BB BF
+ <td>UTF-8
+ <!-- nobody uses this
+ <tr>
+ <td>DD 73 66 73
+ <td>UTF-EBCDIC
+ -->
+ </table><p class="note">This step looks for Unicode Byte Order Marks
+ (BOMs).</p>
+
+ </li>
+
+ <li><p>Let <var title="">character encoding</var> be <var><a href="#the-script-block-s-fallback-character-encoding">the
+ script block's fallback character encoding</a></var>.</li>
+
+ <li><p>Convert the file to Unicode using <var>character
+ encoding</var>, following the rules for doing so given by the
+ specification for <var><a href="#the-script-block-s-type">the script block's
+ type</a></var>.</li>
+
+ </ol></dd>

  <dt>If the script is from an external file and <var><a href="#the-script-block-s-type">the script
  block's type</a></var> is an XML-based language</dt>
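
Editorial illustration, not part of the commit: the encoding selection order introduced by the last hunk can be sketched roughly as follows in Python. The function and parameter names (pick_script_encoding, content_type_charset, block_encoding, fallback_encoding) are hypothetical, and "the user agent supports that encoding" is approximated here by asking Python's codec registry, which is only an assumption about what a real user agent would accept.

```python
import codecs
from typing import Optional

# Rows of the BOM table from the diff, in the order they are checked.
# The Python codec names on the right are this sketch's stand-ins for
# "Big-endian UTF-16", "Little-endian UTF-16" and "UTF-8".
BOM_TABLE = [
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\xef\xbb\xbf", "utf-8"),
]


def is_supported(label: Optional[str]) -> bool:
    """Assumption: 'supported' means Python's codec registry knows the label."""
    if not label:
        return False
    try:
        codecs.lookup(label)
        return True
    except LookupError:
        return False


def pick_script_encoding(
    data: bytes,
    content_type_charset: Optional[str],
    block_encoding: Optional[str],
    fallback_encoding: Optional[str],
) -> Optional[str]:
    """Return the encoding chosen by steps 1 to 4 of the new algorithm.

    block_encoding models "the script block's character encoding" (set from
    the charset attribute); fallback_encoding models "the script block's
    fallback character encoding" (the document's encoding). Per the note in
    the diff, only one of the two is expected to be set.
    """
    # Step 1: a supported encoding from the Content-Type metadata wins.
    if is_supported(content_type_charset):
        return content_type_charset
    # Step 2: otherwise use the script block's character encoding, if set.
    if block_encoding is not None:
        return block_encoding
    # Step 3: otherwise look for a byte order mark at the start of the file.
    for bom, encoding in BOM_TABLE:
        if data.startswith(bom):
            return encoding
    # Step 4: otherwise use the fallback character encoding.
    return fallback_encoding
```

In this sketch a BOM is consulted only when neither the Content-Type metadata nor the charset attribute supplied an encoding, whereas the removed text let a BOM override the encoding "irrespective of any previous value". The final conversion step is left out because the diff defers it to the rules of the script block's type.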
Received on Wednesday, 29 September 2010 01:06:22 UTC