- From: Ian Hickson via cvs-syncmail <cvsmail@w3.org>
- Date: Tue, 12 Aug 2008 10:02:11 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/spec
In directory hutz:/tmp/cvs-serv31833

Modified Files:
    Overview.html
Log Message:
Define the Content-Language pragma, since apparently ~1% of sites use it in some way or another. (whatwg r2057)

Index: Overview.html
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.1235
retrieving revision 1.1236
diff -u -d -r1.1235 -r1.1236
--- Overview.html 12 Aug 2008 09:32:34 -0000 1.1235
+++ Overview.html 12 Aug 2008 10:02:08 -0000 1.1236
@@ -7974,15 +7974,15 @@ <!-- technically this is redundant with the XML spec --> + <hr> + <p>To determine the language of a node, user agents must look at the nearest ancestor element (including the element itself if the node is an element) that has an <code title=attr-xml-lang><a href="#xmllang">xml:lang</a></code> attribute set or is an <a href="#html-elements" title="HTML elements">HTML element</a> and has a <code title=attr-lang><a href="#lang">lang</a></code> attribute set. That - attribute specifies the language of the node. If that attribute's value is - not a recognised language code, then it must be treated as an unknown - language (as if the value was the empty string). + attribute specifies the language of the node. <p>If both the <code title=attr-xml-lang><a href="#xmllang">xml:lang</a></code> attribute and the <code @@ -7994,11 +7994,20 @@ the element's language. <p>If no explicit language is given for the <a href="#root-element">root - element</a>, then language information from a higher-level protocol (such + element</a>, but there is a <a href="#document-wide">document-wide default + language</a> set, then that is the language of the node. + + <p>If there is no <a href="#document-wide">document-wide default + language</a>, then language information from a higher-level protocol (such as HTTP), if any, must be used as the final fallback language. 
In the absence of any language information, the default value is unknown (the empty string). + <p>If the resulting value is not a recognised language code, then it must + be treated as an unknown language (as if the value was the empty string). + + <hr> + <p>User agents may use the element's language to determine proper processing or rendering (e.g. in the selection of appropriate fonts or pronunciations, or for dictionary selection). <!--User @@ -8881,7 +8890,7 @@ tokeniser had emitted a start tag token with the tag name "pre", then set the <a href="#html-0">HTML parser</a>'s <a href="#tokenization0">tokenization</a> stage's <a - href="#content3">content model flag</a> to <em>PLAINTEXT</em>. + href="#content4">content model flag</a> to <em>PLAINTEXT</em>. <li> <p>If <var title="">replace</var> is false, then: @@ -10208,7 +10217,8 @@ keywords defined for this attribute. The states given in the first cell of the rows with keywords give the states to which those keywords map.<!-- Some of the keywords are non-conforming, as - noted in the last column.--> + noted in the last column.--></p> + <!-- things that are neither conforming nor do anything are commented out --> <table> <thead> @@ -10217,12 +10227,13 @@ <th>Keywords <!-- <th>Notes--> - <tbody><!-- things that are neither conforming nor do anything are commented out + <tbody> <tr> - <td><span title="attr-meta-http-equiv-content-language">Content-Language</span> + <td><a href="#content3" + title=attr-meta-http-equiv-content-language>Content Language</a> + <td><code title="">Content-Language</code> - <td>Non-conforming [ XXX but maybe we should make this an alternative to <html lang="">? 
] ---> + <!-- <td>Non-conforming --> <tr> <td><a href="#encoding" title=attr-meta-http-equiv-content-type>Encoding @@ -10299,6 +10310,62 @@ algorithm appropriate for that state, as described in the following list: <dl> + <dt><dfn id=content3 title=attr-meta-http-equiv-content-language>Content + language</dfn> + + <dd> + <p>This pragma sets the <dfn id=document-wide>document-wide default + language</dfn>. Until the pragma is successfully processed, there is no + <a href="#document-wide">document-wide default language</a>.</p> + + <ol> + <li> + <p>If another <code><a href="#meta0">meta</a></code> element in the <a + href="#content3" title=attr-meta-http-equiv-content-language>Content + Language state</a> has already been successfully processed (i.e. when + it was inserted the user agent processed it and reached the last step + of this list of steps), then abort these steps. + + <li> + <p>If the <code><a href="#meta0">meta</a></code> element has no <code + title=attr-meta-content><a href="#content1">content</a></code> + attribute, or if that attribute's value is the empty string, then + abort these steps. + + <li> + <p>Let <var title="">input</var> be the value of the element's <code + title=attr-meta-content><a href="#content1">content</a></code> + attribute. + + <li> + <p>Let <var title="">position</var> point at the first character of + <var title="">input</var>. + + <li> + <p><a href="#skip-whitespace">Skip whitespace</a>. + + <li> + <p><a href="#collect" title="collect a sequence of characters">Collect + a sequence of characters</a> that are neither <a href="#space" + title="space character">space characters</a> nor a U+002C COMMA + character (","). + + <li> + <p>Let the <a href="#document-wide">document-wide default language</a> + be the string that resulted from the previous step. 
+ </ol> + + <p>For <code><a href="#meta0">meta</a></code> elements in the <a + href="#content3" title=attr-meta-http-equiv-content-language>Content + Language state</a>, the <code title=attr-meta-content><a + href="#content1">content</a></code> attribute must have a value + consisting of a valid RFC 3066 language code. <a + href="#references">[RFC3066]</a></p> + + <p class=note>This pragma not exactly equivalent to the HTTP + <code>Content-Language</code> header, for instance it only supports one + language. <a href="#references">[RFC2616]</a></p> + <dt><dfn id=encoding title=attr-meta-http-equiv-content-type>Encoding declaration state</dfn> @@ -36440,7 +36507,7 @@ title="HTML documents">HTML document</a>, create an <a href="#html-0">HTML parser</a>, associate it with the document, act as if the tokeniser had emitted a start tag token with the tag name "pre", set the <a - href="#tokenization0">tokenization</a> stage's <a href="#content3">content + href="#tokenization0">tokenization</a> stage's <a href="#content4">content model flag</a> to <i>PLAINTEXT</i>, and begin to pass the stream of characters in the plain text document to that tokeniser. @@ -46632,7 +46699,7 @@ to another state. <p>The exact behavior of certain states depends on a <dfn - id=content3>content model flag</dfn> that is set after certain tokens are + id=content4>content model flag</dfn> that is set after certain tokens are emitted. The flag has several states: <i title="">PCDATA</i>, <i title="">RCDATA</i>, <i title="">CDATA</i>, and <i title="">PLAINTEXT</i>. Initially it must be in the PCDATA state. In the RCDATA and CDATA states, @@ -46656,7 +46723,7 @@ <p>When a token is emitted, it must immediately be handled by the <a href="#tree-construction0">tree construction</a> stage. 
The tree - construction stage can affect the state of the <a href="#content3">content + construction stage can affect the state of the <a href="#content4">content model flag</a>, and can insert additional characters into the stream. (For example, the <code><a href="#script1">script</a></code> element can result in scripts executing and using the <a href="#dynamic3">dynamic markup @@ -46667,7 +46734,7 @@ flag">acknowledged</dfn> when it is processed by the tree construction stage, that is a <a href="#parse2">parse error</a>. - <p>When an end tag token is emitted, the <a href="#content3">content model + <p>When an end tag token is emitted, the <a href="#content4">content model flag</a> must be switched to the PCDATA state. <p>When an end tag token is emitted with attributes, that is a <a @@ -46698,7 +46765,7 @@ <dl class=switch> <dt>U+0026 AMPERSAND (&) - <dd>When the <a href="#content3">content model flag</a> is set to one of + <dd>When the <a href="#content4">content model flag</a> is set to one of the PCDATA or RCDATA states and the <a href="#escape">escape flag</a> is false: switch to the <a href="#character6">character reference data state</a>. @@ -46708,7 +46775,7 @@ <dt>U+002D HYPHEN-MINUS (-) <dd> - <p>If the <a href="#content3">content model flag</a> is set to either the + <p>If the <a href="#content4">content model flag</a> is set to either the RCDATA state or the CDATA state, and the <a href="#escape">escape flag</a> is false, and there are at least three characters before this one in the input stream, and the last four characters in the input @@ -46721,10 +46788,10 @@ <dt>U+003C LESS-THAN SIGN (<) - <dd>When the <a href="#content3">content model flag</a> is set to the + <dd>When the <a href="#content4">content model flag</a> is set to the PCDATA state: switch to the <a href="#tag-open0">tag open state</a>. 
- <dd>When the <a href="#content3">content model flag</a> is set to either + <dd>When the <a href="#content4">content model flag</a> is set to either the RCDATA state or the CDATA state and the <a href="#escape">escape flag</a> is false: switch to the <a href="#tag-open0">tag open state</a>. @@ -46733,7 +46800,7 @@ <dt>U+003E GREATER-THAN SIGN (>) <dd> - <p>If the <a href="#content3">content model flag</a> is set to either the + <p>If the <a href="#content4">content model flag</a> is set to either the RCDATA state or the CDATA state, and the <a href="#escape">escape flag</a> is true, and the last three characters in the input stream including this one are U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E @@ -46760,7 +46827,7 @@ <h5 id=character1><span class=secno>8.2.4.2. </span><dfn id=character6>Character reference data state</dfn></h5> - <p><em>(This cannot happen if the <a href="#content3">content model + <p><em>(This cannot happen if the <a href="#content4">content model flag</a> is set to the CDATA state.)</em> <p>Attempt to <a href="#consume">consume a character reference</a>, with no @@ -46775,11 +46842,11 @@ <h5 id=tag-open><span class=secno>8.2.4.3. </span><dfn id=tag-open0>Tag open state</dfn></h5> - <p>The behavior of this state depends on the <a href="#content3">content + <p>The behavior of this state depends on the <a href="#content4">content model flag</a>. <dl> - <dt>If the <a href="#content3">content model flag</a> is set to the RCDATA + <dt>If the <a href="#content4">content model flag</a> is set to the RCDATA or CDATA states <dd> @@ -46789,7 +46856,7 @@ and reconsume the current input character in the <a href="#data-state0">data state</a>.</p> - <dt>If the <a href="#content3">content model flag</a> is set to the PCDATA + <dt>If the <a href="#content4">content model flag</a> is set to the PCDATA state <dd> @@ -46842,10 +46909,10 @@ <h5 id=close><span class=secno>8.2.4.4. 
</span><dfn id=close4>Close tag open state</dfn></h5> - <p>If the <a href="#content3">content model flag</a> is set to the RCDATA + <p>If the <a href="#content4">content model flag</a> is set to the RCDATA or CDATA states but no start tag token has ever been emitted by this instance of the tokeniser (<a href="#fragment">fragment case</a>), or, if - the <a href="#content3">content model flag</a> is set to the RCDATA or + the <a href="#content4">content model flag</a> is set to the RCDATA or CDATA states and the next few characters do not match the tag name of the last start tag token emitted (compared in an <span>ASCII case insensitive</span> manner), or if they do but they are not immediately @@ -46872,7 +46939,7 @@ character token, and switch to the <a href="#data-state0">data state</a> to process the <a href="#next-input">next input character</a>. - <p>Otherwise, if the <a href="#content3">content model flag</a> is set to + <p>Otherwise, if the <a href="#content4">content model flag</a> is set to the PCDATA state, or if the next few characters <em>do</em> match that tag name, consume the <a href="#next-input">next input character</a>: @@ -47354,7 +47421,7 @@ <h5 id=bogus><span class=secno>8.2.4.16. </span><dfn id=bogus1>Bogus comment state</dfn></h5> - <p><em>(This can only happen if the <a href="#content3">content model + <p><em>(This can only happen if the <a href="#content4">content model flag</a> is set to the PCDATA state.)</em> <p>Consume every character up to and including the first U+003E @@ -47373,7 +47440,7 @@ <h5 id=markup><span class=secno>8.2.4.17. 
</span><dfn id=markup0>Markup declaration open state</dfn></h5> - <p><em>(This can only happen if the <a href="#content3">content model + <p><em>(This can only happen if the <a href="#content4">content model flag</a> is set to the PCDATA state.)</em> <p>If the next two characters are both U+002D HYPHEN-MINUS (-) characters, @@ -47393,7 +47460,7 @@ (the five uppercase letters "CDATA" with a U+005B LEFT SQUARE BRACKET character before and after), then consume those characters and switch to the <a href="#cdata2">CDATA section state</a> (which is unrelated to the - <a href="#content3">content model flag</a>'s CDATA state). + <a href="#content4">content model flag</a>'s CDATA state). <p>Otherwise, this is a <a href="#parse2">parse error</a>. Switch to the <a href="#bogus1">bogus comment state</a>. The next character that is @@ -48003,9 +48070,9 @@ <h5 id=cdata0><span class=secno>8.2.4.36. </span><dfn id=cdata2>CDATA section state</dfn></h5> - <p><em>(This can only happen if the <a href="#content3">content model + <p><em>(This can only happen if the <a href="#content4">content model flag</a> is set to the PCDATA state, and is unrelated to the <a - href="#content3">content model flag</a>'s CDATA state.)</em> + href="#content4">content model flag</a>'s CDATA state.)</em> <p>Consume every character up to the next occurrence of the three character sequence U+005D RIGHT SQUARE BRACKET U+005D RIGHT SQUARE BRACKET U+003E @@ -48718,10 +48785,10 @@ <li> <p>If the algorithm that was invoked is the <a href="#generic">generic CDATA element parsing algorithm</a>, switch the tokeniser's <a - href="#content3">content model flag</a> to the CDATA state; otherwise + href="#content4">content model flag</a> to the CDATA state; otherwise the algorithm invoked was the <a href="#generic0">generic RCDATA element parsing algorithm</a>, switch the tokeniser's <a - href="#content3">content model flag</a> to the RCDATA state. + href="#content4">content model flag</a> to the RCDATA state. 
<li> <p>Then, collect all the character tokens that the tokeniser returns @@ -48734,7 +48801,7 @@ all those tokens' characters, to the new element node. <li> - <p>The tokeniser's <a href="#content3">content model flag</a> will have + <p>The tokeniser's <a href="#content4">content model flag</a> will have switched back to the PCDATA state. <li> @@ -49366,7 +49433,7 @@ script will execute in-line, instead of blowing the document away, as would happen in most other cases.</p> - <p>Switch the tokeniser's <a href="#content3">content model flag</a> to + <p>Switch the tokeniser's <a href="#content4">content model flag</a> to the CDATA state.</p> <p>Then, collect all the character tokens that the tokeniser returns @@ -49378,7 +49445,7 @@ href="#script1">script</a></code> element node whose contents is the concatenation of all those tokens' characters.</p> - <p>The tokeniser's <a href="#content3">content model flag</a> will have + <p>The tokeniser's <a href="#content4">content model flag</a> will have switched back to the PCDATA state.</p> <p>If the next token is not an end tag token with the tag name "script", @@ -49949,13 +50016,13 @@ <p><a href="#insert0">Insert an HTML element</a> for the token.</p> - <p>Switch the <a href="#content3">content model flag</a> to the PLAINTEXT + <p>Switch the <a href="#content4">content model flag</a> to the PLAINTEXT state.</p> <p class=note>Once a start tag with the tag name "plaintext" has been seen, that will be the last token ever seen other than character tokens (and the end-of-file token), because there is no way to switch the <a - href="#content3">content model flag</a> out of the PLAINTEXT state.</p> + href="#content4">content model flag</a> out of the PLAINTEXT state.</p> </dd> <!-- end tags for non-phrasing flow content elements --> <!-- the normal ones --> @@ -50584,7 +50651,7 @@ <code>form</code> element pointed to by the <a href="#form-element"><code title="">form</code> element pointer</a>.</p> - <p>Switch the tokeniser's <a 
href="#content3">content model flag</a> to + <p>Switch the tokeniser's <a href="#content4">content model flag</a> to the RCDATA state.</p> <p>If the next token is a U+000A LINE FEED (LF) character token, then @@ -50599,7 +50666,7 @@ single <code>Text</code> node, whose contents is the concatenation of all those tokens' characters, to the new element node.</p> - <p>The tokeniser's <a href="#content3">content model flag</a> will have + <p>The tokeniser's <a href="#content4">content model flag</a> will have switched back to the PCDATA state.</p> <p>If the next token is an end tag token with the tag name "textarea", @@ -52512,14 +52579,14 @@ <li> <p>Set the <a href="#html-0">HTML parser</a>'s <a href="#tokenization0">tokenization</a> stage's <a - href="#content3">content model flag</a> according to the <var + href="#content4">content model flag</a> according to the <var title="">context</var> element, as follows:</p> <dl class=switch> <dt>If it is a <code><a href="#title1">title</a></code> or <code>textarea</code> element - <dd>Set the <a href="#content3">content model flag</a> to the RCDATA + <dd>Set the <a href="#content4">content model flag</a> to the RCDATA state. <dt>If it is a <code><a href="#style1">style</a></code>, <code><a @@ -52527,23 +52594,23 @@ href="#iframe">iframe</a></code>, <code>noembed</code>, or <code>noframes</code> element - <dd>Set the <a href="#content3">content model flag</a> to the CDATA + <dd>Set the <a href="#content4">content model flag</a> to the CDATA state. <dt>If it is a <code><a href="#noscript">noscript</a></code> element <dd>If the <a href="#scripting3">scripting flag</a> is enabled, set the - <a href="#content3">content model flag</a> to the CDATA state. - Otherwise, set the <a href="#content3">content model flag</a> to the + <a href="#content4">content model flag</a> to the CDATA state. + Otherwise, set the <a href="#content4">content model flag</a> to the PCDATA state. 
<dt>If it is a <code>plaintext</code> element - <dd>Set the <a href="#content3">content model flag</a> to PLAINTEXT. + <dd>Set the <a href="#content4">content model flag</a> to PLAINTEXT. <dt>Otherwise - <dd>Set the <a href="#content3">content model flag</a> to the PCDATA + <dd>Set the <a href="#content4">content model flag</a> to the PCDATA state. </dl>
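
Illustration (not part of the commit): the Content language pragma steps added in the diff above, together with the new fallback order for the root element's language, can be sketched roughly as follows. This is a minimal sketch, not the spec's or any browser's implementation. The names parse_content_language, Document, process_pragma and fallback_language are hypothetical, and the set of "space characters" is an assumption (space, tab, LF, FF, CR). The final "recognised language code" check is left out.

```python
from typing import Optional

# Assumption: the spec's "space characters" are taken to be these five.
SPACE_CHARACTERS = " \t\n\f\r"


def parse_content_language(content: Optional[str]) -> Optional[str]:
    """Run the pragma's parsing steps on a content attribute value.

    Returns None when the steps abort (attribute missing or empty),
    otherwise the collected token, which may itself be empty.
    """
    if not content:
        return None
    position = 0
    # "Skip whitespace": advance past any leading space characters.
    while position < len(content) and content[position] in SPACE_CHARACTERS:
        position += 1
    start = position
    # Collect characters that are neither space characters nor a comma.
    while (position < len(content)
           and content[position] not in SPACE_CHARACTERS
           and content[position] != ","):
        position += 1
    return content[start:position]


class Document:
    """Hypothetical holder for the document-wide default language."""

    def __init__(self) -> None:
        self.default_language: Optional[str] = None  # none until a pragma succeeds
        self.pragma_processed = False

    def process_pragma(self, content: Optional[str]) -> None:
        # Only the first successfully processed pragma takes effect.
        if self.pragma_processed:
            return
        token = parse_content_language(content)
        if token is None:
            return
        self.default_language = token
        self.pragma_processed = True


def fallback_language(explicit: Optional[str],
                      document_default: Optional[str],
                      protocol_language: Optional[str]) -> str:
    """Explicit lang, then document-wide default, then e.g. HTTP, else unknown."""
    for candidate in (explicit, document_default, protocol_language):
        if candidate is not None:
            return candidate
    return ""  # unknown language
```

Under these assumptions a content value of "  en-GB, fr" yields "en-GB", which matches the note in the diff that the pragma supports only one language, unlike the HTTP Content-Language header.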
Received on Tuesday, 12 August 2008 10:02:47 UTC