html5/spec Overview.html,1.1235,1.1236

Update of /sources/public/html5/spec
In directory hutz:/tmp/cvs-serv31833

Modified Files:
	Overview.html 
Log Message:
Define the Content-Language pragma, since apparently ~1% of sites use it in some way or another. (whatwg r2057)
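
For readers skimming the diff, the pragma steps and the language fallback order that this revision adds can be sketched roughly as below. This is only an illustration, not text from the spec. The function names, the modelling of meta elements as a list of content values in processing order, and the exact set of "space characters" (which this diff does not define) are all assumptions.

```python
from typing import List, Optional

# Assumption: the "space characters" set is not defined in this diff.
# Space, tab, LF, FF and CR are used here for illustration only.
SPACE_CHARACTERS = " \t\n\f\r"


def extract_pragma_language(content: Optional[str]) -> Optional[str]:
    """Hypothetical helper: run steps 2-7 of the pragma on one content value.

    Returns None when the steps abort, otherwise the collected string.
    """
    # Step 2: a missing content attribute or an empty value aborts the steps.
    if not content:
        return None
    # Steps 3-4: the input is the attribute value; start at its first character.
    position = 0
    # Step 5: skip whitespace.
    while position < len(content) and content[position] in SPACE_CHARACTERS:
        position += 1
    # Step 6: collect characters that are neither space characters nor a comma.
    start = position
    while position < len(content) and content[position] not in SPACE_CHARACTERS + ",":
        position += 1
    # Step 7: the collected string becomes the document-wide default language.
    return content[start:position]


def document_wide_default_language(meta_contents: List[Optional[str]]) -> Optional[str]:
    """Hypothetical helper modelling step 1: the first meta element that is
    successfully processed wins, and later ones are ignored."""
    for content in meta_contents:
        language = extract_pragma_language(content)
        if language is not None:
            return language
    return None  # no pragma was successfully processed


def resolve_language(explicit: Optional[str],
                     document_wide: Optional[str],
                     protocol: Optional[str]) -> str:
    """Hypothetical helper for the fallback order: an explicit lang/xml:lang
    value, then the document-wide default, then a higher-level protocol such
    as HTTP, and finally unknown (the empty string)."""
    for candidate in (explicit, document_wide, protocol):
        if candidate is not None:
            return candidate
    return ""


print(extract_pragma_language("en-GB, fr"))
print(resolve_language(None, document_wide_default_language(["", "fr,en", "de"]), "en"))
```

The sketch omits the final check that an unrecognised language code is treated as unknown, since deciding what counts as a recognised code is outside this diff.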

Index: Overview.html
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.1235
retrieving revision 1.1236
diff -u -d -r1.1235 -r1.1236
--- Overview.html	12 Aug 2008 09:32:34 -0000	1.1235
+++ Overview.html	12 Aug 2008 10:02:08 -0000	1.1236
@@ -7974,15 +7974,15 @@
   <!-- technically this is redundant
   with the XML spec -->
 
+  <hr>
+
   <p>To determine the language of a node, user agents must look at the
    nearest ancestor element (including the element itself if the node is an
    element) that has an <code title=attr-xml-lang><a
    href="#xmllang">xml:lang</a></code> attribute set or is an <a
    href="#html-elements" title="HTML elements">HTML element</a> and has a
    <code title=attr-lang><a href="#lang">lang</a></code> attribute set. That
-   attribute specifies the language of the node. If that attribute's value is
-   not a recognised language code, then it must be treated as an unknown
-   language (as if the value was the empty string).
+   attribute specifies the language of the node.
 
   <p>If both the <code title=attr-xml-lang><a
    href="#xmllang">xml:lang</a></code> attribute and the <code
@@ -7994,11 +7994,20 @@
    the element's language.
 
   <p>If no explicit language is given for the <a href="#root-element">root
-   element</a>, then language information from a higher-level protocol (such
+   element</a>, but there is a <a href="#document-wide">document-wide default
+   language</a> set, then that is the language of the node.
+
+  <p>If there is no <a href="#document-wide">document-wide default
+   language</a>, then language information from a higher-level protocol (such
    as HTTP), if any, must be used as the final fallback language. In the
    absence of any language information, the default value is unknown (the
    empty string).
 
+  <p>If the resulting value is not a recognised language code, then it must
+   be treated as an unknown language (as if the value was the empty string).
+
+  <hr>
+
   <p>User agents may use the element's language to determine proper
    processing or rendering (e.g. in the selection of appropriate fonts or
    pronunciations, or for dictionary selection). <!--User
@@ -8881,7 +8890,7 @@
      tokeniser had emitted a start tag token with the tag name "pre", then
      set the <a href="#html-0">HTML parser</a>'s <a
      href="#tokenization0">tokenization</a> stage's <a
-     href="#content3">content model flag</a> to <em>PLAINTEXT</em>.
+     href="#content4">content model flag</a> to <em>PLAINTEXT</em>.
 
    <li>
     <p>If <var title="">replace</var> is false, then:
@@ -10208,7 +10217,8 @@
    keywords defined for this attribute. The states given in the first cell of
    the rows with keywords give the states to which those keywords
    map.<!-- Some of the keywords are non-conforming, as
-  noted in the last column.-->
+  noted in the last column.--></p>
+  <!-- things that are neither conforming nor do anything are commented out -->
 
   <table>
    <thead>
@@ -10217,12 +10227,13 @@
 
      <th>Keywords <!--     <th>Notes-->
 
-   <tbody><!-- things that are neither conforming nor do anything are commented out
+   <tbody>
     <tr>
-     <td><span title="attr-meta-http-equiv-content-language">Content-Language</span>
+     <td><a href="#content3"
+      title=attr-meta-http-equiv-content-language>Content Language</a>
+
      <td><code title="">Content-Language</code>
-     <td>Non-conforming [ XXX but maybe we should make this an alternative to <html lang="">? ]
--->
+      <!--     <td>Non-conforming -->
 
     <tr>
      <td><a href="#encoding" title=attr-meta-http-equiv-content-type>Encoding
@@ -10299,6 +10310,62 @@
    algorithm appropriate for that state, as described in the following list:
 
   <dl>
+   <dt><dfn id=content3 title=attr-meta-http-equiv-content-language>Content
+    language</dfn>
+
+   <dd>
+    <p>This pragma sets the <dfn id=document-wide>document-wide default
+     language</dfn>. Until the pragma is successfully processed, there is no
+     <a href="#document-wide">document-wide default language</a>.</p>
+
+    <ol>
+     <li>
+      <p>If another <code><a href="#meta0">meta</a></code> element in the <a
+       href="#content3" title=attr-meta-http-equiv-content-language>Content
+       Language state</a> has already been successfully processed (i.e. when
+       it was inserted the user agent processed it and reached the last step
+       of this list of steps), then abort these steps.
+
+     <li>
+      <p>If the <code><a href="#meta0">meta</a></code> element has no <code
+       title=attr-meta-content><a href="#content1">content</a></code>
+       attribute, or if that attribute's value is the empty string, then
+       abort these steps.
+
+     <li>
+      <p>Let <var title="">input</var> be the value of the element's <code
+       title=attr-meta-content><a href="#content1">content</a></code>
+       attribute.
+
+     <li>
+      <p>Let <var title="">position</var> point at the first character of
+       <var title="">input</var>.
+
+     <li>
+      <p><a href="#skip-whitespace">Skip whitespace</a>.
+
+     <li>
+      <p><a href="#collect" title="collect a sequence of characters">Collect
+       a sequence of characters</a> that are neither <a href="#space"
+       title="space character">space characters</a> nor a U+002C COMMA
+       character (",").
+
+     <li>
+      <p>Let the <a href="#document-wide">document-wide default language</a>
+       be the string that resulted from the previous step.
+    </ol>
+
+    <p>For <code><a href="#meta0">meta</a></code> elements in the <a
+     href="#content3" title=attr-meta-http-equiv-content-language>Content
+     Language state</a>, the <code title=attr-meta-content><a
+     href="#content1">content</a></code> attribute must have a value
+     consisting of a valid RFC 3066 language code. <a
+     href="#references">[RFC3066]</a></p>
+
+    <p class=note>This pragma is not exactly equivalent to the HTTP
+     <code>Content-Language</code> header; for instance, it only supports one
+     language. <a href="#references">[RFC2616]</a></p>
+
    <dt><dfn id=encoding title=attr-meta-http-equiv-content-type>Encoding
     declaration state</dfn>
 
@@ -36440,7 +36507,7 @@
    title="HTML documents">HTML document</a>, create an <a href="#html-0">HTML
    parser</a>, associate it with the document, act as if the tokeniser had
    emitted a start tag token with the tag name "pre", set the <a
-   href="#tokenization0">tokenization</a> stage's <a href="#content3">content
+   href="#tokenization0">tokenization</a> stage's <a href="#content4">content
    model flag</a> to <i>PLAINTEXT</i>, and begin to pass the stream of
    characters in the plain text document to that tokeniser.
 
@@ -46632,7 +46699,7 @@
    to another state.
 
   <p>The exact behavior of certain states depends on a <dfn
-   id=content3>content model flag</dfn> that is set after certain tokens are
+   id=content4>content model flag</dfn> that is set after certain tokens are
    emitted. The flag has several states: <i title="">PCDATA</i>, <i
    title="">RCDATA</i>, <i title="">CDATA</i>, and <i title="">PLAINTEXT</i>.
    Initially it must be in the PCDATA state. In the RCDATA and CDATA states,
@@ -46656,7 +46723,7 @@
 
   <p>When a token is emitted, it must immediately be handled by the <a
    href="#tree-construction0">tree construction</a> stage. The tree
-   construction stage can affect the state of the <a href="#content3">content
+   construction stage can affect the state of the <a href="#content4">content
    model flag</a>, and can insert additional characters into the stream. (For
    example, the <code><a href="#script1">script</a></code> element can result
    in scripts executing and using the <a href="#dynamic3">dynamic markup
@@ -46667,7 +46734,7 @@
    flag">acknowledged</dfn> when it is processed by the tree construction
    stage, that is a <a href="#parse2">parse error</a>.
 
-  <p>When an end tag token is emitted, the <a href="#content3">content model
+  <p>When an end tag token is emitted, the <a href="#content4">content model
    flag</a> must be switched to the PCDATA state.
 
   <p>When an end tag token is emitted with attributes, that is a <a
@@ -46698,7 +46765,7 @@
   <dl class=switch>
    <dt>U+0026 AMPERSAND (&amp;)
 
-   <dd>When the <a href="#content3">content model flag</a> is set to one of
+   <dd>When the <a href="#content4">content model flag</a> is set to one of
     the PCDATA or RCDATA states and the <a href="#escape">escape flag</a> is
     false: switch to the <a href="#character6">character reference data
     state</a>.
@@ -46708,7 +46775,7 @@
    <dt>U+002D HYPHEN-MINUS (-)
 
    <dd>
-    <p>If the <a href="#content3">content model flag</a> is set to either the
+    <p>If the <a href="#content4">content model flag</a> is set to either the
      RCDATA state or the CDATA state, and the <a href="#escape">escape
      flag</a> is false, and there are at least three characters before this
      one in the input stream, and the last four characters in the input
@@ -46721,10 +46788,10 @@
 
    <dt>U+003C LESS-THAN SIGN (&lt;)
 
-   <dd>When the <a href="#content3">content model flag</a> is set to the
+   <dd>When the <a href="#content4">content model flag</a> is set to the
     PCDATA state: switch to the <a href="#tag-open0">tag open state</a>.
 
-   <dd>When the <a href="#content3">content model flag</a> is set to either
+   <dd>When the <a href="#content4">content model flag</a> is set to either
     the RCDATA state or the CDATA state and the <a href="#escape">escape
     flag</a> is false: switch to the <a href="#tag-open0">tag open state</a>.
 
@@ -46733,7 +46800,7 @@
    <dt>U+003E GREATER-THAN SIGN (&gt;)
 
    <dd>
-    <p>If the <a href="#content3">content model flag</a> is set to either the
+    <p>If the <a href="#content4">content model flag</a> is set to either the
      RCDATA state or the CDATA state, and the <a href="#escape">escape
      flag</a> is true, and the last three characters in the input stream
      including this one are U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E
@@ -46760,7 +46827,7 @@
   <h5 id=character1><span class=secno>8.2.4.2. </span><dfn
    id=character6>Character reference data state</dfn></h5>
 
-  <p><em>(This cannot happen if the <a href="#content3">content model
+  <p><em>(This cannot happen if the <a href="#content4">content model
    flag</a> is set to the CDATA state.)</em>
 
   <p>Attempt to <a href="#consume">consume a character reference</a>, with no
@@ -46775,11 +46842,11 @@
   <h5 id=tag-open><span class=secno>8.2.4.3. </span><dfn id=tag-open0>Tag
    open state</dfn></h5>
 
-  <p>The behavior of this state depends on the <a href="#content3">content
+  <p>The behavior of this state depends on the <a href="#content4">content
    model flag</a>.
 
   <dl>
-   <dt>If the <a href="#content3">content model flag</a> is set to the RCDATA
+   <dt>If the <a href="#content4">content model flag</a> is set to the RCDATA
     or CDATA states
 
    <dd>
@@ -46789,7 +46856,7 @@
      and reconsume the current input character in the <a
      href="#data-state0">data state</a>.</p>
 
-   <dt>If the <a href="#content3">content model flag</a> is set to the PCDATA
+   <dt>If the <a href="#content4">content model flag</a> is set to the PCDATA
     state
 
    <dd>
@@ -46842,10 +46909,10 @@
   <h5 id=close><span class=secno>8.2.4.4. </span><dfn id=close4>Close tag
    open state</dfn></h5>
 
-  <p>If the <a href="#content3">content model flag</a> is set to the RCDATA
+  <p>If the <a href="#content4">content model flag</a> is set to the RCDATA
    or CDATA states but no start tag token has ever been emitted by this
    instance of the tokeniser (<a href="#fragment">fragment case</a>), or, if
-   the <a href="#content3">content model flag</a> is set to the RCDATA or
+   the <a href="#content4">content model flag</a> is set to the RCDATA or
    CDATA states and the next few characters do not match the tag name of the
    last start tag token emitted (compared in an <span>ASCII case
    insensitive</span> manner), or if they do but they are not immediately
@@ -46872,7 +46939,7 @@
    character token, and switch to the <a href="#data-state0">data state</a>
    to process the <a href="#next-input">next input character</a>.
 
-  <p>Otherwise, if the <a href="#content3">content model flag</a> is set to
+  <p>Otherwise, if the <a href="#content4">content model flag</a> is set to
    the PCDATA state, or if the next few characters <em>do</em> match that tag
    name, consume the <a href="#next-input">next input character</a>:
 
@@ -47354,7 +47421,7 @@
   <h5 id=bogus><span class=secno>8.2.4.16. </span><dfn id=bogus1>Bogus
    comment state</dfn></h5>
 
-  <p><em>(This can only happen if the <a href="#content3">content model
+  <p><em>(This can only happen if the <a href="#content4">content model
    flag</a> is set to the PCDATA state.)</em>
 
   <p>Consume every character up to and including the first U+003E
@@ -47373,7 +47440,7 @@
   <h5 id=markup><span class=secno>8.2.4.17. </span><dfn id=markup0>Markup
    declaration open state</dfn></h5>
 
-  <p><em>(This can only happen if the <a href="#content3">content model
+  <p><em>(This can only happen if the <a href="#content4">content model
    flag</a> is set to the PCDATA state.)</em>
 
   <p>If the next two characters are both U+002D HYPHEN-MINUS (-) characters,
@@ -47393,7 +47460,7 @@
    (the five uppercase letters "CDATA" with a U+005B LEFT SQUARE BRACKET
    character before and after), then consume those characters and switch to
    the <a href="#cdata2">CDATA section state</a> (which is unrelated to the
-   <a href="#content3">content model flag</a>'s CDATA state).
+   <a href="#content4">content model flag</a>'s CDATA state).
 
   <p>Otherwise, this is a <a href="#parse2">parse error</a>. Switch to the <a
    href="#bogus1">bogus comment state</a>. The next character that is
@@ -48003,9 +48070,9 @@
   <h5 id=cdata0><span class=secno>8.2.4.36. </span><dfn id=cdata2>CDATA
    section state</dfn></h5>
 
-  <p><em>(This can only happen if the <a href="#content3">content model
+  <p><em>(This can only happen if the <a href="#content4">content model
    flag</a> is set to the PCDATA state, and is unrelated to the <a
-   href="#content3">content model flag</a>'s CDATA state.)</em>
+   href="#content4">content model flag</a>'s CDATA state.)</em>
 
   <p>Consume every character up to the next occurrence of the three character
    sequence U+005D RIGHT SQUARE BRACKET U+005D RIGHT SQUARE BRACKET U+003E
@@ -48718,10 +48785,10 @@
    <li>
     <p>If the algorithm that was invoked is the <a href="#generic">generic
      CDATA element parsing algorithm</a>, switch the tokeniser's <a
-     href="#content3">content model flag</a> to the CDATA state; otherwise
+     href="#content4">content model flag</a> to the CDATA state; otherwise
      the algorithm invoked was the <a href="#generic0">generic RCDATA element
      parsing algorithm</a>, switch the tokeniser's <a
-     href="#content3">content model flag</a> to the RCDATA state.
+     href="#content4">content model flag</a> to the RCDATA state.
 
    <li>
     <p>Then, collect all the character tokens that the tokeniser returns
@@ -48734,7 +48801,7 @@
      all those tokens' characters, to the new element node.
 
    <li>
-    <p>The tokeniser's <a href="#content3">content model flag</a> will have
+    <p>The tokeniser's <a href="#content4">content model flag</a> will have
      switched back to the PCDATA state.
 
    <li>
@@ -49366,7 +49433,7 @@
      script will execute in-line, instead of blowing the document away, as
      would happen in most other cases.</p>
 
-    <p>Switch the tokeniser's <a href="#content3">content model flag</a> to
+    <p>Switch the tokeniser's <a href="#content4">content model flag</a> to
      the CDATA state.</p>
 
     <p>Then, collect all the character tokens that the tokeniser returns
@@ -49378,7 +49445,7 @@
      href="#script1">script</a></code> element node whose contents is the
      concatenation of all those tokens' characters.</p>
 
-    <p>The tokeniser's <a href="#content3">content model flag</a> will have
+    <p>The tokeniser's <a href="#content4">content model flag</a> will have
      switched back to the PCDATA state.</p>
 
     <p>If the next token is not an end tag token with the tag name "script",
@@ -49949,13 +50016,13 @@
 
     <p><a href="#insert0">Insert an HTML element</a> for the token.</p>
 
-    <p>Switch the <a href="#content3">content model flag</a> to the PLAINTEXT
+    <p>Switch the <a href="#content4">content model flag</a> to the PLAINTEXT
      state.</p>
 
     <p class=note>Once a start tag with the tag name "plaintext" has been
      seen, that will be the last token ever seen other than character tokens
      (and the end-of-file token), because there is no way to switch the <a
-     href="#content3">content model flag</a> out of the PLAINTEXT state.</p>
+     href="#content4">content model flag</a> out of the PLAINTEXT state.</p>
    </dd>
    <!-- end tags for non-phrasing flow content elements -->
    <!-- the normal ones -->
@@ -50584,7 +50651,7 @@
      <code>form</code> element pointed to by the <a
      href="#form-element"><code title="">form</code> element pointer</a>.</p>
 
-    <p>Switch the tokeniser's <a href="#content3">content model flag</a> to
+    <p>Switch the tokeniser's <a href="#content4">content model flag</a> to
      the RCDATA state.</p>
 
     <p>If the next token is a U+000A LINE FEED (LF) character token, then
@@ -50599,7 +50666,7 @@
      single <code>Text</code> node, whose contents is the concatenation of
      all those tokens' characters, to the new element node.</p>
 
-    <p>The tokeniser's <a href="#content3">content model flag</a> will have
+    <p>The tokeniser's <a href="#content4">content model flag</a> will have
      switched back to the PCDATA state.</p>
 
     <p>If the next token is an end tag token with the tag name "textarea",
@@ -52512,14 +52579,14 @@
    <li>
     <p>Set the <a href="#html-0">HTML parser</a>'s <a
      href="#tokenization0">tokenization</a> stage's <a
-     href="#content3">content model flag</a> according to the <var
+     href="#content4">content model flag</a> according to the <var
      title="">context</var> element, as follows:</p>
 
     <dl class=switch>
      <dt>If it is a <code><a href="#title1">title</a></code> or
       <code>textarea</code> element
 
-     <dd>Set the <a href="#content3">content model flag</a> to the RCDATA
+     <dd>Set the <a href="#content4">content model flag</a> to the RCDATA
       state.
 
      <dt>If it is a <code><a href="#style1">style</a></code>, <code><a
@@ -52527,23 +52594,23 @@
       href="#iframe">iframe</a></code>, <code>noembed</code>, or
       <code>noframes</code> element
 
-     <dd>Set the <a href="#content3">content model flag</a> to the CDATA
+     <dd>Set the <a href="#content4">content model flag</a> to the CDATA
       state.
 
      <dt>If it is a <code><a href="#noscript">noscript</a></code> element
 
      <dd>If the <a href="#scripting3">scripting flag</a> is enabled, set the
-      <a href="#content3">content model flag</a> to the CDATA state.
-      Otherwise, set the <a href="#content3">content model flag</a> to the
+      <a href="#content4">content model flag</a> to the CDATA state.
+      Otherwise, set the <a href="#content4">content model flag</a> to the
       PCDATA state.
 
      <dt>If it is a <code>plaintext</code> element
 
-     <dd>Set the <a href="#content3">content model flag</a> to PLAINTEXT.
+     <dd>Set the <a href="#content4">content model flag</a> to PLAINTEXT.
 
      <dt>Otherwise
 
-     <dd>Set the <a href="#content3">content model flag</a> to the PCDATA
+     <dd>Set the <a href="#content4">content model flag</a> to the PCDATA
       state.
     </dl>
 

Received on Tuesday, 12 August 2008 10:02:47 UTC