
spec/Overview.html 1.2032 2861 Reword how we require that XML documents

From: poot <cvsmail@w3.org>
Date: Mon, 23 Feb 2009 22:01:20 +0900 (JST)
To: public-html-diffs@w3.org
Message-Id: <20090223130120.CE6552BC84@toro.w3.mag.keio.ac.jp>
Reword the requirement that XML documents that use <meta charset> must
use UTF-8. Also require the declaration to be serialised within the
first 512 bytes. (whatwg r2861)
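
As a rough illustration of the two requirements named above, the sketch below checks a byte string for a <meta> element with a charset attribute, then reports whether that element ends within the first 512 bytes and, for XML documents, whether the value is an ASCII case-insensitive match for "UTF-8". It is not taken from the spec or from any conformance checker: the names META_TAG, ATTRIBUTE, find_meta_charset and check_meta_charset are hypothetical, and the regular expressions are a deliberate simplification (for example, a ">" inside a quoted attribute value is not handled).

```python
import re

# Assumption: simplified tag and attribute patterns, not a real HTML/XML parser.
META_TAG = re.compile(rb"<meta\b[^>]*>", re.IGNORECASE)
ATTRIBUTE = re.compile(rb"([^\s=/>\"']+)\s*=\s*(\"[^\"]*\"|'[^']*'|[^\s>]+)")


def find_meta_charset(data):
    """Return (value, end_offset) for the first <meta charset=...>, or None."""
    for tag in META_TAG.finditer(data):
        # Start at offset 5 to skip the leading "<meta" of the matched tag.
        for attr in ATTRIBUTE.finditer(tag.group(0), 5):
            if attr.group(1).lower() == b"charset":
                return attr.group(2).strip(b"\"'"), tag.end()
    return None


def check_meta_charset(data, is_xml):
    """Return a list of problem descriptions; an empty list means none found."""
    problems = []
    found = find_meta_charset(data)
    if found is None:
        return problems  # no charset attribute present, nothing to check here
    value, end_offset = found
    # The element must be serialised completely within the first 512 bytes.
    if end_offset > 512:
        problems.append("meta charset element extends past the first 512 bytes")
    # bytes.lower() only folds ASCII letters, which matches the
    # ASCII case-insensitive comparison the requirement asks for.
    if is_xml and value.lower() != b"utf-8":
        problems.append("charset value in an XML document must be UTF-8")
    return problems
```

Calling check_meta_charset(data, is_xml=True) on a polyglot document that starts with <meta charset="UTF-8"/> near the top should return an empty list. Note that a <meta http-equiv="Content-Type" content="text/html; charset=..."> element is intentionally not matched here, because its charset is part of the content value rather than a charset attribute.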

content
http://people.w3.org/mike/diffs/html5/spec/Overview.1.2032.html#attr-meta-content
4.2.5.5 Specifying the document's character encoding
http://people.w3.org/mike/diffs/html5/spec/Overview.1.2032.html#charset
charset
http://people.w3.org/mike/diffs/html5/spec/Overview.1.2032.html#attr-meta-charset
4.2.5.4 Other pragma directives
http://people.w3.org/mike/diffs/html5/spec/Overview.1.2032.html#other-pragma-directives
character encoding declaration
http://people.w3.org/mike/diffs/html5/spec/Overview.1.2032.html#character-encoding-declaration
The element containing the character encoding declaration must be serialised completely within the first 512 bytes of the document.
http://people.w3.org/mike/diffs/html5/spec/Overview.1.2032.html#charset512
HTMLMetaElement
http://people.w3.org/mike/diffs/html5/spec/Overview.1.2032.html#htmlmetaelement

http://people.w3.org/mike/diffs/html5/spec/Overview.diff.html
http://dev.w3.org/cvsweb/html5/spec/Overview.html?r1=1.2031&r2=1.2032&f=h
http://html5.org/tools/web-apps-tracker?from=2860&to=2861

===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.2031
retrieving revision 1.2032
diff -u -d -r1.2031 -r1.2032
--- Overview.html	23 Feb 2009 12:26:53 -0000	1.2031
+++ Overview.html	23 Feb 2009 12:58:00 -0000	1.2032
@@ -7495,13 +7495,15 @@
   specified.<p>If either <code title=attr-meta-name><a href=#attr-meta-name>name</a></code> or <code title=attr-meta-http-equiv><a href=#attr-meta-http-equiv>http-equiv</a></code> is specified, then
   the <code title=attr-meta-content><a href=#attr-meta-content>content</a></code> attribute must
   also be specified. Otherwise, it must be omitted.<p>The <dfn id=attr-meta-charset title=attr-meta-charset><code>charset</code></dfn>
-  attribute specifies the character encoding used by the document. In
-  <a href=#html5 title=HTML5>HTML documents</a> this is a <a href=#character-encoding-declaration>character
-  encoding declaration</a>. If the attribute is present in an <a href=#xhtml5 title=XHTML>XML document</a>, its value must be an <a href=#ascii-case-insensitive>ASCII
-  case-insensitive</a> match for the string "<code title="">UTF-8</code>", and the resource must be encoded using the
-  UTF-8 character encoding. (The element has no effect in XML
-  documents, and is only allowed to facilitate migration to and from
-  XHTML.)<p>There must not be more than one <code><a href=#meta>meta</a></code> element with a
+  attribute specifies the character encoding used by the
+  document. This is a <a href=#character-encoding-declaration>character encoding declaration</a>. If
+  the attribute is present in an <a href=#xhtml5 title=XHTML>XML
+  document</a>, its value must be an <a href=#ascii-case-insensitive>ASCII
+  case-insensitive</a> match for the string "<code title="">UTF-8</code>" (and the document is therefore required to
+  use UTF-8 as its encoding).<p class=note>The <code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code>
+  attribute on the <code><a href=#meta>meta</a></code> element has no effect in XML
+  documents, and is only allowed in order to facilitate migration to
+  and from XHTML.<p>There must not be more than one <code><a href=#meta>meta</a></code> element with a
   <code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code> attribute per
   document.<p>The <dfn id=attr-meta-content title=attr-meta-content><code>content</code></dfn>
   attribute gives the value of the document metadata or pragma
@@ -7942,7 +7944,9 @@
   Wiki PragmaExtensions page to establish if a value not explicitly
   defined in this specification is allowed or not.<h5 id=charset><span class=secno>4.2.5.5 </span>Specifying the document's character encoding</h5><!-- XXX maybe the rest should move to "writing html" section,
   though if we do then we have to duplicate the requirements in the
-  parsing section for conformance checkers --><p>A <dfn id=character-encoding-declaration>character encoding declaration</dfn> is a mechanism by
+  parsing section for conformance checkers, and we have to make sure
+  that the requirements for charset="" apply even in XML, for the
+   polyglot hack --><p>A <dfn id=character-encoding-declaration>character encoding declaration</dfn> is a mechanism by
   which the character encoding used to store or transmit a document is
   specified.<p>The following restrictions apply to character encoding
   declarations:<ul><li>The character encoding name given must be the name of the
@@ -7960,14 +7964,16 @@
    declaration must be serialised completely within the first 512
    bytes of the document.</li>
 
-  </ul><p>If the document does not start with a BOM, and if its encoding is
-  not explicitly given by <a href=#content-type-0 title=Content-Type>Content-Type
-  metadata</a>, then the character encoding used must be an
-  <a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>, and, in addition,
-  if that encoding isn't US-ASCII itself, then the encoding must be
-  specified using a <code><a href=#meta>meta</a></code> element with a <code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code> attribute or a
+  </ul><p>If an <a href=#html-documents title="HTML documents">HTML document</a> does not
+  start with a BOM, and if its encoding is not explicitly given by
+  <a href=#content-type-0 title=Content-Type>Content-Type metadata</a>, then the
+  character encoding used must be an <a href=#ascii-compatible-character-encoding>ASCII-compatible character
+  encoding</a>, and, in addition, if that encoding isn't US-ASCII
+  itself, then the encoding must be specified using a
+  <code><a href=#meta>meta</a></code> element with a <code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code> attribute or a
   <code><a href=#meta>meta</a></code> element in the <a href=#attr-meta-http-equiv-content-type title=attr-meta-http-equiv-content-type>Encoding declaration
-  state</a>.<p>If the document contains a <code><a href=#meta>meta</a></code> element with a <code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code> attribute or a
+  state</a>.<p>If an <a href=#html-documents title="HTML documents">HTML document</a> contains
+  a <code><a href=#meta>meta</a></code> element with a <code title=attr-meta-charset><a href=#attr-meta-charset>charset</a></code> attribute or a
   <code><a href=#meta>meta</a></code> element in the <a href=#attr-meta-http-equiv-content-type title=attr-meta-http-equiv-content-type>Encoding declaration
   state</a>, then the character encoding used must be an
   <a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>.<p>Authors should not use JIS_X0212-1990, x-JIS0208, and encodings
Received on Monday, 23 February 2009 13:16:08 GMT
