html-author/Overview.html 1.40 Updated description of the DOCTYPE syntax

Updated description of the DOCTYPE syntax

3.6 Polyglot Documents
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#polyglot-documents
3.2.2 Elements
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#elements
4.1.7 Embedded content
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#embedded-content
3.2.1 DOCTYPE Declaration
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#doctype-declaration
3.2.3.1 Empty Attributes
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#empty-attr
3.2.3.4 Single-Quoted Attribute Values
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#single-quote-attr
3.5 Choosing HTML or XHTML
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#choosing-html-or-xhtml
3.2.3.3 Double-Quoted Attribute Values
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#double-quote-attr
2 Getting Started with HTML 5
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#getting-started-with-html-5
1 Introduction
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#introduction
2.1 A Basic Document
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#a-basic-document
3.2.3 Attributes
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#attributes
2.2 Understanding Semantics
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#understanding-semantics
4.1.8 Interactive content
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#interactive-content
3.1 Syntactic Overview
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#syntactic-overview
3.4 Character Encoding
http://people.w3.org/mike/diffs/html5/html-author/Overview.1.40.html#character-encoding

http://people.w3.org/mike/diffs/html5/html-author/Overview.diff.html
http://dev.w3.org/cvsweb/html5/html-author/Overview.html?r1=1.39&r2=1.40&f=h

Index: Overview.html
===================================================================
RCS file: /sources/public/html5/html-author/Overview.html,v
retrieving revision 1.39
retrieving revision 1.40
diff -u -d -r1.39 -r1.40
--- Overview.html 5 Mar 2009 13:57:40 -0000 1.39
+++ Overview.html 5 Mar 2009 16:52:13 -0000 1.40
@@ -146,7 +146,8 @@
      <li><a href=#character-references><span class=secno>3.2.7 </span>Character References</a></li></ol></li>
    <li><a href=#understanding-mime-types><span class=secno>3.3 </span>Understanding MIME Types</a></li>
    <li><a href=#character-encoding><span class=secno>3.4 </span>Character Encoding</a></li>
-   <li><a href=#polyglot-documents><span class=secno>3.5 </span>Polyglot Documents</a></li></ol></li>
+   <li><a href=#choosing-html-or-xhtml><span class=secno>3.5 </span>Choosing HTML or XHTML</a></li>
+   <li><a href=#polyglot-documents><span class=secno>3.6 </span>Polyglot Documents</a></li></ol></li>
  <li><a href=#the-html-vocabulary-and-apis><span class=secno>4 </span>The HTML Vocabulary and APIs</a>
   <ol class=toc>
    <li><a href=#categories><span class=secno>4.1 </span>Categories</a>
@@ -340,7 +341,7 @@
   <p class=big-issue>The goal of this section is to walk people though creating
      <a href=examples/example01.html>example01.html</a></p>
 
-  <p>To begin, we're going to create a very basic HTML document, which
+  <p>To begin, we’re going to create a very basic HTML document, which
      will also serve as a useful template for future HTML documents.
      This document will simply contain a title and short paragraph.</p>
 
@@ -366,7 +367,7 @@
 
   <p>An HTML document is divided into two main sections. The head, which
      is used to contain document metadata, such as the title, stylesheets
-     and scripts; and the body, which contain all of the page's content.
+     and scripts; and the body, which contain all of the page’s content.
      The markup itself forms a tree structure, as illustrated in the
      following diagram.</p>
 
@@ -405,13 +406,13 @@
      element for paragraph, various list elements for marking up different
      types of lists, and a table elements for marking up tables.</p>
 
-  <p>It's important to distinguish between the structure and semantics of
+  <p>It’s important to distinguish between the structure and semantics of
      content, which should be described using HTML, and its presentation. In
      one document, a heading may be presented visually in a large bold
      typeface with wide margins above and below to separate it from the
      surrounding content and make it stand out.  In another document, a
      heading may be presented in a light coloured, italic, fancy script
-     typeface.  But regardless of the presentation, it's still a heading and
+     typeface.  But regardless of the presentation, it’s still a heading and
      the markup can still uses the same basic elements for identifying common
      structures.</p>
  </section>
@@ -460,7 +461,7 @@
      features based upon their own personal preferences.</p>
 
   <p>The following example illustrates a basic HTML document,
-     demonstrating a few of the shorthand syntax</p>
+     demonstrating some shorthand syntax:</p>
 
   <div class="example html">
    <p>HTML Example:</p>
@@ -478,8 +479,8 @@
 
   <p>XHTML, however, is based on the much more strict XML syntax.  While
      this too is inspired by SGML, this syntax requires documents to be
-     well-formed, which some people prefer because of it's stricter error handling,
-     forcing authors to maintain cleaner markup.</p>
+     well-formed, which some people prefer because of its stricter error
+     handling, forcing authors to maintain cleaner markup.</p>
 
   <div class="example xhtml">
    <p>XHTML Example:</p>
@@ -524,9 +525,12 @@
    <h4 id=doctype-declaration><span class=secno>3.2.1 </span>DOCTYPE Declaration</h4>
    <p>The Document Type Declaration needs to be present at the beginning of a
       document that uses the HTML syntax. It may optionally be used within the
-      XHTML syntax, but it is not required.</p>
+      XHTML syntax, but it is not required.  The canonical <code>DOCTYPE</code>
+      that most HTML documents should use is as follows:</p>
 
-   <pre><code>&lt;!DOCTYPE html&gt;</code></pre>
+   <div class=example>
+    <pre><code>&lt;!DOCTYPE html&gt;</code></pre>
+   </div>
 
    <p>The <code>DOCTYPE</code> originates from HTML’s SGML lineage and, in
       previous levels of HTML, was originally used to refer to a Document Type
@@ -545,9 +549,64 @@
       <strong>limited quirks mode</strong> and <strong>no quirks mode</strong>,
       of which only the latter is considered conforming to use. The reason for
       this is due to backwards compatibility. The important thing to understand
-      is that there are differences in the way documents are visually rendered
-      in each of the modes and to ensure the most standards compliant
+      is that there are some differences in the way documents are visually
+      rendered in each of the modes; and to ensure the most standards compliant
       rendering, it is important to ensure no-quirks mode is used.</p>
+
+   <p>For compatibility with legacy producers of HTML — that is, software that
+      outputs HTML documents — an alternative <code>DOCTYPE</code> is available
+      for use by systems which are unable to output the <code>DOCTYPE</code>
+      given above. This limitation occurs in software that expects a
+      <code>DOCTYPE</code> to include either a <code>PUBLIC</code> or
+      <code>SYSTEM</code> identifier, and is unable to omit them.
+      The canonical form of this <code>DOCTYPE</code> is as follows:</p>
+
+   <div class=example>
+    <pre><code>&lt;!DOCTYPE html SYSTEM "about:legacy-compat"&gt;</code></pre>
+   </div>
+
+   <p>This uses the <code>SYSTEM</code> identifier with a URL that intentionally
+      points to a non-existent DTD. The <code>about:</code> URI scheme is used for
+      this purpose specifically because it cannot be resolved to any specific DTD.</p>
+
+   <p class=note>Note: The term "legacy-compat" refers to compatibility with legacy
+      producers only.  In particular, it does not refer to compatibility with
+      legacy browsers, which, in practice, ignore SYSTEM identifiers and DTDs.</p>
+
+   <p>In HTML, the <code>DOCTYPE</code> is case insensitive, except for the quoted string
+      <code>"about:legacy-compat"</code>, which must be written in lower case.  The <code>SYSTEM</code>
+      identifier, however, may also be quoted with single quotes, rather than double quotes.
+      The following are all valid alternatives in the HTML syntax:</p>
+
+   <div class="html example">
+    <pre><code>&lt;!DOCTYPE html&gt;
+
+&lt;!DOCTYPE html SYSTEM "about:legacy-compat"&gt;
+
+&lt;!doctype html&gt;
+
+&lt;!DOCTYPE HTML&gt;
+
+&lt;!doctype html system 'about:legacy-compat'&gt;
+
+&lt;!Doctype HTML System "about:legacy-compat"&gt;</code></pre>
+   </div>
+   
+   <p>In XHTML, however, the DOCTYPE is case sensitive, and only the canonical
+      versions of the <code>DOCTYPE</code>s given above may be used.</p>
+
+   <div class="xhtml example">
+    <pre><code>&lt;!DOCTYPE html&gt;
+
+&lt;!DOCTYPE html SYSTEM "about:legacy-compat"&gt;</code></pre>
+   </div>
+
+   <p>However, there are no restrictions placed on the use of alternative
+      DOCTYPEs in XHTML. You may, if you wish, use a custom <code>DOCTYPE</code>
+      referring to a custom DTD, if you wish to use them for validation purposes.
+      Although, be advised that DTDs have a number of limitations compared
+      with other alternative schemas.</p>
+
   </section>
 
   <section>
@@ -564,7 +623,7 @@
    </div>
 
    <p>In both tags, whitespace is permitted between the tag name and the
-      closing right angle bracket, however it is usually omitted because it's
+      closing right angle bracket, however it is usually omitted because it’s
       redundant.</p>
 
    <p>In XHTML, tag names are <em>case sensitive</em> and are usually defined
@@ -670,7 +729,7 @@
       the HTML syntax, but not in the XHTML syntax.</p>
 
    <p class=note>Note: In previous editions of HTML, which were formally
-      based on SGML, it was technically an attribute's name that could be
+      based on SGML, it was technically an attribute’s name that could be
       omitted where the value was a unique enumerated value specified in the
       DTD. However, due to legacy constraints, this has been changed in HTML5
       to reflect the way implementations really work.</p>
@@ -724,7 +783,7 @@
       quotes.</p>
 
    <p>By quoting attributes, the value may contain the additional characters
-      that can't be used in unquoted attribute values, but for obvious reasons,
+      that can’t be used in unquoted attribute values, but for obvious reasons,
       these attributes cannot contain additional double quotation marks within
       the value.</p>
 
@@ -740,7 +799,7 @@
       quotes.</p>
 
    <p>By quoting attributes, the value may contain the additional characters
-      that can't be used in unquoted attribute values, but for obvious reasons,
+      that can’t be used in unquoted attribute values, but for obvious reasons,
       these attributes cannot contain additional single quotation marks within
       the value.</p>
 
@@ -785,7 +844,22 @@
  </section>
 
  <section>
-  <h3 id=polyglot-documents><span class=secno>3.5 </span>Polyglot Documents</h3>
+  <h3 id=choosing-html-or-xhtml><span class=secno>3.5 </span>Choosing HTML or XHTML</h3>
+  <p>The choice of HTML or XHTML syntax is largely dependent upon a number
+     of factors the, including needs of a given project, the skill set of
+     the developers involved, level of support in browsers used by the
+     site’s target audience, or it may simply be a matter of personal
+     preference.</p>
+
+  <p>The important thing to understand is that there are valid reasons to
+     choose both, and that authors are encouraged to make an informed
+     decision.</p>
+
+  <p class=issue>Need to develop guidelines to help authors make this choice.</p>
+ </section>
+
+ <section>
+  <h3 id=polyglot-documents><span class=secno>3.6 </span>Polyglot Documents</h3>
 
   <p>A polyglot HTML document is a document that conforms to both the
      HTML and XHTML syntactic requirements, and which can be processed
@@ -936,7 +1010,7 @@
 
   <h4 id=interactive-content><span class=secno>4.1.8 </span><dfn>Interactive content</dfn></h4>
   <p>Interactive elements are those that allow the user to interact with or
-     activate in some way.  Depending on the user's browser and device, this
+     activate in some way.  Depending on the user’s browser and device, this
      could be performed using any kind of input device, such as, for example,
      a mouse, keyboard, touch screen or voice input.</p>
 

Index: Overview.src.html
===================================================================
RCS file: /sources/public/html5/html-author/Overview.src.html,v
retrieving revision 1.40
retrieving revision 1.41
diff -u -d -r1.40 -r1.41
--- Overview.src.html 5 Mar 2009 13:57:40 -0000 1.40
+++ Overview.src.html 5 Mar 2009 16:52:13 -0000 1.41
@@ -187,7 +187,7 @@
   <p class="big-issue">The goal of this section is to walk people though creating
      <a href="examples/example01.html">example01.html</a></p>
 
-  <p>To begin, we're going to create a very basic HTML document, which
+  <p>To begin, we’re going to create a very basic HTML document, which
      will also serve as a useful template for future HTML documents.
      This document will simply contain a title and short paragraph.</p>
 
@@ -213,7 +213,7 @@
 
   <p>An HTML document is divided into two main sections. The head, which
      is used to contain document metadata, such as the title, stylesheets
-     and scripts; and the body, which contain all of the page's content.
+     and scripts; and the body, which contain all of the page’s content.
      The markup itself forms a tree structure, as illustrated in the
      following diagram.</p>
 
@@ -252,13 +252,13 @@
      element for paragraph, various list elements for marking up different
      types of lists, and a table elements for marking up tables.</p>
 
-  <p>It's important to distinguish between the structure and semantics of
+  <p>It’s important to distinguish between the structure and semantics of
      content, which should be described using HTML, and its presentation. In
      one document, a heading may be presented visually in a large bold
      typeface with wide margins above and below to separate it from the
      surrounding content and make it stand out.  In another document, a
      heading may be presented in a light coloured, italic, fancy script
-     typeface.  But regardless of the presentation, it's still a heading and
+     typeface.  But regardless of the presentation, it’s still a heading and
      the markup can still uses the same basic elements for identifying common
      structures.</p>
  </section>
@@ -307,7 +307,7 @@
      features based upon their own personal preferences.</p>
 
   <p>The following example illustrates a basic HTML document,
-     demonstrating a few of the shorthand syntax</p>
+     demonstrating some shorthand syntax:</p>
 
   <div class="example html">
    <p>HTML Example:</p>
@@ -325,8 +325,8 @@
 
   <p>XHTML, however, is based on the much more strict XML syntax.  While
      this too is inspired by SGML, this syntax requires documents to be
-     well-formed, which some people prefer because of it's stricter error handling,
-     forcing authors to maintain cleaner markup.</p>
+     well-formed, which some people prefer because of its stricter error
+     handling, forcing authors to maintain cleaner markup.</p>
 
   <div class="example xhtml">
    <p>XHTML Example:</p>
@@ -371,9 +371,12 @@
    <h1>DOCTYPE Declaration</h1>
    <p>The Document Type Declaration needs to be present at the beginning of a
       document that uses the HTML syntax. It may optionally be used within the
-      XHTML syntax, but it is not required.</p>
+      XHTML syntax, but it is not required.  The canonical <code>DOCTYPE</code>
+      that most HTML documents should use is as follows:</p>
 
-   <pre><code>&lt;!DOCTYPE html&gt;</code></pre>
+   <div class="example">
+    <pre><code>&lt;!DOCTYPE html&gt;</code></pre>
+   </div>
 
    <p>The <code>DOCTYPE</code> originates from HTML’s SGML lineage and, in
       previous levels of HTML, was originally used to refer to a Document Type
@@ -392,9 +395,64 @@
       <strong>limited quirks mode</strong> and <strong>no quirks mode</strong>,
       of which only the latter is considered conforming to use. The reason for
       this is due to backwards compatibility. The important thing to understand
-      is that there are differences in the way documents are visually rendered
-      in each of the modes and to ensure the most standards compliant
+      is that there are some differences in the way documents are visually
+      rendered in each of the modes; and to ensure the most standards compliant
       rendering, it is important to ensure no-quirks mode is used.</p>
+
+   <p>For compatibility with legacy producers of HTML — that is, software that
+      outputs HTML documents — an alternative <code>DOCTYPE</code> is available
+      for use by systems which are unable to output the <code>DOCTYPE</code>
+      given above. This limitation occurs in software that expects a
+      <code>DOCTYPE</code> to include either a <code>PUBLIC</code> or
+      <code>SYSTEM</code> identifier, and is unable to omit them.
+      The canonical form of this <code>DOCTYPE</code> is as follows:</p>
+
+   <div class="example">
+    <pre><code>&lt;!DOCTYPE html SYSTEM "about:legacy-compat"&gt;</code></pre>
+   </div>
+
+   <p>This uses the <code>SYSTEM</code> identifier with a URL that intentionally
+      points to a non-existent DTD. The <code>about:</code> URI scheme is used for
+      this purpose specifically because it cannot be resolved to any specific DTD.</p>
+
+   <p class="note">Note: The term "legacy-compat" refers to compatibility with legacy
+      producers only.  In particular, it does not refer to compatibility with
+      legacy browsers, which, in practice, ignore SYSTEM identifiers and DTDs.</p>
+
+   <p>In HTML, the <code>DOCTYPE</code> is case insensitive, except for the quoted string
+      <code>"about:legacy-compat"</code>, which must be written in lower case.  The <code>SYSTEM</code>
+      identifier, however, may also be quoted with single quotes, rather than double quotes.
+      The following are all valid alternatives in the HTML syntax:</p>
+
+   <div class="html example">
+    <pre><code>&lt;!DOCTYPE html&gt;
+
+&lt;!DOCTYPE html SYSTEM "about:legacy-compat"&gt;
+
+&lt;!doctype html&gt;
+
+&lt;!DOCTYPE HTML&gt;
+
+&lt;!doctype html system 'about:legacy-compat'&gt;
+
+&lt;!Doctype HTML System "about:legacy-compat"&gt;</code></pre>
+   </div>
+   
+   <p>In XHTML, however, the DOCTYPE is case sensitive, and only the canonical
+      versions of the <code>DOCTYPE</code>s given above may be used.</p>
+
+   <div class="xhtml example">
+    <pre><code>&lt;!DOCTYPE html&gt;
+
+&lt;!DOCTYPE html SYSTEM "about:legacy-compat"&gt;</code></pre>
+   </div>
+
+   <p>However, there are no restrictions placed on the use of alternative
+      DOCTYPEs in XHTML. You may, if you wish, use a custom <code>DOCTYPE</code>
+      referring to a custom DTD, if you wish to use them for validation purposes.
+      Although, be advised that DTDs have a number of limitations compared
+      with other alternative schemas.</p>
+
   </section>
 
   <section>
@@ -411,7 +469,7 @@
    </div>
 
    <p>In both tags, whitespace is permitted between the tag name and the
-      closing right angle bracket, however it is usually omitted because it's
+      closing right angle bracket, however it is usually omitted because it’s
       redundant.</p>
 
    <p>In XHTML, tag names are <em>case sensitive</em> and are usually defined
@@ -518,7 +576,7 @@
       the HTML syntax, but not in the XHTML syntax.</p>
 
    <p class="note">Note: In previous editions of HTML, which were formally
-      based on SGML, it was technically an attribute's name that could be
+      based on SGML, it was technically an attribute’s name that could be
       omitted where the value was a unique enumerated value specified in the
       DTD. However, due to legacy constraints, this has been changed in HTML5
       to reflect the way implementations really work.</p>
@@ -572,7 +630,7 @@
       quotes.</p>
 
    <p>By quoting attributes, the value may contain the additional characters
-      that can't be used in unquoted attribute values, but for obvious reasons,
+      that can’t be used in unquoted attribute values, but for obvious reasons,
       these attributes cannot contain additional double quotation marks within
       the value.</p>
 
@@ -588,7 +646,7 @@
       quotes.</p>
 
    <p>By quoting attributes, the value may contain the additional characters
-      that can't be used in unquoted attribute values, but for obvious reasons,
+      that can’t be used in unquoted attribute values, but for obvious reasons,
       these attributes cannot contain additional single quotation marks within
       the value.</p>
 
@@ -633,6 +691,21 @@
  </section>
 
  <section>
+  <h1>Choosing HTML or XHTML</h1>
+  <p>The choice of HTML or XHTML syntax is largely dependent upon a number
+     of factors the, including needs of a given project, the skill set of
+     the developers involved, level of support in browsers used by the
+     site’s target audience, or it may simply be a matter of personal
+     preference.</p>
+
+  <p>The important thing to understand is that there are valid reasons to
+     choose both, and that authors are encouraged to make an informed
+     decision.</p>
+
+  <p class="issue">Need to develop guidelines to help authors make this choice.</p>
+ </section>
+
+ <section>
   <h1>Polyglot Documents</h1>
 
   <p>A polyglot HTML document is a document that conforms to both the
@@ -784,7 +857,7 @@
 
   <h2><dfn>Interactive content</dfn></h2>
   <p>Interactive elements are those that allow the user to interact with or
-     activate in some way.  Depending on the user's browser and device, this
+     activate in some way.  Depending on the user’s browser and device, this
      could be performed using any kind of input device, such as, for example,
      a mouse, keyboard, touch screen or voice input.</p>

Received on Sunday, 8 March 2009 13:38:31 UTC