html5/html-author Overview.html,1.38,1.39 Overview.src.html,1.39,1.40 from Lachlan Hunt via cvs-syncmail on 2009-03-05 (public-html-commits@w3.org from March 2009)

From: Lachlan Hunt via cvs-syncmail <cvsmail@w3.org>
Date: Thu, 05 Mar 2009 13:57:43 +0000
To: public-html-commits@w3.org
Message-Id: <E1LfE51-0002B7-2C@lionel-hutz.w3.org>
Update of /sources/public/html5/html-author
In directory hutz:/tmp/cvs-serv8357

Modified Files:
	Overview.html Overview.src.html 
Log Message:
Expanded the syntax section

Index: Overview.html
===================================================================
RCS file: /sources/public/html5/html-author/Overview.html,v
retrieving revision 1.38
retrieving revision 1.39
diff -u -d -r1.38 -r1.39
--- Overview.html	5 Mar 2009 13:00:27 -0000	1.38
+++ Overview.html	5 Mar 2009 13:57:40 -0000	1.39
@@ -129,7 +129,24 @@
    <li><a href=#understanding-semantics><span class=secno>2.2 </span>Understanding Semantics</a></li></ol></li>
  <li><a href=#the-html-and-xhtml-syntax><span class=secno>3 </span>The HTML and XHTML Syntax</a>
   <ol class=toc>
-   <li><a href=#syntactic-overview><span class=secno>3.1 </span>Syntactic Overview</a></li></ol></li>
+   <li><a href=#syntactic-overview><span class=secno>3.1 </span>Syntactic Overview</a></li>
+   <li><a href=#the-syntax><span class=secno>3.2 </span>The Syntax</a>
+    <ol class=toc>
+     <li><a href=#doctype-declaration><span class=secno>3.2.1 </span>DOCTYPE Declaration</a></li>
+     <li><a href=#elements><span class=secno>3.2.2 </span>Elements</a></li>
+     <li><a href=#attributes><span class=secno>3.2.3 </span>Attributes</a>
+      <ol class=toc>
[...1174 lines suppressed...]
 					<div class=dom>
 						<h5 class="no-num no-toc" id=dom-interface-92>DOM Interface</h5>
@@ -4441,7 +4758,7 @@
 
 				<div class=properties>
 					<div class=attributes>
-						<h5 class="no-num no-toc" id=attributes-93>Attributes</h5>
+						<h5 class="no-num no-toc" id=attributes-94>Attributes</h5>
 					<ul><li><span>Global attributes</span></li></ul></div>
 					<div class=dom>
 						<h5 class="no-num no-toc" id=dom-interface-93>DOM Interface</h5>
@@ -5308,7 +5625,7 @@
 
 
 			<section>
-				<h5 id=attributes-94><span class=secno>6.1.2.1 </span>Attributes</h5>
+				<h5 id=attributes-95><span class=secno>6.1.2.1 </span>Attributes</h5>
 				<p>Unless explicitly stated otherwise for a specific purpose, all attribute
 				   values in examples are quoted using double quotes. In HTML examples,
 				   boolean attributes are written in their minimised form and in XHTML

Index: Overview.src.html
===================================================================
RCS file: /sources/public/html5/html-author/Overview.src.html,v
retrieving revision 1.39
retrieving revision 1.40
diff -u -d -r1.39 -r1.40
--- Overview.src.html	5 Mar 2009 13:00:28 -0000	1.39
+++ Overview.src.html	5 Mar 2009 13:57:40 -0000	1.40
@@ -359,6 +359,307 @@
 		   familiarise themselves with the similarities and differences between
 		   HTML and XHTML.</p>
 	</section>
+
+	<section>
+		<h1>The Syntax</h1>
+		<p>A number of basic components make up the syntax of HTML and are
+		   used throughout any document.  These include the
+		   <code>DOCTYPE</code> declaration, elements, attributes, comments,
+		   text and CDATA sections.</p>
+
+		<section>
+			<h1>DOCTYPE Declaration</h1>
+			<p>The Document Type Declaration needs to be present at the beginning of a
+			   document that uses the HTML syntax. It may optionally be used within the
+			   XHTML syntax, but it is not required.</p>
+
+			<pre><code>&lt;!DOCTYPE html&gt;</code></pre>
+
+			<p>The <code>DOCTYPE</code> originates from HTML’s SGML lineage and, in
+			   previous levels of HTML, was originally used to refer to a Document Type
+			   Definition (DTD) — a formal declaration of the elements, attributes and
+			   syntactic features that could be used within the document. Those who are
+			   familiar with previous levels of HTML will notice that there is no
+			   <code>PUBLIC</code> or <code>SYSTEM</code> identifier present in this
+			   <code>DOCTYPE</code>, which were used to refer to the DTD.</p>
+
+			<p>As HTML5 is no longer formally based upon SGML, the <code>DOCTYPE</code>
+			   no longer serves this purpose, and thus it does not refer to a DTD
+			   anymore. However, due to legacy constraints, it has gained another very
+			   important purpose: triggering no-quirks mode in browsers.</p>
+
+			<p>HTML5 defines three modes: <strong>quirks mode</strong>,
+			   <strong>limited quirks mode</strong> and <strong>no-quirks mode</strong>,
+			   of which only the last is considered conforming to use. The other two
+			   exist for backwards compatibility. The important thing to understand
+			   is that there are differences in the way documents are visually rendered
+			   in each of the modes and, to ensure the most standards-compliant
+			   rendering, it is important to ensure no-quirks mode is used.</p>
+		</section>
+
+		<section>
+			<h1>Elements</h1>
+			<p>Elements are marked up using start tags and end tags. Tags are delimited
+			   using angle brackets with the tag name in between. The difference between
+			   start tags and end tags is that the latter includes a slash before the
+			   tag name.</p>
+
+			<div class="example">
+				<p>Example:</p>
+				<p>This example paragraph illustrates the use of start tags and end tags.</p>
+				<pre><code>&lt;p&gt;The quick brown fox jumps over the lazy dog.&lt;/p&gt;</code></pre>
+			</div>
+
+			<p>In both tags, whitespace is permitted between the tag name and the
+			   closing right angle bracket; however, it is usually omitted because it
+			   is redundant.</p>
+
+			<p>In XHTML, tag names are <em>case sensitive</em> and are usually defined
+			   to be written in lowercase. In HTML, however, tag names are case
+			   insensitive and may be written in all uppercase or mixed case, although
+			   the most common convention is to stick with lowercase. The case
+			   of the start and end tags does not have to be the same, but being
+			   consistent does make the code look cleaner.</p>
+
+			<div class="html example">
+				<p>HTML Example:</p>
+				<pre><code>&lt;DIV&gt;...&lt;/DIV&gt;</code></pre>
+			</div>
+
+			<p>An empty element is any element that does not contain any content within
+			   it. In general, an empty element is just one with a start tag immediately
+			   followed by its associated end tag. In both HTML and XHTML syntaxes, this
+			   can be represented in the same way.</p>
+
+			<div class="example">
+				<p>Example:</p>
+				<pre><code>&lt;span&gt;&lt;/span&gt;</code></pre>
+			</div>
+
+			<p>Some elements, however, are forbidden from containing any content at all.
+			   These are known as <em>void elements</em>. In HTML, the above syntax
+			   cannot be used for void elements. For such elements, the end tag must be
+			   omitted because the element is automatically closed by the parser. Such
+			   elements include, among others, <code>br</code>, <code>hr</code>,
+			   <code>link</code> and <code>meta</code>.</p>
+
+			<div class="example html">
+				<p>HTML Example:</p>
+				<pre><code>&lt;link type="text/css" rel="stylesheet" href="style.css"&gt;</code></pre>
+			</div>
+
+			<p>In XHTML, the XML syntactic requirements dictate that this must be made
+			   explicit using either an explicit end tag, as above, or the empty element
+			   syntax. This is achieved by inserting a slash at the end of the start tag
+			   immediately before the right angle bracket.</p>
+
+			<div class="example">
+				<p>Example:</p>
+				<pre><code>&lt;link type="text/css" href="style.css"/&gt;</code></pre>
+			</div>
+
+			<p>Authors may optionally choose to use this same syntax for void elements
+			   in the HTML syntax as well. Some authors also choose to include
+			   whitespace before the slash; however, this is not necessary. (Using
+			   whitespace in that fashion is a convention inherited from the
+			   compatibility guidelines in XHTML 1.0, Appendix C.)</p>
+		</section>
+
+		<section>
+			<h1>Attributes</h1>
+
+			<p>Elements may contain attributes that are used to set various properties
+			   of an element. Some attributes are defined globally and can be used on
+			   any element, while others are defined for specific elements only. All
+			   attributes have a name and a value and look like this.</p>
+
+			<div class="example">
+				<p>Example:</p>
+				<p>This example illustrates how to mark up a <code>div</code> element
+				   with an attribute named <code>class</code> using a value of
+				   <code>"example"</code>.</p>
+				<pre><code>&lt;div class="example"&gt;...&lt;/div&gt;</code></pre>
+			</div>
+
+			<p>Attributes may only be specified within start tags and must never be used
+			   in end tags.</p>
+
+			<div class="example error">
+				<p>Erroneous Example:</p>
+				<pre><code>&lt;section id="example"&gt;...&lt;/section <strong>id="example"</strong>&gt;</code></pre>
+			</div>
+
+			<p>In XHTML, attribute names are case sensitive and most are defined to be
+			   lowercase. In HTML, attribute names are case insensitive, and so they
+			   could be written in all uppercase or mixed case, depending on your own
+			   preferences. It is conventional, however, to use the same case as would
+			   be used in XHTML, which is generally all lowercase.</p>
+
+			<div class="html example">
+				<p>HTML Example:</p>
+				<pre><code>&lt;div CLASS="example"&gt;</code></pre>
+			</div>
+
+			<p>In general, the values of attributes can contain any text or
+			   character references, although depending on the syntax used, some
+			   additional restrictions apply, which are outlined below.</p>
+
+			<p>There are four slightly different syntaxes that may be used for
+			   attributes in HTML: empty, unquoted, single-quoted and double-quoted. All
+			   four syntaxes may be used in the HTML syntax, depending on what is needed
+			   for each specific attribute.  However, in the XHTML syntax, attribute
+			   values must always be quoted using either single or double quotes.</p>
+
+			<h2 id="empty-attr">Empty Attributes</h2>
+
+			<p>An empty attribute is one where the value has been omitted. This is a
+			   syntactic shorthand for specifying the attribute with an empty value,
+			   and is commonly used for boolean attributes. This syntax may be used in
+			   the HTML syntax, but not in the XHTML syntax.</p>
+
+			<p class="note">Note: In previous editions of HTML, which were formally
+			   based on SGML, it was technically an attribute's name that could be
+			   omitted where the value was a unique enumerated value specified in the
+			   DTD. However, due to legacy constraints, this has been changed in HTML5
+			   to reflect the way implementations really work.</p>
+
+			<div class="html example">
+				<p>HTML Example:</p>
+				<pre><code>&lt;input disabled&gt;</code></pre>
+
+				<p>The previous example is equivalent to specifying the attribute with
+				   an empty string as the value.</p>
+				<pre><code>&lt;input disabled=""&gt;</code></pre>
+			</div>
+
+			<p class="note">Note: The previous example is semantically equivalent to
+			   specifying the attribute with the value <code>"disabled"</code>, but it
+			   is not exactly the same.</p>
+
+			<div class="html example">
+				<p>HTML Example:</p>
+				<pre><code>&lt;img src="decoration.png" alt&gt;</code></pre>
+
+				<p>The previous example is equivalent to specifying the attribute with
+				   an empty string as the value.</p>
+				<pre><code>&lt;img src="decoration.png" alt=""&gt;</code></pre>
+			</div>
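+
+			<div class="example">
+				<p>XHTML Example:</p>
+				<p>For comparison, this is a sketch of how the same boolean attribute
+				   would typically be written in the XHTML syntax, where the value
+				   cannot be omitted. The snippet is illustrative only and is not
+				   taken from a real document.</p>
+				<pre><code>&lt;input disabled="disabled"/&gt;</code></pre>
+			</div>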
+
+			<h2 id="unquoted-attr">Unquoted Attribute Values</h2>
+
+			<p>In HTML, but not in XHTML, the quotes surrounding the value may also be
+			   omitted in most cases. The value may contain any characters except for
+			   spaces, single or double quotes (<code>'</code> or <code>"</code>), an
+			   equals sign (<code>=</code>) or a greater-than symbol
+			   (<code>&gt;</code>). If you need an attribute to contain those
+			   characters, they either need to be escaped using character references, or
+			   you need to use either the <span title="single-quote-attr">single-</span>
+			   or <span title="double-quote-attr">double-quoted attribute values</span>.</p>
+
+			<p>Some additional characters cannot be used in unquoted attribute values,
+			   including space characters, single (<code>'</code>) or double
+			   (<code>"</code>) quotation marks, equals signs (<code>=</code>) or
+			   greater than signs  (<code>&gt;</code>).</p>
+
+			<div class="html example">
+				<p>HTML Example:</p>
+				<pre><code>&lt;div class=example&gt;</code></pre>
+			</div>
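+
+			<div class="html example">
+				<p>HTML Example:</p>
+				<p>As an illustrative sketch (the class names are placeholders), a
+				   value containing a space cannot be left unquoted. In the first line
+				   below, only <code>example</code> would be treated as the value, and
+				   <code>names</code> would be parsed as a separate empty attribute.
+				   The second line quotes the value to keep it intact.</p>
+				<pre><code>&lt;div class=example names&gt;
+&lt;div class="example names"&gt;</code></pre>
+			</div>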
+
+			<h2 id="double-quote-attr">Double-Quoted Attribute Values</h2>
+
+			<p>In both HTML and XHTML, attribute values may be surrounded with double
+			   quotes.</p>
+
+			<p>By quoting attributes, the value may contain the additional characters
+			   that can't be used in unquoted attribute values, but for obvious reasons,
+			   these attributes cannot contain additional double quotation marks within
+			   the value.</p>
+
+			<div class="example">
+				<p>Example:</p>
+				<pre><code>&lt;div class="example class names"&gt;...&lt;/div&gt;</code></pre>
+			</div>
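+
+			<div class="example">
+				<p>Example:</p>
+				<p>As an illustrative sketch (the text is a placeholder), a double
+				   quotation mark inside a double-quoted value can be written using
+				   the <code>&amp;quot;</code> character reference.</p>
+				<pre><code>&lt;p title="The &amp;quot;quick&amp;quot; brown fox"&gt;...&lt;/p&gt;</code></pre>
+			</div>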
+
+
+			<h2 id="single-quote-attr">Single-Quoted Attribute Values</h2>
+
+			<p>In both HTML and XHTML, attribute values may be surrounded with single
+			   quotes.</p>
+
+			<p>By quoting attributes, the value may contain the additional characters
+			   that can't be used in unquoted attribute values, but for obvious reasons,
+			   these attributes cannot contain additional single quotation marks within
+			   the value.</p>
+
+			<div class="example">
+				<p>Example:</p>
+				<pre><code>&lt;div class='example class names'&gt;...&lt;/div&gt;</code></pre>
+			</div>
+		</section>
+
+		<section>
+			<h1>Comments</h1>
+			<p class="issue">...</p>
+		</section>
+
+		<section>
+			<h1>Text</h1>
+			<p class="issue">...</p>
+		</section>
+
+		<section>
+			<h1>CDATA Sections</h1>
+			<p class="issue">...</p>
+		</section>
+
+		<section>
+			<h1>Character References</h1>
+			<p class="issue">Discuss numeric and named character reference
+			   syntax.  May link to the list of entity references in a
+			   separate document, rather than trying to list them all in here.</p>
+		</section>
+	</section>
+
+	<section>
+		<h1>Understanding MIME Types</h1>
+		<p class="issue">Discuss <code>text/html</code>, <code>application/xhtml+xml</code>, etc.</p>
+	</section>
+
+	<section>
+		<h1>Character Encoding</h1>
+		<p class="issue">Overview of Unicode, character repertoires, encodings, etc.
+		  Declaring the encoding with the Content-Type header, BOM, <code>meta</code>, etc.</p>
+	</section>
+
+	<section>
+		<h1>Polyglot Documents</h1>
+
+		<p>A polyglot HTML document is a document that conforms to both the
+		   HTML and XHTML syntactic requirements, and which can be processed
+		   as either by browsers, depending on the MIME type used. This works
+		   by using a common subset of the syntax that is shared by both HTML
+		   and XHTML.</p>
+
+		<p>Polyglot documents are useful in situations where a
+		   document is intended to be served as either HTML or XHTML,
+		   depending on the support in particular browsers, or when it is
+		   not known at the time of creation which MIME type the document
+		   will ultimately be served as.</p>
+
+		<p>In order to successfully create and maintain polyglot documents,
+		   authors need to be familiar with both the similarities and
+		   differences between the two syntaxes.  This includes not only
+		   syntactic differences, but also differences in the way stylesheets
+		   and scripts are handled, and the way in which character encodings
+		   are detected.</p>
+		
+		<p>This section will provide the details about each of these similarities
+		   and differences, and provide guidelines on the creation of polyglot
+		   documents.</p>
+
+		<p class="issue">Base this on the <a href="http://wiki.whatwg.org/wiki/HTML_vs._XHTML" title="HTML vs. XHTML - WHATWG Wiki">HTML vs. XHTML</a> article.</p>
+	</section>
 </section>
 
 <!-- The HTML Vocabulary and APIs -->
Received on Thursday, 5 March 2009 13:57:53 UTC