Re: RDFa in HTML vs XHTML from Toby Inkster on 2011-11-12 (public-html-data-tf@w3.org from November 2011)

From: Toby Inkster <tai@g5n.co.uk>
Date: Sat, 12 Nov 2011 21:44:25 +0000
To: Jeni Tennison <jeni@jenitennison.com>
Cc: Henri Sivonen <hsivonen@iki.fi>, HTML Data Task Force WG <public-html-data-tf@w3.org>
Message-ID: <20111112214425.04362569@miranda.g5n.co.uk>

On Sat, 12 Nov 2011 20:45:22 +0000
Jeni Tennison <jeni@jenitennison.com> wrote:

> as a valid HTML5 document despite it not having a <body> element, but
> perhaps that's a validator.nu bug…)

In HTML, some elements have optional end tags, right? So the following
is valid:

 <p>Foo
 <p>Bar

Less well known is that some elements also have optional start tags.
IIRC, only four such elements exist in HTML 4: <html>, <head>, <body>
and <tbody> (HTML5 adds <colgroup>), all of which coincidentally have
optional end tags too. Thus it is possible to have elements which exist,
and are represented in the DOM tree, but have no start or end tags in
the byte stream.

So:

 <!doctype html>
 <title>Hello world</title>
 <p>Global greetings</p>

...is a complete and valid HTML document. This is not new in HTML5 -
change the doctype and it can be valid HTML 4.x, HTML 3.2 or HTML 2.0.
This has nothing to do with HTML5's error correcting parsing algorithm.
An HTML 4.01 version of that document, run through a strict SGML parser,
should be parsed correctly, and the parser will understand that the
document contains <html>, <head> and <body> elements, and be able to
correctly infer where each of them begins and ends.
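
To make the inference concrete, here is a rough Python sketch of how a
tree builder might open <html>, <head> and <body> when their tags are
omitted. It is a toy under stated assumptions, not an SGML or HTML5
parser: the names ImpliedTagSketch, infer_outline, HEAD_ONLY and VOID
are made up for illustration, the list of head-only elements is
simplified, text nodes and implied end tags (such as </p>) are ignored,
and the standard library's html.parser is used only as a tokenizer
(it does not infer omitted tags by itself).

```python
from html.parser import HTMLParser

# Assumption: a simplified list of elements that belong in <head>.
HEAD_ONLY = {"title", "meta", "link", "style", "script", "base"}
# Assumption: a partial list of void elements (never pushed on the stack).
VOID = {"meta", "link", "base", "br", "hr", "img", "input"}


class ImpliedTagSketch(HTMLParser):
    """Toy tree builder that infers omitted <html>, <head>, <body> tags."""

    def __init__(self):
        super().__init__()
        self.written = []    # start tags actually present in the source
        self.outline = []    # (depth, tag, implied) in document order
        self.stack = []      # currently open elements
        self.opened = set()  # every tag opened so far, written or implied

    def _open(self, tag, implied=False):
        self.outline.append((len(self.stack), tag, implied))
        if tag not in VOID:
            self.stack.append(tag)
        self.opened.add(tag)

    def _close(self, tag):
        # Pop open elements up to and including `tag`, if it is open.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def _ensure(self, tag):
        # Open `tag` as an implied element unless it was opened before.
        if tag not in self.opened:
            self._open(tag, implied=True)

    def _enter_body(self):
        self._ensure("html")
        self._ensure("head")
        self._close("head")
        self._ensure("body")

    def handle_starttag(self, tag, attrs):
        self.written.append(tag)
        if tag == "html":
            if "html" not in self.opened:
                self._open(tag)
        elif tag == "head":
            self._ensure("html")
            self._open(tag)
        elif tag == "body":
            self._ensure("html")
            self._ensure("head")
            self._close("head")
            self._open(tag)
        elif tag in HEAD_ONLY and "body" not in self.opened:
            self._ensure("html")
            self._ensure("head")
            self._open(tag)
        else:
            # Any other element implies that <body> has started.
            self._enter_body()
            self._open(tag)

    def handle_endtag(self, tag):
        self._close(tag)


def infer_outline(source):
    """Return (written start tags, inferred element outline)."""
    parser = ImpliedTagSketch()
    parser.feed(source)
    parser.close()
    return parser.written, parser.outline


if __name__ == "__main__":
    doc = "<!doctype html>\n<title>Hello world</title>\n<p>Global greetings</p>\n"
    written, outline = infer_outline(doc)
    print("written start tags:", written)
    for depth, tag, implied in outline:
        print("  " * depth + tag + (" (implied)" if implied else ""))
```

For the three-line document above, only <title> and <p> appear in the
byte stream, yet the inferred outline has html, head and body around
them, each marked as implied.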

So my counter-intuitive example:

  <html about="#foo">
    <h1 property="dc:title">Hello World</h1>
    <p property="dc:description">A global greeting for all.</p>
  </html>

... will not flag up a handy error in an HTML validator. (Well, this
cut-down example will, as it's missing the required <title> element.)

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>

Received on Saturday, 12 November 2011 21:43:40 UTC