Re: [XBL Primer] new scenarios

From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
Date: Sat, 21 Apr 2007 19:55:03 +1000
Message-ID: <4629DF77.1030701@lachy.id.au>
To: David Woolley <forums@david-woolley.me.uk>
CC: www-html@w3.org

David Woolley wrote:
> Lachlan Hunt wrote:
>> Not any more.  Although it's not quite complete, HTML5 is defining 
>> the parsing requirements of HTML on the web.
> And a horrible set of ad hoc rules it is. It's basically a proper tree
> type grammar with a set of error recovery rules for producing a
> renderable tree from almost every invalid input.

Yeah, it's designed to handle real world HTML.  The WHATWG places 
interoperability with existing content above syntactic purity.

> Maybe what I should have offered is three categories:
> - tag soup;
> - HTML5 with *no* parse errors;
> - SGML based.
> I'm assuming that HTML5 *with* parse errors can produce all the 
> productions allowed by tag soup.  In a quick scan, I couldn't tell what, 
> if any, difference there is between HTML5 without parse errors and SGML 
> based.  In any case, it seems to be quite close to SGML based.

There are plenty of differences between HTML5 parsing and HTML4's SGML 
parsing.  Here are just a few:

In HTML5, <br/> is conforming and equivalent to <br>
In HTML4, <br/> is equivalent to <br>&gt; (under SGML's SHORTTAG rules 
the '/' already closes the start tag, so the '>' is character data)

In HTML5, characters like '/' can occur in unquoted attribute values
e.g. <a href=http://example.com/>link</a>

In HTML4, that would be equivalent to:
   <a href="http:"></a>example.com/&gt;link</a>
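
To make the two readings of that unquoted attribute concrete, here is a 
rough Python sketch.  It is not a real HTML5 or SGML parser: the function 
names read_html5 and read_sgml_net are made up for illustration, and each 
one only handles a single start tag with one unquoted attribute.  The 
HTML5 reading lets the value run up to whitespace or '>'; the SGML 
reading treats the first '/' as ending the start tag (NET-enabling) and 
the next '/' as the null end tag.

```python
import re


def read_html5(markup: str) -> str:
    """HTML5-style reading: an unquoted value runs until whitespace or '>'."""
    # Toy pattern (assumption): one tag name, one attribute, unquoted value.
    m = re.match(r"<(\w+)\s+(\w+)=([^\s>]*)>", markup)
    name, attr, value = m.groups()
    return f'<{name} {attr}="{value}">{markup[m.end():]}'


def read_sgml_net(markup: str) -> str:
    """SGML SHORTTAG-style reading: '/' ends the value and the start tag."""
    # The value stops at the first '/', which closes the start tag (NET-enabling).
    m = re.match(r"<(\w+)\s+(\w+)=([^\s/>]*)/", markup)
    name, attr, value = m.groups()
    # The next '/' is the null end tag, so the element content is whatever
    # precedes it (empty in the example from this thread).
    content, _, tail = markup[m.end():].partition("/")
    # The '>' that used to close the start tag is now plain character data.
    tail = tail.replace(">", "&gt;", 1)
    return f'<{name} {attr}="{value}">{content}</{name}>{tail}'


src = "<a href=http://example.com/>link</a>"
print(read_html5(src))
print(read_sgml_net(src))
```

The second line printed should match the HTML4 equivalent given above, 
including the stray </a> left over at the end.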

In HTML5, <noscript> is parsed differently depending on whether or not 
script is enabled.

There are plenty more differences, but that should be enough to illustrate 
the point.

Lachlan Hunt
Received on Saturday, 21 April 2007 09:55:40 UTC
