HTML parsing model (was: Re: Some thoughts (on questions) on RDFa in HTML5)

-public-html
+www-archive

On Oct 6, 2007, at 14:35, Peter Krantz wrote:

> In XHTML you apply XSLT on a test
> page and the result is compared to the intended result. This is easy
> because the parsing rules are defined. Is there a canonical parsing
> model for HTML that makes it possible to test conformance in a similar
> way?

Parsing HTML is defined in
http://www.w3.org/html/wg/html5/#parsing
(or http://www.whatwg.org/specs/web-apps/current-work/#parsing if the
W3C server is unresponsive).

There are implementations in Python, Ruby and Java:
http://code.google.com/p/html5lib/ (Python and Ruby)
http://about.validator.nu/htmlparser/ (Java)

There's also a "not usable yet" C# implementation:
http://code.google.com/p/twintsam/

The Java implementation comes with sample code for using XSLT with  
HTML5.
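
Because the parsing algorithm yields a defined tree for any input, a
conformance check can follow the same pattern as the XSLT approach: parse
the test page, serialize the resulting tree, and compare it with the
intended result. The following is only a minimal sketch of that idea, not
code from any of the implementations above. The names dump_tree and
conforms are hypothetical, the dump format is made up for illustration,
and the parser is passed in as a callable that returns an
ElementTree-style element.

```python
import xml.etree.ElementTree as ET


def dump_tree(element, depth=0):
    """Serialize an ElementTree-style element as an indented dump,
    one node per line, with attributes sorted by name."""
    indent = "  " * depth
    lines = [f"{indent}<{element.tag}>"]
    for name, value in sorted(element.attrib.items()):
        lines.append(f'{indent}  {name}="{value}"')
    if element.text and element.text.strip():
        lines.append(f'{indent}  "{element.text.strip()}"')
    for child in element:
        lines.extend(dump_tree(child, depth + 1).splitlines())
        # Text that follows a child element belongs to the parent.
        if child.tail and child.tail.strip():
            lines.append(f'{indent}  "{child.tail.strip()}"')
    return "\n".join(lines)


def conforms(parse, source, expected_dump):
    """Parse `source` with the given callable and compare the tree dump
    with the intended result."""
    return dump_tree(parse(source)) == expected_dump.strip("\n")


if __name__ == "__main__":
    # Stand-in parser (assumption): the stdlib XML parser, which only
    # accepts well-formed input. A real HTML5 test would pass an HTML5
    # parser that returns the same kind of element.
    page = "<html><body><p class='x'>Hi</p></body></html>"
    expected = "\n".join([
        "<html>",
        "  <body>",
        "    <p>",
        '      class="x"',
        '      "Hi"',
    ])
    print(conforms(ET.fromstring, page, expected))
```

With html5lib installed, something like
lambda s: html5lib.parse(s, treebuilder="etree", namespaceHTMLElements=False)
could presumably be passed as the parse argument, so that tag soup gets
the error recovery the spec defines instead of an XML well-formedness
error.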

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Saturday, 6 October 2007 20:32:36 UTC