
HTML parsing model (was: Re: Some thoughts (on questions) on RDFa in HTML5)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Sat, 6 Oct 2007 23:32:05 +0300
Message-Id: <5FDE8075-6992-4072-B0F9-9E82FF95ABF7@iki.fi>
Cc: www-archive@w3.org
To: Peter Krantz <peter.krantz@gmail.com>

-public-html
+www-archive

On Oct 6, 2007, at 14:35, Peter Krantz wrote:

> In XHTML you apply XSLT on a test
> page and the result is compared to the intended result. This is easy
> because the parsing rules are defined. Is there a canonical parsing
> model for HTML that makes it possible to test conformance in a similar
> way?

The HTML parsing algorithm is defined in
http://www.w3.org/html/wg/html5/#parsing
(or http://www.whatwg.org/specs/web-apps/current-work/#parsing if the
W3C server is unresponsive).

There are implementations in Python, Ruby and Java:
http://code.google.com/p/html5lib/ (Python and Ruby)
http://about.validator.nu/htmlparser/ (Java)

There's also a "not usable yet" C# implementation:
http://code.google.com/p/twintsam/

The Java implementation comes with sample code for using XSLT with  
HTML5.
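
For the Python implementation, a comparable test setup could look
roughly like the sketch below: parse the test page, dump the resulting
tree in a canonical text form, and compare that dump to the intended
result. This is only an illustration, not part of html5lib's own test
suite. The helper names parse_html and dump_tree are made up for this
example, and the dump format is an assumption, not a standard one.

```python
import xml.etree.ElementTree as ET


def parse_html(markup):
    """Parse markup with html5lib (third-party: pip install html5lib)."""
    import html5lib  # imported lazily so dump_tree works without it
    # treebuilder="etree" yields an xml.etree.ElementTree element for <html>.
    return html5lib.parse(markup, treebuilder="etree",
                          namespaceHTMLElements=False)


def dump_tree(element, depth=0):
    """Return an indented text dump of an element tree (hypothetical format)."""
    indent = "  " * depth
    # Comment nodes in ElementTree have a non-string tag.
    tag = element.tag if isinstance(element.tag, str) else "!comment"
    lines = [f"{indent}<{tag}>"]
    # Sort attributes so the dump does not depend on source order.
    for name, value in sorted(element.attrib.items()):
        lines.append(f'{indent}  {name}="{value}"')
    if element.text and element.text.strip():
        lines.append(f'{indent}  "{element.text.strip()}"')
    for child in element:
        lines.extend(dump_tree(child, depth + 1).splitlines())
        # Text following a child belongs to the parent element.
        if child.tail and child.tail.strip():
            lines.append(f'{indent}  "{child.tail.strip()}"')
    return "\n".join(lines)
```

A test would then assert that dump_tree(parse_html(test_page)) equals
a stored expected dump. Because the parsing algorithm also defines
error recovery, the same check applies to malformed input.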

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Saturday, 6 October 2007 20:32:36 UTC
