
[whatwg] Test cases for parsing spec (Was: Re: Provding Better Tools)

From: Anne van Kesteren <annevk@opera.com>
Date: Wed, 06 Dec 2006 15:19:48 +0100
Message-ID: <op.tj453az264w2qv@id-c0020.oslo.opera.com>
On Wed, 06 Dec 2006 15:13:26 +0100, Sam Ruby <rubys at intertwingly.net>  
wrote:
> Count me in.  This is actually closer to the original reason why I  
> originally subscribed to this list.  If given a few tests, I could  
> convert them into a useful form, and this form could serve as a model for  
> future tests.
>
> My original interest was to write a replacement for Python's SGMLLIB,  
> i.e., one that was not based on the theoretical ideal of how SGML  
> vocabularies work, but one based on the practical notion of how HTML  
> actually is parsed.

The HTMLTokenizer for such a project is mostly finished already:

   http://code.google.com/p/html5lib/

(As in, it actually emits the tokens it has to. I'm quite happy about it!)

James Graham has been working on the Tree Construction part of the process  
(called HTMLParser in parser.py) and Lachlan Hunt is working on an  
HTMLInputStream class which handles some of the specifics needed for the  
input stream.
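
To make the division of labour concrete, here is a rough toy sketch in Python of how the three pieces could fit together: an input stream feeding a tokenizer, whose tokens feed tree construction. Every name in it (ToyInputStream, toy_tokenize, toy_build_tree) is made up for illustration and is not html5lib's actual API. The handling is also far simpler than what the parsing spec requires (no attributes, comments, doctypes or entities), so treat it as a picture of the pipeline rather than an implementation.

```python
# Toy illustration only: these names are hypothetical, NOT html5lib's API,
# and the behaviour is a small fraction of what the parsing spec requires.

class ToyInputStream:
    """Normalizes newlines and hands out one character at a time."""

    def __init__(self, text):
        self.text = text.replace("\r\n", "\n").replace("\r", "\n")
        self.pos = 0

    def char(self):
        """Return the next character, or None at end of input."""
        if self.pos >= len(self.text):
            return None
        c = self.text[self.pos]
        self.pos += 1
        return c


def toy_tokenize(stream):
    """Yield ("StartTag", name), ("EndTag", name) or ("Characters", data)."""
    text = []
    c = stream.char()
    while c is not None:
        if c == "<":
            # Flush any pending character data before the tag token.
            if text:
                yield ("Characters", "".join(text))
                text = []
            name = []
            c = stream.char()
            while c is not None and c != ">":
                name.append(c)
                c = stream.char()
            tag = "".join(name)
            if tag.startswith("/"):
                yield ("EndTag", tag[1:].lower())
            else:
                yield ("StartTag", tag.lower())
        else:
            text.append(c)
        c = stream.char()
    if text:
        yield ("Characters", "".join(text))


def toy_build_tree(tokens):
    """Build nested [name, children] lists, tolerating stray end tags."""
    root = ["#document", []]
    open_elements = [root]
    for kind, value in tokens:
        if kind == "StartTag":
            node = [value, []]
            open_elements[-1][1].append(node)
            open_elements.append(node)
        elif kind == "EndTag":
            # Close up to the nearest matching open element;
            # an end tag that matches nothing is simply ignored.
            names = [n[0] for n in open_elements]
            if value in names[1:]:
                idx = len(names) - 1 - names[::-1].index(value)
                del open_elements[idx:]
        else:
            open_elements[-1][1].append(value)
    return root


tokens = list(toy_tokenize(ToyInputStream("<p>Hi <B>there</p>\r\n</i>")))
tree = toy_build_tree(tokens)
```

Note that the unclosed <B> and the stray </i> do not raise errors: the tree stage recovers from them, which is the "how HTML actually is parsed" point from the quoted message. The recovery rules used here are invented for the sketch and are not the ones in the spec.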


-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>
Received on Wednesday, 6 December 2006 06:19:48 UTC
