
[whatwg] Test cases for parsing spec (Was: Re: Providing Better Tools)

From: Karl Dubost <karl@w3.org>
Date: Thu, 7 Dec 2006 14:55:07 +0900
Message-ID: <645E909E-7D8E-42AC-841A-DAA954F0780E@w3.org>
Sam,

On 6 Dec. 2006 at 23:13, Sam Ruby wrote:
> My original interest was to write a replacement for Python's  
> SGMLLIB, i.e., one that was not based on the theoretical ideal of  
> how SGML vocabularies work, but one based on the practical notion  
> of how HTML actually is parsed.

I'm not sure sgmllib would be the best target, especially if it's
used in many other products. But maybe you are talking about a new
library altogether.


     http://docs.python.org/lib/module-sgmllib.html
     8.2 sgmllib -- Simple SGML parser

     This module defines a class SGMLParser which serves as the basis for
     parsing text files formatted in SGML (Standard Generalized Mark-up
     Language). In fact, it does not provide a full SGML parser -- it only
     parses SGML insofar as it is used by HTML, and the module only exists
     as a base for the htmllib module. Another HTML parser which supports
     XHTML and offers a somewhat different interface is available in the
     HTMLParser module.

The HTMLParser module seems a better candidate.

     http://docs.python.org/lib/module-HTMLParser.html
     8.1 HTMLParser -- Simple HTML and XHTML parser

      New in version 2.2.

     This module defines a class HTMLParser which serves as the basis for
     parsing text files formatted in HTML (HyperText Mark-up Language) and
     XHTML. Unlike the parser in htmllib, this parser is not based on the
     SGML parser in sgmllib.
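
To make the interface in that excerpt concrete, here is a minimal sketch
of subclassing HTMLParser. It assumes a current Python 3, where the
module lives at html.parser rather than under the HTMLParser name used
in the 2006 docs. TagCollector and collect are hypothetical names made
up for this illustration. The class only reports the tokens it sees; it
does not build a tree or insert the end tags a browser would imply.

```python
# Python 3 location of the module; the 2006 documentation calls it HTMLParser.
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records start tags, end tags and text in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Called once for every start tag found in the input.
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        # Called only for end tags that literally appear in the input.
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Skip whitespace-only text to keep the event list short.
        if data.strip():
            self.events.append(("data", data.strip()))


def collect(markup):
    """Feed markup to a fresh TagCollector and return its event list."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.events


if __name__ == "__main__":
    # The first <p> is never closed in the source, so no end tag is reported for it.
    for event in collect("<p>one<p>two</p>"):
        print(event)
```

Running it on "<p>one<p>two</p>" reports two start tags but only one end
tag, which is the gap between tokenizing and parsing HTML the way
browsers do.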


I'm adding them to the list of HTML parsers.
http://esw.w3.org/topic/HTMLAsSheAreSpoke




-- 
Karl Dubost - http://www.w3.org/People/karl/
W3C Conformance Manager, QA Activity Lead
   QA Weblog - http://www.w3.org/QA/
      *** Be Strict To Be Cool ***
Received on Wednesday, 6 December 2006 21:55:07 UTC
