- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Fri, 14 Feb 2003 14:46:31 +0100
- To: "Lucas W. Fletcher" <lucas@dealersinnotions.com>
- Cc: html-tidy@w3.org
* Lucas W. Fletcher wrote:
>Is anyone aware of a publicly available program that
>uses the MSHTML API to convert an HTML file into XHTML?

I wrote a little Perl script that converts the Internet Explorer DOM to
a SAX (Simple API for XML) event stream (search the archives of
tidy-develop@lists.sourceforge.net / perl-xml@lists.activestate.com).

>If one assumes that the ultimate version of Tidy is one
>where it can parse pages in as fault-tolerant a manner as
>the popular browsers such as IE, then wouldn't it make sense
>to actually utilize the DOM exposed by the browser itself
>in order to create the XHTML?

The DOM created by Internet Explorer from broken documents is rather
useless. For example

  <p>1<em>2<strong>3</em>4</strong>5<p>6

In the MSHTML DOM this is represented as being

  <p>1<em>2<strong>34</strong></em>45</p>
  <p>6</p>

while the expected result (what IE renders) is either

  <p>1<em>2</em><strong><em>3</em>4</strong>5</p>
  <p>6</p>

or

  <p>1<em>2<strong>3</strong></em><strong>4</strong>5</p>
  <p>6</p>
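To make the difference concrete, here is a minimal Python sketch that compares the four markup strings from this message. It is an illustration only, not the Perl script mentioned above and not how MSHTML works internally. It assumes a simplified rendering model in which <em> and <strong> act as on/off switches for the text that follows, which is consistent with the two expected results. The names StyleTracker and styled_runs are hypothetical helpers invented for this sketch.

```python
from html.parser import HTMLParser

# Inline elements tracked by this sketch (assumption: only these two matter here).
INLINE = {"em", "strong"}

# Markup strings copied from the message.
SOUP = "<p>1<em>2<strong>3</em>4</strong>5<p>6"
MSHTML = "<p>1<em>2<strong>34</strong></em>45</p> <p>6</p>"
EXPECTED_A = "<p>1<em>2</em><strong><em>3</em>4</strong>5</p> <p>6</p>"
EXPECTED_B = "<p>1<em>2<strong>3</strong></em><strong>4</strong>5</p> <p>6</p>"


class StyleTracker(HTMLParser):
    """Record every non-space character with the inline styles active at that point."""

    def __init__(self):
        super().__init__()
        self.active = []  # inline tags currently switched on
        self.runs = []    # list of (character, frozenset of active styles)

    def handle_starttag(self, tag, attrs):
        if tag in INLINE:
            self.active.append(tag)

    def handle_endtag(self, tag):
        # Treat an end tag as "switch this style off", wherever it was opened.
        if tag in INLINE and tag in self.active:
            self.active.remove(tag)

    def handle_data(self, data):
        for ch in data:
            if not ch.isspace():
                self.runs.append((ch, frozenset(self.active)))


def styled_runs(markup):
    """Return the (character, styles) pairs for a markup string."""
    tracker = StyleTracker()
    tracker.feed(markup)
    tracker.close()
    return tracker.runs


def text_of(markup):
    """Return the visible non-space text of a markup string."""
    return "".join(ch for ch, _ in styled_runs(markup))


if __name__ == "__main__":
    for name, markup in [("soup", SOUP), ("mshtml", MSHTML),
                         ("expected A", EXPECTED_A), ("expected B", EXPECTED_B)]:
        print(name, text_of(markup), styled_runs(markup) == styled_runs(SOUP))
```

Under this model both expected results give every character the same styling as the original input, while the MSHTML form repeats the "4" and puts one copy inside <em>, so serializing it directly would change both the text and its emphasis.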
Received on Friday, 14 February 2003 08:46:04 UTC