
[whatwg] XSLT: HTML 5 --> HTML

From: Henri Sivonen <hsivonen@iki.fi>
Date: Tue, 6 Feb 2007 13:59:26 +0200
Message-ID: <4ED076D0-4B7F-4259-A8D6-7579B884428D@iki.fi>
On Feb 6, 2007, at 13:23, Elliotte Harold wrote:

> It would probably have to be done in two parts. First make the  
> document well-formed (possibly with a TagSoup fork). Then run the  
> stylesheet. The problem with TagSoup is that it treats bogons  
> (unknown elements) as empty. It also doesn't quite follow Web Apps  
> 1.0's error recovery algorithm. Possibly I could base the initial  
> step on html5lib instead.

My parser[1] doesn't follow the WA10 (Web Apps 1.0) parsing algorithm  
*yet*, either. However, as a tentative Java-only solution that needs  
no Python, you could put it in a pipeline with a RELAX NG validator  
(using the whattf.org schemas[2]) to get Draconian failure in exactly  
the cases where the WA10 parsing algorithm's error recovery would  
kick in.

Basically, the parser would report its SAX events to a ContentHandler  
splitter. The splitter would show each event to Jing/oNVDL first. The  
validator would use DraconianErrorHandler, so the first validation  
error aborts the pipeline (Jing/oNVDL is fail-fast). Only after the  
validator has accepted an event would the splitter show the same  
event to a TrAX TransformerHandler.
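
The splitter described above can be sketched roughly as follows. This
is only an illustration that runs on JDK classes alone, and every
class name in it (TeeContentHandler, FailFastErrorHandler,
StandInValidator, SplitterDemo) is made up for the example. The JDK
XML parser stands in for the HTML parser, StandInValidator stands in
for Jing/oNVDL and simply rejects an element named "bogon", and
FailFastErrorHandler plays the role of DraconianErrorHandler. The
TransformerHandler here is an identity transform; a real pipeline
would build it from the stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical splitter: shows each SAX event to `first`, then to `second`. */
final class TeeContentHandler implements ContentHandler {
    private final ContentHandler first, second;
    TeeContentHandler(ContentHandler first, ContentHandler second) {
        this.first = first;
        this.second = second;
    }
    public void setDocumentLocator(Locator l) { first.setDocumentLocator(l); second.setDocumentLocator(l); }
    public void startDocument() throws SAXException { first.startDocument(); second.startDocument(); }
    public void endDocument() throws SAXException { first.endDocument(); second.endDocument(); }
    public void startPrefixMapping(String p, String u) throws SAXException {
        first.startPrefixMapping(p, u); second.startPrefixMapping(p, u);
    }
    public void endPrefixMapping(String p) throws SAXException {
        first.endPrefixMapping(p); second.endPrefixMapping(p);
    }
    public void startElement(String u, String l, String q, Attributes a) throws SAXException {
        first.startElement(u, l, q, a);   // the validator may throw here...
        second.startElement(u, l, q, a);  // ...so the transformer never sees the event
    }
    public void endElement(String u, String l, String q) throws SAXException {
        first.endElement(u, l, q); second.endElement(u, l, q);
    }
    public void characters(char[] c, int s, int n) throws SAXException {
        first.characters(c, s, n); second.characters(c, s, n);
    }
    public void ignorableWhitespace(char[] c, int s, int n) throws SAXException {
        first.ignorableWhitespace(c, s, n); second.ignorableWhitespace(c, s, n);
    }
    public void processingInstruction(String t, String d) throws SAXException {
        first.processingInstruction(t, d); second.processingInstruction(t, d);
    }
    public void skippedEntity(String n) throws SAXException { first.skippedEntity(n); second.skippedEntity(n); }
}

/** Stand-in for DraconianErrorHandler: every error is rethrown. */
final class FailFastErrorHandler implements ErrorHandler {
    public void warning(SAXParseException e) { /* warnings are ignored in this sketch */ }
    public void error(SAXParseException e) throws SAXException { throw e; }
    public void fatalError(SAXParseException e) throws SAXException { throw e; }
}

/** Stand-in for the RELAX NG validator: reports any element named "bogon". */
final class StandInValidator extends DefaultHandler {
    private final ErrorHandler errors;
    private Locator locator;
    StandInValidator(ErrorHandler errors) { this.errors = errors; }
    @Override public void setDocumentLocator(Locator l) { locator = l; }
    @Override public void startElement(String u, String l, String q, Attributes a) throws SAXException {
        if ("bogon".equals(l)) {
            errors.error(new SAXParseException("unknown element: " + q, locator));
        }
    }
}

final class SplitterDemo {
    /** Parses `xml`, validating and transforming in one pass; throws on the first error. */
    static String run(String xml) throws Exception {
        SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler transformer = tf.newTransformerHandler(); // identity transform
        transformer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.setResult(new StreamResult(out));

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        ContentHandler validator = new StandInValidator(new FailFastErrorHandler());
        reader.setContentHandler(new TeeContentHandler(validator, transformer));
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<p>ok</p>"));
    }
}
```

Because the validator sees each event before the transformer does, an
exception from the error handler stops the parse before the offending
event reaches the TransformerHandler. Output for the events already
passed along may have been produced by then, so a caller would
discard the result when run() throws.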

[1] http://hsivonen.iki.fi/validator-about/htmlparser.jar
[2] http://syntax.whattf.org/

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/
Received on Tuesday, 6 February 2007 03:59:26 UTC
