- From: Ruadhan O'Donoghue <rodonoghue@mtld.mobi>
- Date: Wed, 21 Mar 2007 06:31:11 -0400
- To: "Sean Owen" <srowen@google.com>, <public-mobileok-checker@w3.org>
FWIW, I came across a few of these when scoping ready.mobi.

We are using the TagSoup parser in ready.mobi, and it is extremely robust. I've used it with Java 4 & 5, but not 6.

I considered JTidy also, but it worried me too much that it was not maintained.

Ruadhan

> -----Original Message-----
> From: public-mobileok-checker-request@w3.org [mailto:public-mobileok-checker-request@w3.org] On Behalf Of Sean Owen
> Sent: 21 March 2007 01:25
> To: public-mobileok-checker@w3.org
> Subject: HTML tidying package for Java
>
> Per my action, I did a little digging on HTML-tidying packages for
> Java. My pick:
>
> HtmlCleaner - http://htmlcleaner.sourceforge.net/
> This worked pretty well in my informal testing and looks well maintained
>
> I could be talked into something else -- this just looks best initially.
>
> Other possibilities I considered:
>
> TagSoup - http://home.ccil.org/~cowan/XML/tagsoup/
> Also looks good, though not Java 5 / 6 compatible??
>
> NekoHTML - http://people.apache.org/~andyc/neko/doc/html/
> Looks OK, if a bit more out of date and less full-featured
>
> Java Mozilla HTML Parser - http://sourceforge.net/projects/mozillaparser
> Looks like it's in development
>
> JTidy - http://sourceforge.net/projects/jtidy
> A port of the W3C's HTML Tidy code to Java, but, hasn't been updated in 7 years.
>
> Sean
Received on Wednesday, 21 March 2007 10:33:16 UTC