HTML tidying package for Java

Per my action item, I did a little digging on HTML-tidying packages for
Java. My pick:

HtmlCleaner - http://htmlcleaner.sourceforge.net/
This worked pretty well in my informal testing and looks well maintained.

I could be talked into something else -- this just looks best initially.


Other possibilities I considered:

TagSoup - http://home.ccil.org/~cowan/XML/tagsoup/
Also looks good, though I'm not sure it is compatible with Java 5 / 6.

NekoHTML - http://people.apache.org/~andyc/neko/doc/html/
Looks OK, if a bit more out of date and less full-featured.

Java Mozilla HTML Parser - http://sourceforge.net/projects/mozillaparser
Looks like it's still in development.

JTidy - http://sourceforge.net/projects/jtidy
A port of the W3C's HTML Tidy code to Java, but it hasn't been updated in 7 years.

Sean

Received on Wednesday, 21 March 2007 01:25:28 UTC