HTML tidying package for Java

From: Sean Owen <srowen@google.com>
Date: Tue, 20 Mar 2007 21:25:09 -0400
Message-ID: <e920a71c0703201825i3e4e9cabm524ec93ae39c2864@mail.gmail.com>
To: public-mobileok-checker@w3.org

Per my action, I did a little digging on HTML-tidying packages for
Java. My pick:

HtmlCleaner - http://htmlcleaner.sourceforge.net/
This worked pretty well in my informal testing and looks well maintained.

I could be talked into something else -- this just looks best initially.


Other possibilities I considered:

TagSoup - http://home.ccil.org/~cowan/XML/tagsoup/
Also looks good, though I'm not sure it is compatible with Java 5 / 6.

NekoHTML - http://people.apache.org/~andyc/neko/doc/html/
Looks OK, if a bit more out of date and less full-featured

Java Mozilla HTML Parser - http://sourceforge.net/projects/mozillaparser
Looks like it's in development

JTidy - http://sourceforge.net/projects/jtidy
A port of the W3C's HTML Tidy code to Java, but it hasn't been updated in 7 years.

Sean
Received on Wednesday, 21 March 2007 01:25:28 GMT
