RE: HTML tidying package for Java

FWIW, I came across a few of these when scoping ready.mobi.

We are using the TagSoup parser in ready.mobi, and it is extremely
robust. I've used it with Java 1.4 and 5, but not 6.

I also considered JTidy, but I was too worried that it is no longer
maintained.

Ruadhan
 
> -----Original Message-----
> From: public-mobileok-checker-request@w3.org
> [mailto:public-mobileok-checker-request@w3.org] On Behalf Of Sean Owen
> Sent: 21 March 2007 01:25
> To: public-mobileok-checker@w3.org
> Subject: HTML tidying package for Java
> 
> 
> Per my action, I did a little digging on HTML-tidying packages for
> Java. My pick:
> 
> HtmlCleaner - http://htmlcleaner.sourceforge.net/
> This worked pretty well in my informal testing and looks well
> maintained
> 
> I could be talked into something else -- this just looks best
> initially.
> 
> 
> Other possibilities I considered:
> 
> TagSoup - http://home.ccil.org/~cowan/XML/tagsoup/
> Also looks good, though not Java 5 / 6 compatible??
> 
> NekoHTML - http://people.apache.org/~andyc/neko/doc/html/
> Looks OK, if a bit more out of date and less full-featured
> 
> Java Mozilla HTML Parser -
> http://sourceforge.net/projects/mozillaparser
> Looks like it's in development
> 
> JTidy - http://sourceforge.net/projects/jtidy
> A port of the W3C's HTML Tidy code to Java, but, hasn't been updated
> in 7 years.
> 
> Sean

Received on Wednesday, 21 March 2007 10:33:16 UTC