Re: HTML tidying package for Java

I'm open to TagSoup if it does in fact work with Java 5 and has proved
useful in practice. HtmlCleaner *looked* better, but that was only at
first glance and with a little experimentation.

On 3/21/07, Ruadhan O'Donoghue <rodonoghue@mtld.mobi> wrote:
> FWIW, I came across a few of these when scoping ready.mobi.
>
> We are using the TagSoup parser in ready.mobi, and it is extremely
> robust. I've used it with Java 4 & 5, but not 6.
>
> I considered JTidy also, but it worried me too much that it was not
> maintained.
>
> Ruadhan
>
> > -----Original Message-----
> > From: public-mobileok-checker-request@w3.org
> > [mailto:public-mobileok-checker-request@w3.org] On Behalf Of Sean Owen
> > Sent: 21 March 2007 01:25
> > To: public-mobileok-checker@w3.org
> > Subject: HTML tidying package for Java
> >
> >
> > Per my action, I did a little digging on HTML-tidying packages for
> > Java. My pick:
> >
> > HtmlCleaner - http://htmlcleaner.sourceforge.net/
> > This worked pretty well in my informal testing and looks well
> > maintained
> >
> > I could be talked into something else -- this just looks best
> > initially.
> >
> >
> > Other possibilities I considered:
> >
> > TagSoup - http://home.ccil.org/~cowan/XML/tagsoup/
> > Also looks good, though not Java 5 / 6 compatible??
> >
> > NekoHTML - http://people.apache.org/~andyc/neko/doc/html/
> > Looks OK, if a bit more out of date and less full-featured
> >
> > Java Mozilla HTML Parser - http://sourceforge.net/projects/mozillaparser
> > Looks like it's in development
> >
> > JTidy - http://sourceforge.net/projects/jtidy
> > A port of the W3C's HTML Tidy code to Java, but, hasn't been updated
> > in 7 years.
> >
> > Sean
>
>

Received on Wednesday, 21 March 2007 13:54:09 UTC