- From: Sean Owen <srowen@google.com>
- Date: Wed, 21 Mar 2007 09:53:57 -0400
- To: "Ruadhan O'Donoghue" <rodonoghue@mtld.mobi>
- Cc: public-mobileok-checker@w3.org
I'm open to TagSoup if it in fact works in Java 5, and has proved useful in practice. HtmlCleaner *looked* better but only at first glance and with a little experimentation.

On 3/21/07, Ruadhan O'Donoghue <rodonoghue@mtld.mobi> wrote:
> FWIW, I came across a few of these when scoping ready.mobi.
>
> We are using the TagSoup parser in ready.mobi, and it is extremely
> robust. I've used it with Java 4 & 5, but not 6.
>
> I considered JTidy also, but it worried me too much that it was not
> maintained.
>
> Ruadhan
>
> > -----Original Message-----
> > From: public-mobileok-checker-request@w3.org
> > [mailto:public-mobileok-checker-request@w3.org] On Behalf Of Sean Owen
> > Sent: 21 March 2007 01:25
> > To: public-mobileok-checker@w3.org
> > Subject: HTML tidying package for Java
> >
> > Per my action, I did a little digging on HTML-tidying packages for
> > Java. My pick:
> >
> > HtmlCleaner - http://htmlcleaner.sourceforge.net/
> > This worked pretty well in my informal testing and looks well maintained
> >
> > I could be talked into something else -- this just looks best initially.
> >
> > Other possibilities I considered:
> >
> > TagSoup - http://home.ccil.org/~cowan/XML/tagsoup/
> > Also looks good, though not Java 5 / 6 compatible??
> >
> > NekoHTML - http://people.apache.org/~andyc/neko/doc/html/
> > Looks OK, if a bit more out of date and less full-featured
> >
> > Java Mozilla HTML Parser - http://sourceforge.net/projects/mozillaparser
> > Looks like it's in development
> >
> > JTidy - http://sourceforge.net/projects/jtidy
> > A port of the W3C's HTML Tidy code to Java, but, hasn't been updated in 7
> > years.
> >
> > Sean
Received on Wednesday, 21 March 2007 13:54:09 UTC