
JTidy I18N Solution. Re: JTidy Status and I18N support

From: Stephen Riek <stephenriek@yahoo.co.uk>
Date: Fri, 7 Feb 2003 09:50:42 +0000 (GMT)
Message-ID: <20030207095042.93594.qmail@web20604.mail.yahoo.com>
To: html-tidy@w3.org

Answering one of my own questions,

I found out that JTidy DOES work fine with multibyte character sets,
using tidy.setCharEncoding(Configuration.RAW).

Thanks to a previous thread.
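
For anyone hitting the same problem: in the method quoted below, the call would go right before tidy.parse(bis, bos). My understanding (an assumption, not something checked against the JTidy source) is that the raw setting leaves the input bytes uninterpreted, so multibyte text survives as long as the output is decoded with the same encoding that produced the input. The sketch below illustrates only that round-trip idea using the JDK alone. The class RawRoundTrip and its helpers copyBytes, roundTrip and mismatched are hypothetical names for illustration, and copyBytes is a stand-in for a byte-transparent step, not JTidy's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class RawRoundTrip {

    // Hypothetical stand-in for a byte-transparent processing step:
    // copies every byte from in to out without interpreting it.
    public static void copyBytes(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    // Encode the string, pass the bytes through untouched, then decode
    // with the same encoding that was used to produce the bytes.
    public static String roundTrip(String s, String encoding) throws IOException {
        ByteArrayInputStream bis = new ByteArrayInputStream(s.getBytes(encoding));
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        copyBytes(bis, bos);
        return bos.toString(encoding);
    }

    // Encode with one charset and decode with another, to show how
    // multibyte text is damaged when the bytes are reinterpreted.
    public static String mismatched(String s, String encoding, String wrong) throws IOException {
        return new String(s.getBytes(encoding), wrong);
    }

    public static void main(String[] args) throws IOException {
        String s = "<p>\u65e5\u672c\u8a9e</p>"; // three Japanese characters in a paragraph
        System.out.println(roundTrip(s, "UTF-8").equals(s));              // true
        System.out.println(mismatched(s, "UTF-8", "US-ASCII").equals(s)); // false
    }
}
```

The second check fails because decoding UTF-8 bytes as US-ASCII replaces every non-ASCII byte, which is the kind of loss a byte-transparent mode avoids.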
 Stephen Riek <stephenriek@yahoo.co.uk> wrote:
Please excuse me if this is an inappropriate forum, but
may I ask whether JTidy is being maintained? (There seem
to be many more features in HTML Tidy, and the last release
of JTidy was more than 18 months ago.)

I'm currently using JTidy and am very impressed with it,
and will try to use it even if it is no longer being maintained.
However, I worry about its i18n support. Using the following
static method, I tried to parse a known UTF-8 String (containing
Asian characters) and JTidy returned an empty string:

public static String parse(String s, String encoding) {
        try {
            ByteArrayInputStream bis = new ByteArrayInputStream(s.getBytes(encoding));
            Tidy tidy = new Tidy();
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            tidy.parse(bis, bos);
            return bos.toString(encoding);
        } catch (java.io.UnsupportedEncodingException e) {
            // Unknown encoding name: report it and return the input unchanged.
            System.out.println("oops " + e.getMessage());
            return s;
        }
}

Should JTidy be fully i18n aware? 

Thank you and apologies for my ignorance,


Received on Friday, 7 February 2003 04:50:44 UTC
