
Tidy - UTF-8 problem

From: anusha r <suisse.chocolat@gmail.com>
Date: Wed, 2 Jul 2008 03:33:04 -0700
Message-ID: <6c1f6f8a0807020333k33568ed5w448ce33fd4f8848e@mail.gmail.com>
To: html-tidy@w3.org
Hi

I am running Tidy on an input XML file containing a right single quotation
mark (U+2019) --> [ ’ ]. The document's encoding declaration correctly
specifies UTF-8. Running Tidy from the command line outputs the quote
properly, while JTidy substitutes other Unicode characters for it.
My command-line options were: --escape-cdata false -xml -utf8
My Java code has:
    tidy.setEscapeCdata(false);
    tidy.setXmlTags(true);
    tidy.setCharEncoding(Configuration.UTF8);
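
For reference, here is a minimal sketch of one way a substitution like this can arise. It assumes (this is not confirmed for my setup) that the UTF-8 bytes are decoded with a different charset somewhere around the JTidy call, for example by a Reader or Writer created without an explicit encoding. It uses only the Java standard library and does not call JTidy; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what happens to U+2019 when its UTF-8
// bytes are decoded with the wrong charset. It does not use JTidy.
class QuoteEncodingDemo {

    // Encode the text with one charset, then decode the bytes with another.
    public static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    // Render each UTF-16 code unit as U+XXXX so the result is readable
    // regardless of the console encoding.
    public static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String quote = "\u2019"; // right single quotation mark

        // Matching charsets: the character survives unchanged.
        String ok = roundTrip(quote, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        // Mismatched charsets: the three UTF-8 bytes E2 80 99 each become
        // a separate character.
        String wrong = roundTrip(quote, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 -> UTF-8:      " + codeUnits(ok));
        System.out.println("UTF-8 -> ISO-8859-1: " + codeUnits(wrong));
    }
}
```

If the characters I get from JTidy match the second line of output, the input or output stream is probably being decoded with a non-UTF-8 charset, independently of the setCharEncoding call.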
Received on Wednesday, 2 July 2008 15:29:27 GMT
