
Re: Tidy - UTF-8 problem

From: anusha r <suisse.chocolat@gmail.com>
Date: Wed, 2 Jul 2008 03:34:43 -0700
Message-ID: <6c1f6f8a0807020334vc184c4ey5054edd585901f22@mail.gmail.com>
To: html-tidy@w3.org
On Wed, Jul 2, 2008 at 3:33 AM, anusha r <suisse.chocolat@gmail.com> wrote:

> Hi
>
> I am running Tidy on an input XML file containing a right single quotation
> mark (U+2019) --> [ ’ ]. The document's encoding declaration correctly
> specifies UTF-8. Running Tidy from the command line outputs the quote
> properly, while JTidy substitutes other Unicode characters for the quote.
> My command-line options were: --escape-cdata false -xml -utf8
> My Java code has:
>             tidy.setEscapeCdata(false);
>             tidy.setXmlTags(true);
>             tidy.setCharEncoding(Configuration.UTF8);

Could anyone help me get JTidy to output the quote unchanged?

Thanks
Anusha

P.S. I sent the first, incomplete mail by mistake.
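
One way to narrow this down, offered only as a sketch: the thread does not establish the cause, but a UTF-8 quote turning into "other Unicode characters" is the typical symptom of UTF-8 bytes being decoded with a single-byte charset somewhere around the JTidy call. The snippet below uses only the JDK and does not call JTidy at all. The class name QuoteEncodingCheck and its helpers are hypothetical names chosen for illustration. It prints the code points obtained when the three UTF-8 bytes of U+2019 are decoded correctly and when they are decoded as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic class; not part of JTidy or of the original post.
public class QuoteEncodingCheck {
    // U+2019 RIGHT SINGLE QUOTATION MARK, written as an escape so that the
    // encoding of this source file cannot alter it.
    static final String QUOTE = "\u2019";

    // Decode raw bytes with an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Render a string as space-separated code points, e.g. "U+2019".
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // The quote occupies three bytes in UTF-8: E2 80 99.
        byte[] utf8 = QUOTE.getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset gives back the single character.
        System.out.println(codePoints(decode(utf8, StandardCharsets.UTF_8)));

        // Decoding the same bytes as ISO-8859-1 yields three characters,
        // one per byte, which is how the quote gets "substituted".
        System.out.println(codePoints(decode(utf8, StandardCharsets.ISO_8859_1)));
    }
}
```

If the characters JTidy emits match the second line of output, the bytes are probably being converted to characters with the platform default charset at some point before or after the parse, rather than with UTF-8. That is an assumption to verify against the actual program, not a confirmed diagnosis.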
Received on Wednesday, 2 July 2008 15:29:26 GMT
