
Re: HTML Tidy Bug

From: Terry Teague <terry_teague@users.sourceforge.net>
Date: Thu, 5 Jul 2001 00:38:55 -0700
Message-Id: <l03130300b769c8b93f6a@[17.219.108.31]>
To: <html-tidy@w3.org>
At 2:44 PM -0400 7/4/01, protron@seznam.cz wrote:
>    I've downloaded HTML Tidy recently and found quite a serious bug: it
>doesn't support Unicode (UTF-8). For example, the correct sequence 0xC5 0xBE
>(LATIN SMALL LETTER Z WITH CARON) was translated into &Aring;&frac34;

This works fine for me with the 04 Aug 00 (and later) version of Tidy
(although I cheated a bit to get that byte sequence into the document). You
need to specify UTF-8 as the character encoding when running Tidy; depending
on your platform, this may be as simple as passing "-utf8" on the command
line. Otherwise the default, "-ascii", gives the results you described above.
If that doesn't help, perhaps you have conflicting options set in a config
file.
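
To see why the encoding setting matters, here is a small Python sketch (not Tidy's own code) that decodes the two bytes both ways. It assumes that without "-utf8" each byte is read as a separate Latin-1 character and written out as a named entity, which is what the reported output suggests. The helper name to_entities is made up for this illustration.

```python
from html.entities import codepoint2name
import unicodedata

# The two bytes from the bug report.
raw = bytes([0xC5, 0xBE])


def to_entities(text):
    """Replace each non-ASCII character with an HTML entity.

    Hypothetical helper for illustration only: uses a named entity when
    one exists, otherwise falls back to a numeric character reference.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            out.append(f"&#{cp};")
    return "".join(out)


# Read as UTF-8, the two bytes form a single character.
as_utf8 = raw.decode("utf-8")

# Read as Latin-1, each byte becomes its own character.
as_latin1 = raw.decode("latin-1")

print(len(as_utf8), unicodedata.name(as_utf8))
print(len(as_latin1), to_entities(as_latin1))
```

Read as UTF-8 the bytes are one character (U+017E); read byte by byte they become two characters, 0xC5 and 0xBE, whose named entities are exactly the &Aring;&frac34; seen in the report.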

Hope this helps.

Regards, Terry
Received on Thursday, 5 July 2001 03:41:01 GMT
