Re: HTML Tidy Bug

At 2:44 PM -0400 7/4/01, wrote:
>    I've downloaded HTML Tidy recently and found a quite serious bug: it
>doesn't support Unicode (UTF-8). For example: the correct sequence 0xC5 0xBE
>(LATIN SMALL LETTER Z WITH CARON) was translated into ž

Works fine for me with the 04 Aug 00 (and later) version of Tidy (although
I cheated a bit to get that char sequence into the document). You need to
specify UTF-8 as the char encoding when using Tidy - depending on your
platform, this may be as simple as specifying "-utf8" on the command line
(otherwise the default "-ascii" will give the results you described above).
Perhaps you have some conflicting options set in a config file.
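
To see why the encoding setting matters, here is a small Python sketch (not
Tidy's own code) that decodes the two bytes both ways. It assumes that a
non-UTF-8 reader treats each byte as a separate Latin-1 character, which is
my guess at what happens without "-utf8"; the variable names are just for
illustration.

```python
import unicodedata

# The two bytes from the report.
raw = bytes([0xC5, 0xBE])

# Read as UTF-8, the pair forms a single character.
as_utf8 = raw.decode("utf-8")

# Read one byte per character (Latin-1), it becomes two unrelated characters.
as_latin1 = raw.decode("latin-1")

utf8_names = [unicodedata.name(ch) for ch in as_utf8]
latin1_names = [unicodedata.name(ch) for ch in as_latin1]

print(utf8_names)
print(latin1_names)
```

If the output of Tidy shows two characters (or two entities) where one was
expected, the input was most likely not read as UTF-8.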

Hope this helps.

Regards, Terry

Received on Thursday, 5 July 2001 03:41:01 UTC