- From: Terry Teague <terry_teague@users.sourceforge.net>
- Date: Thu, 5 Jul 2001 00:38:55 -0700
- To: <html-tidy@w3.org>
At 2:44 PM -0400 7/4/01, protron@seznam.cz wrote:
> I've downloaded HTML Tidy recently and found quite a serious bug: it
> doesn't support Unicode (UTF-8). For example: the correct sequence 0xC5 0xBE
> (LATIN SMALL LETTER Z WITH CARON) was translated into ž

Works fine for me with the 04 Aug 00 (and later) version of Tidy (although I cheated a bit to get that char sequence into the document).

You need to specify UTF-8 as the char encoding when using Tidy. Depending on your platform, this may be as simple as specifying "-utf8" on the command line (otherwise the default "-ascii" will give the results you described above). Perhaps you have some conflicting options set in a config file.

Hope this helps.

Regards,
Terry
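To illustrate why the encoding option matters, here is a minimal Python sketch of what happens to the two bytes from the report when they are read as UTF-8 versus as a single-byte encoding. It is not Tidy's code. It assumes that reading input without "-utf8" amounts to treating each byte as one Latin-1 character and writing non-ASCII characters as named entities; the variable names are only for illustration.

```python
import unicodedata
from html.entities import codepoint2name

# The two bytes from the report.
raw = bytes([0xC5, 0xBE])

# Read as UTF-8, the pair decodes to a single character.
as_utf8 = raw.decode("utf-8")
print(len(as_utf8), unicodedata.name(as_utf8))

# Read as Latin-1 (assumed stand-in for a single-byte default),
# each byte becomes its own character.
as_latin1 = raw.decode("latin-1")
for ch in as_latin1:
    print(unicodedata.name(ch))

# Written out as ASCII with named entities, the misread pair looks like this.
entities = "".join("&%s;" % codepoint2name[ord(ch)] for ch in as_latin1)
print(entities)
```

If the output of Tidy contains two unrelated characters or entities where one accented letter was expected, that is a hint that the input was not read as UTF-8.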
Received on Thursday, 5 July 2001 03:41:01 UTC