Re: gb2312 encoding

From: Arnaud Desitter <arnaud02@users.sourceforge.net>
Date: Mon, 4 Aug 2008 16:08:31 +0100
Message-ID: <a240ddd00808040808n680bccf5i14e8ec869cd7f163@mail.gmail.com>
To: alex242 <pshenichnov@gmail.com>
Cc: html-tidy@w3.org

Hi,

You need to convert your files from whatever encoding they are in
(GB2312 in your case) to UTF-8 before passing them to Tidy. iconv is
one option.
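
With iconv, the conversion would look something like
`iconv -f GB2312 -t UTF-8 in.html > out.html`. If iconv is not available,
the same re-encoding can be sketched in a few lines of Python. This is only
an illustration: the function name and the file names are hypothetical, and
it assumes the input really is valid GB2312 (a decode error is raised
otherwise).

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="gb2312"):
    """Re-encode a text file as UTF-8 (hypothetical helper, not part of Tidy)."""
    # Read raw bytes and decode them explicitly, so the platform's
    # default encoding never gets involved.
    text = Path(src).read_bytes().decode(source_encoding)
    # Write the same text back out as UTF-8 (no BOM).
    Path(dst).write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    # Assumed file names, for illustration only.
    convert_to_utf8("in.html", "out.html")
```

The converted file can then be given to Tidy with char-encoding: utf8, as
in your config.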

Regards,

2008/8/4 alex242 <pshenichnov@gmail.com>:
>
> Hello,
>
> I have some problems with HTML documents in GB2312 (Simplified Chinese)
> encoding. After running Tidy I get some unreadable characters. Example
> document: http://www.chemspider.com/ArticlesHandler.ashx?type=art&id=69
>
> my Tidy config file:
>
> output-file: res.html
> error-file: error.txt
> char-encoding: utf8
> output-bom: yes
> output-encoding: utf8
>
> I've already spent several days trying to solve this problem without any
> success, so if somebody can give some advice on how to work with different
> encodings in Tidy, it will be much appreciated.
>
> best regards,
> Alex
Received on Monday, 4 August 2008 15:10:12 GMT
