Re: Help with a japanese web page

From: Yvon Thoraval <yvon_thoraval@mac.com>
Date: Wed, 20 Dec 2006 16:07:02 +0100
Message-Id: <B6AE6E13-A591-42B8-A5C0-374D922C4D6B@mac.com>
Cc: html-tidy@w3.org
To: Tania Estébanez <fair_ithilien@yahoo.es>

On Thursday 20 Dec. 06 19:06 at 13:35, Tania Estébanez wrote:

>
> http://hotwired.goo.ne.jp/news/print/20000414303.html
>


This page is EUC-JP encoded. If I were you, I would first transcode it to
UTF-8 and then use tidy for the HTML to XML conversion.

Which programming language are you using?

If it is Ruby, transcoding from EUC-JP to UTF-8 (or UTF-16) is easy to do
within Ruby itself.
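
A minimal sketch of that transcoding step, assuming a Ruby with built-in
string encodings (1.9 or later; older versions would need something like
Iconv instead). The helper name euc_jp_to_utf8 and the sample bytes are
only illustrative, not part of any library:

```ruby
# Hypothetical helper: convert raw EUC-JP bytes into a UTF-8 string.
def euc_jp_to_utf8(bytes)
  # Label a copy of the bytes as EUC-JP (no bytes change here), then
  # re-encode to UTF-8. Unconvertible sequences are replaced rather than
  # raising, which is an assumption about what is wanted for messy pages.
  bytes.dup
       .force_encoding(Encoding::EUC_JP)
       .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

# Sample input: three kanji encoded as EUC-JP, held as binary data.
sample = "\xC6\xFC\xCB\xDC\xB8\xEC".b
utf8 = euc_jp_to_utf8(sample)
puts utf8.encoding
```

For a real page you would read the file in binary mode (for example with
File.binread) and pass the result to the helper. The UTF-8 output could then
be given to tidy, probably with its -asxml and -utf8 options.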

In Ruby (as in Perl) there are also methods to guess the encoding of the
input file...
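
One simple way to guess, sketched here as a trial-and-error heuristic rather
than the method of any particular library: try a short list of candidate
encodings and keep the first one under which the bytes are valid. The names
CANDIDATES and guess_encoding are hypothetical, and the candidate order is
an assumption:

```ruby
# Candidate encodings, tried in order (an assumed list for Japanese pages).
CANDIDATES = [Encoding::UTF_8, Encoding::EUC_JP, Encoding::Shift_JIS].freeze

# Hypothetical helper: return the first candidate encoding under which the
# bytes form a valid string, or nil if none of them fits.
def guess_encoding(bytes, candidates = CANDIDATES)
  candidates.find do |enc|
    # Work on a copy so the caller's string keeps its original label.
    bytes.dup.force_encoding(enc).valid_encoding?
  end
end

sample = "\xC6\xFC\xCB\xDC\xB8\xEC".b   # EUC-JP bytes, as binary data
guessed = guess_encoding(sample)
puts guessed ? guessed.name : "unknown"
```

This only checks byte validity, so some inputs are ambiguous (plain ASCII is
valid in all three candidates, for instance). Ruby's nkf library also has
NKF.guess, which may do better on real Japanese text.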

best,

Yvon
Received on Wednesday, 20 December 2006 15:07:42 GMT
