Problems with Char-encoding - HTML TIDY

From: Luana Knoff <formiga_lua@yahoo.com.br>
Date: Fri, 7 Jul 2006 17:14:25 -0300 (ART)
Message-ID: <20060707201425.88562.qmail@web32101.mail.mud.yahoo.com>
To: html tidy <html-tidy@w3.org>
Hi all,

I have some questions about using HTML Tidy. I need to convert the character encoding to UTF-8, but when I do, strange characters appear in place of "ç" and accented letters. For example, the word "Serviços" appears as "Servi?os". If I first convert to ASCII and then to UTF-8, the strange characters do not appear.
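
The symptom can be reproduced outside Tidy. The following is only a minimal Python sketch, not Tidy's actual behaviour, and it rests on an assumption: that the source file is saved as ISO-8859-1 (Latin-1), which the message does not state. Under that assumption, reading the file's bytes as UTF-8 turns the accented letter into a replacement character, while decoding with the real input encoding first preserves it.

```python
# Illustration only. Assumption: the page is stored as ISO-8859-1 (Latin-1).
word = "Serviços"
latin1_bytes = word.encode("latin-1")  # the bytes as they would sit on disk

# Treating Latin-1 bytes as UTF-8: the byte for "ç" (0xE7) is not valid
# UTF-8 in this position, so a lenient decoder substitutes U+FFFD.
misread = latin1_bytes.decode("utf-8", errors="replace")

# Decoding with the actual input encoding, then encoding as UTF-8,
# keeps the character intact.
converted = latin1_bytes.decode("latin-1").encode("utf-8")

print(repr(misread))
print(converted.decode("utf-8"))
```

If this is indeed the cause, it may be worth checking Tidy's separate `input-encoding` and `output-encoding` options, since `char-encoding` applies to both input and output; whether that resolves this particular case has not been verified here.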

The command lines are as follows.
First I run:

tidy --char-encoding ascii --tidy-mark no --wrap 99 --output-xml yes --output-xhtml yes
--output-html yes --doctype omit --numeric-entities yes --quote-marks yes --quote-nbsp yes
--quote-ampersand yes --logical-emphasis yes --enclose-text yes --alt-text empty
--write-back yes --quiet yes -m teste_imagem.htm


Then I have to run it again:

tidy --char-encoding utf8 --tidy-mark no --wrap 99 --output-xml yes --output-xhtml yes
--output-html yes --doctype omit --numeric-entities yes --quote-marks yes --quote-nbsp yes
--quote-ampersand yes --logical-emphasis yes --enclose-text yes --alt-text empty
--write-back yes --quiet yes -m teste_imagem.htm

Does anyone know how to convert the page to UTF-8 without having to do the transformation twice?

Any hints are welcome.

Luana 


 		
Received on Friday, 7 July 2006 20:17:39 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 3 April 2012 06:13:56 GMT