Re: Unicode conversion using tidyHTML

Hello,

Thank you so much for your reply.

Our application does not use a file as input. Rather, users typically copy or cut text from, say, a Word document, then paste it into the HTML control of our application. The application then calls tidyHTML to tidy up the content.

The copied text typically contains Unicode characters, and when tidyHTML is called, these characters are converted to question marks or other characters. We want to retain the original characters.
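
The symptom described above can be reproduced without Tidy. The following is a minimal sketch, assuming (this is not confirmed anywhere in the thread) that the pasted text is forced through a single-byte code page somewhere on its way into or out of tidyHTML. The code page `cp1252` and the helper names are illustrative choices only.

```python
# Sketch: a character that does not exist in a single-byte code page
# is replaced with "?" when the text is forced into that code page.
# ASSUMPTION: cp1252 is only an example; the actual code page is unknown.

ARROW = "\u2192"  # RIGHTWARDS ARROW, the character from the report


def through_legacy_codepage(text: str, codepage: str = "cp1252") -> str:
    """Round-trip text through a single-byte code page.

    Characters the code page cannot represent are replaced with "?".
    """
    return text.encode(codepage, errors="replace").decode(codepage)


def through_utf8(text: str) -> str:
    """Round-trip text through UTF-8, which can represent every character."""
    return text.encode("utf-8").decode("utf-8")


sample = "A " + ARROW + " B"
lossy = through_legacy_codepage(sample)   # the arrow becomes "?"
lossless = through_utf8(sample)           # the arrow survives

print(lossy)
print(lossless)
```

If the output of the first call matches what the application shows, the loss is probably happening in an encoding conversion rather than inside the clean-up step itself.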

What is the best way to handle this?

Thanks for your help,
Anna

----- Original Message ----
From: Yvon Thoraval <yvon_thoraval@mac.com>
To: Lissa Labada <lissalabada@yahoo.com.ph>
Cc: html-tidy@w3.org
Sent: Friday, December 22, 2006 1:15:20 AM
Subject: Re: Unicode conversion using tidyHTML



On Thursday 20 Dec. 06 19:06 at 12:08, Lissa Labada wrote:


 
Is there a way to use tidyHTML such that it will return back the original character?  If not, what is the best way to handle this?
 




Transcode your original file to UTF-8 (or UTF-16) first...
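
The same advice can be applied to an in-memory string rather than a file. Below is a minimal sketch under stated assumptions: `tidy_utf8_bytes` is a hypothetical placeholder for whatever call the application makes into Tidy, assumed to be configured for UTF-8 input and output (for the command-line tool this corresponds to the `-utf8` flag or the `char-encoding: utf8` option). The placeholder returns its input unchanged so the sketch runs without Tidy installed. The second helper shows a fallback for pipelines that can only carry ASCII: numeric character references.

```python
# Sketch: keep the pasted text in UTF-8 on both sides of the clean-up call.
# ASSUMPTION: tidy_utf8_bytes is a hypothetical placeholder, not a real Tidy API.


def tidy_utf8_bytes(raw: bytes) -> bytes:
    """Placeholder for the real clean-up call, assumed to read and write UTF-8.

    It returns the input unchanged so this example is self-contained.
    """
    return raw


def clean_pasted_html(pasted: str) -> str:
    """Encode to UTF-8, run the clean-up step, and decode the result as UTF-8."""
    utf8_in = pasted.encode("utf-8")
    utf8_out = tidy_utf8_bytes(utf8_in)
    return utf8_out.decode("utf-8")


def to_ascii_with_char_refs(text: str) -> str:
    """Fallback: replace non-ASCII characters with numeric character references."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


pasted = "<p>Next \u2192 step</p>"
cleaned = clean_pasted_html(pasted)
escaped = to_ascii_with_char_refs(pasted)

print(cleaned)
print(escaped)
```

The key point is that the encoding used to build the bytes and the encoding the clean-up step is told to expect must agree; a mismatch on either side is enough to lose the character.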


Yvon


 
----- Original Message ----
From: Lissa Labada <lissalabada@yahoo.com.ph>
To: html-tidy@w3.org
Sent: Friday, December 21, 2006 1:15:20 AM
Subject: Unicode conversion using tidyHTML

Hello,
 
Our application uses tidyHTML to clean up the contents of an HTML control object. However, there are cases where, after passing a string to tidyHTML, some characters are converted to a question mark (?).
 
For instance, the rightward arrow →
(if you are using a Word document, the shortcut is to type 2192, then press Alt-X).
 
Is there a way to use tidyHTML such that it will return back the original character?  If not, what is the best way to handle this?
 
Thanks! :)


Received on Tuesday, 2 January 2007 07:12:12 UTC