Re: Unicode conversion using tidyHTML from Fred Bone on 2007-01-02 (html-tidy@w3.org from January to March 2007)

From: Fred Bone <Fred.Bone@dial.pipex.com>
Date: Tue, 02 Jan 2007 16:00:57 -0000
To: html-tidy@w3.org
Message-ID: <459A81B9.15115.1E7679BF@Fred.Bone.dial.pipex.com>

On 1 Jan 2007 at 23:11, Lissa Labada said:

> Hello,
> 
> Thank you so much for your reply.
> 
> Our application does not use a file as input. Rather, users typically
> copy or cut from, say, a Word document, then paste into the HTML control of
> our application. The application then calls tidyHTML to tidy up the content.
> 
> The copied text typically contains Unicode characters, so when tidyHTML
> is called, those characters are converted to question marks or other
> characters, but we want to retain the original characters.
> 
> What is the best way to handle this?

If the text is in Unicode, Tidy can handle it. Use the -utf16 
option.
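
As a rough illustration of the suggestion above, here is a minimal Python sketch that pipes pasted text through the command-line tidy with -utf16, so nothing has to be written to a file first. It assumes a tidy executable is on the PATH; the helper names (tidy_command, encode_for_tidy, tidy_pasted_html) are made up for this example and are not part of Tidy. If your application links against the library rather than running the executable, the equivalent would be a character-encoding setting, which is not shown here.

```python
import subprocess


def tidy_command(tidy_path="tidy"):
    """Build the command line (hypothetical helper, not part of Tidy).

    -q      suppresses non-essential output
    -utf16  tells Tidy to use UTF-16 for both input and output
    """
    return [tidy_path, "-q", "-utf16"]


def encode_for_tidy(text):
    """Encode pasted text as UTF-16.

    Python's "utf-16" codec prepends a byte order mark, which lets the
    reader detect the byte order.
    """
    return text.encode("utf-16")


def tidy_pasted_html(text, tidy_path="tidy"):
    """Send the pasted markup to Tidy on stdin and return the raw output bytes.

    Assumption: a `tidy` executable is installed. Tidy exits with status 1
    when it only has warnings, so a non-zero exit status is not treated as
    fatal here. The result is returned as bytes; how to decode it depends
    on whether your Tidy build writes a byte order mark.
    """
    result = subprocess.run(
        tidy_command(tidy_path),
        input=encode_for_tidy(text),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.stdout
```

Calling tidy_pasted_html("<p>caf\u00e9 \u2013 na\u00efve</p>") should then hand Tidy the accented letters and the dash as UTF-16 rather than in a legacy code page, which is where the question marks usually come from.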

Received on Tuesday, 2 January 2007 16:01:43 UTC