- From: Moshe Plotkin <mplotkin@hotmail.com>
- Date: Mon, 11 Nov 2002 13:19:28 -0800
- To: "Charles Reitzel" <creitzel@rcn.com>
- Cc: <html-tidy@w3.org>
B"H

Actually I work with Meir Kogan, and I have the string as a BSTR, which as far as I understand is just a wchar_t * with a length. The VBScript page that I am getting it from is set to use codepage 65001, i.e. UTF-8, so I am assuming it's UTF-8 stored in an array of wchar_t, however that works.

I was thinking of redefining the Tidy string type (what's it called, cbmtstr?) to use wchar_t and then relying on the config file to alter the internal ... or of just casting the BSTR to byte* and putting it in the buffer. Or maybe I'm way off.

Thanks for all the help.

----- Original Message -----
From: "Charles Reitzel" <creitzel@rcn.com>
To: "Moshe Plotkin" <mplotkin@hotmail.com>
Cc: <html-tidy@w3.org>
Sent: Monday, November 11, 2002 6:27 AM
Subject: Re: UTF8 without tempfiles

> Hi Moshe,
>
> wchar_t is usually UTF16. What platform are you on? It helps to figure
> out if you should use Little or Big Endian unicode (UTF16LE and UTF16BE,
> respectively). If you can manage to save your documents with a byte-order
> mark (two bytes at the beginning of the file that indicate the byte order),
> you can specify plain UTF16.
>
> For example, Intel (Windows and Linux) is LE. Sparc (Solaris) and PowerPC
> (Mac, IBM AIX) are BE. Alpha (Linux) can be either, but is usually LE.
>
> take it easy,
> Charlie
>
> At 01:22 PM 11/10/2002 -0800, Moshe Plotkin wrote:
> >B"H
> >
> >Can someone please send me a very simple example of using TidyLib with
> >UTF8 strings.
> >
> >I have the data in a wchar_t* and would like to return a wchar_t*
> >
> >thank you very much
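
For what it's worth, if the BSTR really holds UTF-16 code units (as Charlie suggests is usual for wchar_t, and as is the case on Windows), casting it to byte* would not give UTF-8. One option might be to convert it to UTF-8 first and hand those bytes to Tidy with the input encoding set to UTF-8. Below is a minimal sketch of such a conversion in plain C. The name utf16_to_utf8 is hypothetical and is not part of TidyLib; the sketch assumes the input is already in native byte order with no byte-order mark, and it replaces unpaired surrogates with U+FFFD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a TidyLib function): converts in_len UTF-16 code
 * units to NUL-terminated UTF-8 in out. Returns the number of bytes written,
 * not counting the NUL, or (size_t)-1 if out_cap is too small. */
size_t utf16_to_utf8(const uint16_t *in, size_t in_len,
                     unsigned char *out, size_t out_cap)
{
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL);
    if (out_cap == 0)
        return (size_t)-1;

    while (i < in_len) {
        uint32_t cp = in[i++];
        size_t need;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: needs a low surrogate right after it. */
            if (i < in_len && in[i] >= 0xDC00 && in[i] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(in[i] - 0xDC00);
                i++;
            } else {
                cp = 0xFFFD; /* unpaired high surrogate */
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;     /* stray low surrogate */
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need + 1 > out_cap) /* keep room for the trailing NUL */
            return (size_t)-1;

        if (need == 1) {
            out[o++] = (unsigned char)cp;
        } else if (need == 2) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }

    out[o] = '\0';
    return o;
}
```

A buffer of 3 * in_len + 1 bytes is always large enough, since each UTF-16 code unit yields at most three UTF-8 bytes (a surrogate pair is two units and yields four). Going back from Tidy's UTF-8 output to wchar_t* would need the reverse conversion, which is not shown here.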
Received on Monday, 11 November 2002 13:23:01 UTC