
RE: displaying Chinese and Thai characters

From: Paul Deuter <PaulD@plumtree.com>
Date: Wed, 06 Aug 2003 13:30:32 -0400
Message-Id: <4.2.0.58.J.20030806133024.046b27d0@localhost>
To: www-international@w3.org




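The \uHHHH format is a Java convention for writing Unicode characters
in plain ASCII text files such as source code and .properties files.
The escapes are decoded by the Java compiler or by the code that loads
the properties file, which is why they work with ResourceBundles.  A
string read from the database is not decoded this way, so the JSP
outputs it literally.  Instead, you could use the HTML standard numeric
character references (NCRs) for your Unicode characters: "&#xHHHH;".
For example, if your HTML includes &#x5000;&#x5001; your browser will
correctly render these NCRs as Chinese characters.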
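
If the data already sits in the database as literal \uHHHH text, one
option is to convert it before writing it into the page.  The following
is only a rough sketch of that idea, not tested against your setup: the
class name EscapeToNcr, the helper names, and the "greeting" key are all
made up for illustration, and it assumes a Java version that provides
codePointAt and Properties.load(Reader).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class EscapeToNcr {
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    /** Replaces literal backslash-u escapes in text with the characters they name. */
    static String decodeEscapes(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against a decoded '$' or backslash.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Writes every non-ASCII code point as a hexadecimal NCR; ASCII passes through. */
    static String toNcr(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                out.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    /** Loads properties-file text and returns one value; the loader decodes the escapes. */
    static String loadValue(String key, String fileText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // What the database holds: twelve literal ASCII characters, not two ideographs.
        String fromDatabase = "\\u5000\\u5001";
        // The same text read through Properties is decoded into two characters.
        String fromProperties = loadValue("greeting", "greeting=" + fromDatabase);

        System.out.println(fromDatabase.length());
        System.out.println(fromProperties.length());
        System.out.println(toNcr(decodeEscapes(fromDatabase)));
    }
}
```

The two length values show the difference between the database string
and the properties value, and the last line is the NCR form that can be
written into the HTML.  Note that toNcr leaves ASCII characters such as
"<" and "&" untouched, so ordinary HTML escaping is still needed.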

-Paul
http://www.w3.org/TR/REC-html40/charset.html

-----Original Message-----
From: Audrey Ng [mailto:audrey@nxspace.com]
Sent: Wednesday, August 06, 2003 6:46 AM
To: www-international@w3.org
Subject: displaying Chinese and Thai characters




Hi all,

This is my very first project dealing with internationalization and I am
very confused about all these character sets and encodings. Any help would
be most welcome.
OK, I need to display Chinese (traditional and simplified) and Thai on a
website. I am using Tomcat 4.1 and MySQL 4.0.14. How do I store these
Chinese and Thai characters in MySQL?
Can I store the Unicode escape sequence like \u5000\u5001 directly in MySQL?
I have tried that, but when I retrieve the data in my servlet and then
forward it to a JSP to display the result, the characters are displayed
literally as \u5000\u5001 and not in Chinese. I have already set the
content type to UTF-8 in my page directive as well as in the META
content-type.
I have tried using ResourceBundles and the Chinese characters are correctly
displayed.
What is the difference between retrieving the Unicode escape sequence from
the properties file and from the database?

Please help!
Audrey
Received on Wednesday, 6 August 2003 16:05:44 GMT
