- From: Addison Phillips [wM] <aphillips@webmethods.com>
- Date: Wed, 06 Aug 2003 09:11:35 -0700
- To: "Audrey Ng (by way of Martin Duerst <duerst@w3.org>)" <audrey@nxspace.com>
- CC: www-international@w3.org
Hi Audrey,

The problem is that you are retrieving the string "\u5000\u5001" and not the characters that you are trying to represent by using an escape sequence. A properties file is converted when it is read in: the loader turns each \uXXXX escape into the character it stands for (and ListResourceBundles are converted by javac to true Unicode sequences). Nothing performs that conversion on data coming back from your database. In Java source terms, you are really retrieving the string "\\u5000\\u5001": a backslash, a 'u', and four digits, twice over.

It's important to remember that java.lang.String objects are always Unicode internally. It is how you convert to/from external sources that matters.

In the case of a database, though, you are retrieving String objects using JDBC. The conversion is done somewhere else, outside your control. Presumably you had to write some code to insert \u5000 (etc.) into your database instead of the character U+5000. You have to reverse that encoding procedure to retrieve the original character.

Recent MySQL versions (since 4.1) can use the UTF-8 (or UCS-2) encoding of Unicode. Then you just read/write String objects (which are always encoded as Unicode) to/from the database, without worrying about encodings or messing with escape sequences. This is a far better choice, since it means that you can also access the data in the database directly.

Best Regards,

Addison

--
Addison P. Phillips
Director, Globalization Architecture
webMethods, Inc.
+1 408.962.5487
mailto:aphillips@webmethods.com
-------------------------------------------
Internationalization is an architecture. It is not a feature.

Chair, W3C I18N WG Web Services Task Force
http://www.w3.org/International/ws

Audrey Ng (by way of Martin Duerst) wrote:
>
> Hi all,
>
> this is my very first project dealing with internationalization and I am
> very confused about all these character sets and encodings. Any help
> would be most welcome.
> Ok, I need to display Chinese (traditional and simplified) and Thai on a
> website. I am using Tomcat4.1 and mySQL 4.0.14. How do I store these
> Chinese and Thai characters in mySQL?
> Can I store the unicode escape sequence like \u5000\u5001 directly in
> mySQL?
> I have tried that, but when I retrieve the data in my servlet and then
> forward it to a JSP to display the result, the characters are displayed
> as \u5000\u5001 and not in Chinese. I have set the content type in
> my page directive as well as the META content-type to UTF-8 already.
> I have tried using ResourceBundles and the Chinese characters are correctly
> displayed.
> What is the difference between retrieving the unicode escape sequence
> from the properties file and from the database?
>
> Please help!
> Audrey
>
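
For illustration, here is a minimal sketch of the "reverse that encoding procedure" step from the reply above. It assumes the database column holds literal escapes made of a backslash, a 'u', and exactly four hex digits. The class and method names (UnescapeDemo, decodeUnicodeEscapes) are hypothetical and do not come from the original message, and the string in main only stands in for a value read through JDBC.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnescapeDemo {
    // Matches a literal backslash, the letter u, and four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Hypothetical helper: turns stored escape text back into real characters.
    static String decodeUnicodeEscapes(String stored) {
        Matcher m = ESCAPE.matcher(stored);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text before the escape unchanged.
            out.append(stored, last, m.start());
            // Convert the four hex digits into one UTF-16 code unit.
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        // Copy whatever follows the final escape.
        out.append(stored, last, stored.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a value read via JDBC: twelve plain characters.
        String fromDatabase = "\\u5000\\u5001";
        String decoded = decodeUnicodeEscapes(fromDatabase);

        System.out.println(fromDatabase.length()); // prints 12
        System.out.println(decoded.length());      // prints 2
        // javac converts the escapes in this literal at compile time.
        System.out.println(decoded.equals("\u5000\u5001")); // prints true
    }
}
```

This is only a workaround for data that was already stored as escape text. Storing the characters themselves in a Unicode-capable column, as the reply recommends, makes the decoding step unnecessary.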
Received on Wednesday, 6 August 2003 12:20:58 UTC