Re: Java and Unicode

From: Mark Davis <mark@macchiato.com>
Date: Wed, 26 Sep 2001 07:30:00 -0700
Message-ID: <004901c14697$b95b9f00$0c680b41@c1340594a>
To: "souravm" <souravm@infy.com>, <www-international@w3.org>
Java uses UTF-16 internally, but it has little built-in support for
supplementary characters. If you need to deal with them, we have an add-on
package called ICU4J (http://www-124.ibm.com/icu4j/) that provides support
for manipulating surrogates and comparing strings in code point order (see
the class UTF16). It also supplies updated character properties, which will
be at Unicode 3.1 when ICU4J 2.0 ships (in a couple of months).
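
To illustrate what UTF-16 storage means for supplementary characters, here
is a minimal sketch. It assumes a later JDK (Java 5 or newer), which added
code point methods to java.lang.String; those methods were not available
when this message was written. The class name Utf16Sketch and its helper
methods are made up for the example.

```java
public class Utf16Sketch {
    // Number of UTF-16 code units (Java chars) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    // Assumes Java 5+, where String.codePointCount exists.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+10400 is a supplementary character, written here as the
        // surrogate pair D801 DC00.
        String s = "A\uD801\uDC00";
        System.out.println(codeUnits(s));   // 3
        System.out.println(codePoints(s));  // 2
        System.out.println(Integer.toHexString(s.codePointAt(1))); // 10400
    }
}
```

The string holds two characters from the user's point of view but three
chars in Java, which is why code that indexes by char can split a pair.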

Mark
—————

Δός μοι ποῦ στῶ, καὶ κινῶ τὴν γῆν — Ἀρχιμήδης
[http://www.macchiato.com]

----- Original Message -----
From: "souravm" <souravm@infy.com>
To: <www-international@w3.org>
Sent: Tuesday, September 25, 2001 11:13 PM
Subject: Java and Unicode


> Hi All,
>
> Java is supposed to store all strings internally in Unicode. In that case,
> which Unicode encoding form (i.e. UTF-8/UTF-16/UTF-32) is used?
>
> To be more specific -
>
> Let us consider a string, strInput, whose characters were encoded
> using Shift_JIS.
> The following code is supposed to convert it to Unicode.
>
> String strConv = new String(strInput.getBytes(), "Shift_JIS");
>
> My question is: what would be the encoding form of the string strConv,
> UTF-8, UTF-16 or UTF-32?
>
> Regards,
> Sourav
>
>
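
A note on the conversion in the quoted message: a charset name such as
Shift_JIS describes bytes, not a String, so the decoding step is usually
applied to the raw bytes as they were read from a file or socket. Calling
getBytes() with no argument encodes with the platform default charset, so
the quoted one-liner only works where that default passes the data through
unchanged. The sketch below is one possible way to do the decoding. It
assumes the runtime ships the Shift_JIS charset (it is not one of the
charsets every Java platform is required to support), and the class and
method names are hypothetical.

```java
import java.nio.charset.Charset;

public class ShiftJisDecodeSketch {
    // Decode raw Shift_JIS bytes into a Java String (UTF-16 internally).
    // Assumes the Shift_JIS charset is installed in this runtime.
    static String decodeShiftJis(byte[] raw) {
        return new String(raw, Charset.forName("Shift_JIS"));
    }

    public static void main(String[] args) {
        // Shift_JIS bytes for U+65E5 U+672C (the two kanji of "Nihon").
        byte[] raw = {(byte) 0x93, (byte) 0xFA, (byte) 0x96, (byte) 0x7B};
        String s = decodeShiftJis(raw);
        System.out.println(s.length());                 // 2
        System.out.println(s.equals("\u65E5\u672C"));   // true
    }
}
```

Whatever the source encoding, the resulting String is held as UTF-16 code
units; UTF-8 or UTF-32 only come into play when it is encoded back to bytes.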
Received on Wednesday, 26 September 2001 10:29:39 GMT