Re: Java and Unicode from Mark Davis on 2001-09-26 (www-international@w3.org from July to September 2001)

From: Mark Davis <mark@macchiato.com>
Date: Wed, 26 Sep 2001 07:30:00 -0700
To: "souravm" <souravm@infy.com>, <www-international@w3.org>
Message-ID: <004901c14697$b95b9f00$0c680b41@c1340594a>

Java uses UTF-16, but it has little support for supplementary characters. If
you need to deal with them, we have an add-on package called ICU4J
(http://www-124.ibm.com/icu4j/) that provides support for manipulating
surrogate characters and comparing strings in code point order (see the
class UTF16). It also supplies updated character properties, which will cover
Unicode 3.1 when ICU4J 2.0 ships (in a couple of months).
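
To make the UTF-16 point concrete, here is a minimal sketch in plain Java of what
"manipulating surrogates" and "comparing in code point order" involve. The class
SurrogateDemo and all of its method names are hypothetical, written only for
illustration; they are not the API of the ICU4J UTF16 class.

```java
public class SurrogateDemo {
    static boolean isHighSurrogate(char c) { return c >= 0xD800 && c <= 0xDBFF; }
    static boolean isLowSurrogate(char c)  { return c >= 0xDC00 && c <= 0xDFFF; }

    // Combine a high/low surrogate pair into one supplementary code point.
    static int toCodePoint(char high, char low) {
        return ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
    }

    // Code point starting at index i; an unpaired surrogate is returned as-is.
    static int codePointAt(String s, int i) {
        char c = s.charAt(i);
        if (isHighSurrogate(c) && i + 1 < s.length() && isLowSurrogate(s.charAt(i + 1))) {
            return toCodePoint(c, s.charAt(i + 1));
        }
        return c;
    }

    // Count characters, treating each well-formed surrogate pair as one.
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); count++) {
            i += codePointAt(s, i) >= 0x10000 ? 2 : 1;
        }
        return count;
    }

    // Compare by code point value rather than by 16-bit code unit.
    static int compareCodePointOrder(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = codePointAt(a, i), cb = codePointAt(b, j);
            if (ca != cb) return ca < cb ? -1 : 1;
            i += ca >= 0x10000 ? 2 : 1;
            j += cb >= 0x10000 ? 2 : 1;
        }
        // One string is exhausted; the one with text left over sorts later.
        return (a.length() - i) - (b.length() - j);
    }

    public static void main(String[] args) {
        String s = "A\uD800\uDF00";              // 'A' followed by U+10300
        System.out.println(s.length());          // 3 UTF-16 code units
        System.out.println(countCodePoints(s));  // 2 characters
        System.out.println(Integer.toHexString(codePointAt(s, 1))); // 10300

        String bmp = "\uFF5E";                   // U+FF5E, a BMP character
        String supp = "\uD800\uDF00";            // U+10300, a supplementary character
        System.out.println(bmp.compareTo(supp) > 0);             // true: code unit order
        System.out.println(compareCodePointOrder(bmp, supp) < 0); // true: code point order
    }
}
```

The last two lines show why the distinction matters: String.compareTo compares
16-bit code units, so a BMP character above U+D7FF sorts after a supplementary
character even though its code point is smaller.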

Mark
—————

Δός μοι ποῦ στῶ, καὶ κινῶ τὴν γῆν — Ἀρχιμήδης
[http://www.macchiato.com]

----- Original Message -----
From: "souravm" <souravm@infy.com>
To: <www-international@w3.org>
Sent: Tuesday, September 25, 2001 11:13 PM
Subject: Java and Unicode


> Hi All,
>
> Java is supposed to store all strings internally in Unicode. In that case,
> which Unicode encoding form (UTF-8, UTF-16, or UTF-32) is used?
>
> To be more specific -
>
> Let us consider a string, strInput, which contains characters encoded
> using Shift_JIS as encoding type.
> Following code is supposed to convert it to Unicode.
>
> String strConv = new String(strInput.getBytes(), "Shift_JIS");
>
> My question is: what would the encoding of the string strConv be,
> UTF-8, UTF-16, or UTF-32?
>
> Regards,
> Sourav
>
>
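
A note on the quoted snippet: strInput is already a String, so
strInput.getBytes() re-encodes it with the platform default charset rather than
returning the original Shift_JIS bytes. The sketch below assumes the raw
Shift_JIS bytes are available as a byte array and that the runtime ships the
Shift_JIS charset (it is one of the extended charsets, so this is an assumption
about the installation). The class name ShiftJisDemo and the helper
decodeShiftJis are hypothetical.

```java
import java.nio.charset.Charset;

public class ShiftJisDemo {
    // Decode raw Shift_JIS bytes into a Java String (UTF-16 code units in memory).
    static String decodeShiftJis(byte[] sjisBytes) {
        return new String(sjisBytes, Charset.forName("Shift_JIS"));
    }

    public static void main(String[] args) {
        // The two characters U+65E5 U+672C encoded in Shift_JIS: 93 FA 96 7B.
        byte[] sjis = { (byte) 0x93, (byte) 0xFA, (byte) 0x96, (byte) 0x7B };
        String s = decodeShiftJis(sjis);

        System.out.println(s.length());                         // 2 chars from 4 bytes
        System.out.println(Integer.toHexString(s.charAt(0)));   // 65e5
        System.out.println(Integer.toHexString(s.charAt(1)));   // 672c
    }
}
```

Once decoded, the String no longer has a Shift_JIS or UTF-8 "type"; each char is
a UTF-16 code unit, and an external encoding only comes into play again when the
String is converted back to bytes.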

Received on Wednesday, 26 September 2001 10:29:39 UTC