
Re: Servlet question

From: Shigemichi Yazawa <yazawa@globalsight.com>
Date: Fri, 19 Oct 2001 11:01:45 -0600
Message-ID: <5eadynef7q.wl@globalsight.com>
To: www-international@w3.org
At Fri, 19 Oct 2001 15:29:24 +0200,
Thierry Sourbier <webmaster@i18ngurus.com> wrote:
> Well it is a case where 2 mistakes compensate one another :). You are
> relying on the default encoding for both the input and output when your data
> obviously is using a different encoding. This works fine only as your
> default encoding is likely a single byte with no invalid values (e.g.
> CP1252).

Yes, two wrong conversions produce the right result. However, Cp1252
doesn't always work this way. The Cp1252 <-> Unicode mapping table
has 5 undefined entries (0x81, 0x8D, 0x8F, 0x90 and 0x9D). If you
pass 0x81, for example, to the byte-to-char converter, it is converted
to U+FFFD (REPLACEMENT CHARACTER), and the round trip is no longer
possible. As far as I know, ISO-8859-1 is the only safe,
round-trippable encoding.
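
To make the point concrete, here is a minimal sketch of the round-trip
check in Java. The class and method names (RoundTripDemo, roundTrips,
countLossyBytes) are made up for illustration, and it assumes the JVM
ships a charset registered under the name "windows-1252", which is
common but not guaranteed on every runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of any servlet API.
public class RoundTripDemo {

    // Decode the bytes to chars and encode them back with the same charset,
    // then report whether the original bytes survived unchanged.
    static boolean roundTrips(byte[] input, Charset cs) {
        String decoded = new String(input, cs);
        byte[] encoded = decoded.getBytes(cs);
        return Arrays.equals(input, encoded);
    }

    // Count the single-byte values (0x00..0xFF) that do not survive
    // a bytes -> chars -> bytes round trip in the given charset.
    static int countLossyBytes(Charset cs) {
        int lossy = 0;
        for (int b = 0; b < 256; b++) {
            if (!roundTrips(new byte[] { (byte) b }, cs)) {
                lossy++;
            }
        }
        return lossy;
    }

    public static void main(String[] args) {
        // Assumption: this charset name is available on the running JVM;
        // Charset.forName throws UnsupportedCharsetException otherwise.
        Charset cp1252 = Charset.forName("windows-1252");

        // Show what the byte-to-char converter does with an undefined byte.
        String decoded = new String(new byte[] { (byte) 0x81 }, cp1252);
        System.out.printf("0x81 decodes to U+%04X%n", (int) decoded.charAt(0));

        System.out.println("Cp1252 lossy byte values:     " + countLossyBytes(cp1252));
        System.out.println("ISO-8859-1 lossy byte values: "
                + countLossyBytes(StandardCharsets.ISO_8859_1));
    }
}
```

With ISO-8859-1 every byte value maps to the code point of the same
number, so nothing is lost. With Cp1252 the undefined bytes collapse
to the replacement character on the way in, and the original byte
cannot be recovered on the way out.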

Shigemichi Yazawa
Received on Friday, 19 October 2001 12:46:21 UTC
