RE: Servlet question from Shigemichi Yazawa on 2001-10-22 (www-international@w3.org from October to December 2001)

From: Shigemichi Yazawa <yazawa@globalsight.com>
Date: Mon, 22 Oct 2001 10:36:55 -0600
To: Paul.Deuter@plumtree.com
Cc: www-international@w3.org
Message-ID: <5epu7faaxk.wl@globalsight.com>

At Mon, 22 Oct 2001 08:27:10 -0700,
Paul Deuter <Paul.Deuter@plumtree.com> wrote:
> 1.  Both encodings CP1252 and 8859-1 have "holes".  For 8859-1
> the range 80-9F is invalid.  For CP1252, the values 80, 81, 8D,
> 8E, 8F, 90, 9D, and 9E are invalid (according to Kano's book).

As I said in the previous email, there is no hole in 8859_1 (at least
in Sun's Java). Try running the program I attached to the previous
email with the encoding changed to "8859_1".
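
For readers without that attachment, here is a minimal stand-in sketch
(not the original program; the class and method names are made up for
illustration). It decodes every byte value 0x00-0xFF with a given
charset, encodes the result back, and counts the values that do not
survive the round trip. A charset with no holes should report 0.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class HoleCheck {
    // Hypothetical helper: counts byte values that change when decoded
    // with the given charset and then encoded back with the same charset.
    public static int countLossyBytes(Charset cs) {
        int lossy = 0;
        for (int b = 0; b < 256; b++) {
            byte[] in = { (byte) b };
            String decoded = new String(in, cs);
            byte[] out = decoded.getBytes(cs);
            if (out.length != 1 || out[0] != in[0]) {
                lossy++;
            }
        }
        return lossy;
    }

    public static void main(String[] args) {
        // ISO-8859-1 maps each byte to the code point of the same value,
        // so no byte value is lost.
        System.out.println("ISO-8859-1 lossy byte values: "
                + countLossyBytes(StandardCharsets.ISO_8859_1));
    }
}
```

Passing a different charset to countLossyBytes makes it easy to compare
encodings on whatever JVM you are running.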

> The often suggested method
> for converting characters in the request is to use a line of code
> that looks like this:
> 
> String strParam = new
> String(request.getParameter("SomeName").getBytes("8859_1"), "UTF8");

We have been using this technique, or hack, for a long time and it
works fine. The catch is that it works only when the request
parameters were decoded as 8859_1, i.e. when that is the default
encoding. If the servlet JVM's default encoding is Cp1252 and/or the
servlet is configured to use Cp1252 when converting the parameters, it
fails miserably.
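
The failure can be sketched outside a servlet container as follows.
This is only a simulation under stated assumptions: the "container" is
modelled as a single decode of the UTF-8 request bytes with a chosen
charset, the class and method names are hypothetical, and
"windows-1252" is assumed to be the available name for Cp1252 on the
JVM in use.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ParameterHackDemo {
    // Simulates a container that decoded UTF-8 request bytes with
    // containerCharset, then applies the getBytes("8859_1") / "UTF8" hack.
    public static String recover(String original, Charset containerCharset) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        String misdecoded = new String(wire, containerCharset);
        byte[] reencoded = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(reencoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample text whose UTF-8 bytes include values in the 0x80-0x9F range.
        String original = "\u65e5\u672c\u8a9e";

        String viaLatin1 = recover(original, StandardCharsets.ISO_8859_1);
        System.out.println("8859_1 container: " + original.equals(viaLatin1));

        // Assumption: this JVM supports the windows-1252 charset.
        Charset cp1252 = Charset.forName("windows-1252");
        String viaCp1252 = recover(original, cp1252);
        System.out.println("Cp1252 container: " + original.equals(viaCp1252));
    }
}
```

With the 8859_1 container the original string comes back intact, so
the first line should print true. With Cp1252, bytes in the 0x80-0x9F
range are turned into characters that 8859_1 cannot encode, so the
second line should print false.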

-------------------
Shigemichi Yazawa
yazawa@globalsight.com

Received on Monday, 22 October 2001 12:21:22 UTC