JSP containers and default charset (was: Re DefaultCharset considered harmful)

From: Jungshik Shin <jshin@i18nl10n.com>
Date: Sun, 30 Nov 2003 22:49:03 +0900 (KST)
To: Martin Duerst <duerst@w3.org>
Cc: public-i18n-geo@w3.org
Message-ID: <Pine.LNX.4.58.0311302222450.4579@jshin.net>

On Thu, 25 Sep 2003, Martin Duerst wrote:

> Regarding the configuration problems with Apache, I
> think the main culprit is the configuration file httpd.conf,
> as shipped with the distribution. This contains:
> # Specify a default charset for all pages sent out. This is
> # always a good idea and opens the door for future internationalisation

Recently I found that Apache Tomcat always adds 'charset=ISO-8859-1'
(to virtually all Content-Type headers, whether textual or not) unless
it's explicitly overridden in the JSP with either of the following lines.

<%@ page pageEncoding="CHARSET" %>

<%@ page contentType="CONTENT-TYPE; charset=CHARSET" %>

So, including the following line in a JSP without either of the above
leads to a conflict:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
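To illustrate what that conflict does to the text, here is a minimal
sketch in plain Java (no servlet API). It assumes the page bytes are
really UTF-8 and that the browser trusts the charset in the HTTP header
over the one in the meta tag, as HTML 4.01 specifies. The class and
method names are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetConflictDemo {
    // Hypothetical helper: decode the response body the way a client would
    // if it believed the given charset label.
    public static String decodeAs(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with e-acute
        // The JSP author saved the page as UTF-8, so these are the bytes sent.
        byte[] body = original.getBytes(StandardCharsets.UTF_8);

        // Assumption: the container added charset=ISO-8859-1 to the header,
        // and the header wins over the meta tag.
        String viaHeader = decodeAs(body, StandardCharsets.ISO_8859_1);
        // What the author expected from the meta tag alone.
        String viaMeta = decodeAs(body, StandardCharsets.UTF_8);

        System.out.println(viaHeader.equals(original)); // false
        System.out.println(viaMeta.equals(original));   // true
        System.out.println(viaHeader.length());         // 5
    }
}
```

The two bytes of the UTF-8 encoded e-acute come out as two separate
ISO-8859-1 characters, so the decoded string is one character longer
than the original.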

It seems that the JSP specification mandates this behavior, so it's
not just Jakarta Tomcat. I'm not sure whether this is good or bad.
Certainly JSP offers a way to specify the character encoding of
individual pages (it took me a while to find that out [1]), but someone who
doesn't know that (and who believes that adding a meta tag would work)
may be taken by surprise.

Do we have to consider asking those in charge of the JSP specification
to change it so that by default NO charset parameter is added?


[1] I should have turned to
BTW, the second method of setting the page encoding in JSP is not mentioned
in the above tip.

<%@ page pageEncoding="CHARSET" %>

Received on Sunday, 30 November 2003 08:50:52 UTC
