
RE: JSP page directive contentType overriden by Apache tomcat?

From: Addison Phillips [wM] <aphillips@webmethods.com>
Date: Fri, 16 Jul 2004 14:55:10 -0700
To: "Jungshik Shin" <jshin@i18nl10n.com>, <www-international@w3.org>
Message-ID: <PNEHIBAMBMLHDMJDDFLHGEBFIFAA.aphillips@webmethods.com>

Did you set the locale via a taglib directive? That'll do it every time.

See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4880792 and related reports. In fact, if you call response.setLocale you are also setting the character encoding, and that overrides the charset given in contentType. You don't have to call it directly in the page either: if a taglib sets the locale, you'll run into the same problem.

Sun has fixed this in the latest-and-greatest version: now if you set contentType, that takes precedence over setLocale.
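
To make the two precedence rules concrete, here is a toy model in plain Java. It is not the servlet API: ResponseCharsetModel, its locale-to-charset table and the explicitCharsetWins flag are all invented for illustration, and a real container's locale mapping may differ. It only sketches the idea that, before the fix, a later setLocale call replaces the charset from contentType, while after the fix an explicitly set charset is kept.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Toy model only, NOT the servlet API: it tracks a response charset so the
// "setLocale wins" and "explicit contentType wins" rules can be compared.
class ResponseCharsetModel {
    // Hypothetical locale-to-charset table; real containers ship their own.
    private static final Map<String, String> LOCALE_CHARSETS =
            new HashMap<String, String>();
    static {
        LOCALE_CHARSETS.put("en", "ISO-8859-1");
        LOCALE_CHARSETS.put("ko", "EUC-KR");
    }

    private final boolean explicitCharsetWins; // true models the fixed behavior
    private String charset = "ISO-8859-1";     // assumed default charset
    private boolean charsetSetExplicitly = false;

    ResponseCharsetModel(boolean explicitCharsetWins) {
        this.explicitCharsetWins = explicitCharsetWins;
    }

    // Records the charset if the content type carries a charset parameter.
    void setContentType(String type) {
        String marker = "charset=";
        int i = type.toLowerCase(Locale.ROOT).indexOf(marker);
        if (i >= 0) {
            charset = type.substring(i + marker.length()).trim();
            charsetSetExplicitly = true;
        }
    }

    // Derives a charset from the locale unless an explicit one must be kept.
    void setLocale(Locale locale) {
        if (explicitCharsetWins && charsetSetExplicitly) {
            return; // fixed rule: the explicit charset takes precedence
        }
        String mapped = LOCALE_CHARSETS.get(locale.getLanguage());
        charset = (mapped != null) ? mapped : "ISO-8859-1";
    }

    String getCharacterEncoding() {
        return charset;
    }

    public static void main(String[] args) {
        ResponseCharsetModel before = new ResponseCharsetModel(false);
        before.setContentType("text/html; charset=UTF-8");
        before.setLocale(Locale.ENGLISH); // e.g. done by a tag in a taglib
        System.out.println("before fix: " + before.getCharacterEncoding());

        ResponseCharsetModel after = new ResponseCharsetModel(true);
        after.setContentType("text/html; charset=UTF-8");
        after.setLocale(Locale.ENGLISH);
        System.out.println("after fix:  " + after.getCharacterEncoding());
    }
}
```

In this model the first response ends up as ISO-8859-1 even though the page asked for UTF-8, which matches the symptom you describe, and the second keeps UTF-8.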

Addison

Addison P. Phillips
Director, Globalization Architecture
webMethods | Delivering Global Business Visibility
http://www.webMethods.com
Chair, W3C Internationalization (I18N) Working Group
Chair, W3C-I18N-WG, Web Services Task Force
http://www.w3.org/International

Internationalization is an architecture. 
It is not a feature.

> -----Original Message-----
> From: www-international-request@w3.org
> [mailto:www-international-request@w3.org]On Behalf Of Jungshik Shin
> Sent: July 16, 2004 14:25
> To: www-international@w3.org
> Subject: JSP page directive contentType overriden by Apache tomcat? 
> 
> 
> 
> Hi,
> 
> I've been wrestling with a mysterious problem for the last few hours. I
> made a patch to the web search front-end of 'Nutch' (http://www.nutch.org,
> an open source search engine that strives to be an open source Google [1])
> so that query strings containing characters outside the ISO-8859-1
> character repertoire can work.
> 
> Following the standard steps of adding contentType and pageEncoding
> directives at the beginning of the JSP files (I also added
> request.setCharacterEncoding("UTF-8"); and made sure that it is honored,
> because recent versions of Apache Tomcat ignore it for GET by default),
> I expected everything to work. To my great surprise, all the JSP files
> with the 'contentType="text/html; charset=UTF-8"' directive still emit
> 'Content-Type:text/html; charset=ISO-8859-1' in the HTTP header. Even
> more surprising, the cached versions of the translated Java source files
> for those JSP files contain the following line:
> 
> response.setContentType("text/html; charset=UTF-8");
> 
> It's completely beyond me how I've been getting  'text/html; 
> charset=ISO-8859-1' despite that.
> 
> You can try it at http://pippin.kaist.ac.kr:8080. I ran the Nutch crawler
> to fetch a small number (about 4000) of pages in several different
> scripts (if you give '1234' as a query, you'll get 4 hits). The search
> result page (handled by search.jsp) is supposed to be in UTF-8, with the
> correct Content-Type header emitted in the HTTP response.
> 
> Is there anyone who's been bitten by this bizarre problem? It'd be great
> to know how it was solved.
> 
> Thank you tons in advance,
> 
> Jungshik
> 
> 
> [1] Needless to say, there are a number of things to improve in I18N as 
> well as in other aspects before Nutch can compete with Google.
> 
> 
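
As a side note on the GET query strings mentioned in the quoted message, the following minimal sketch shows why the decoding charset matters. It assumes the browser percent-encodes the query as UTF-8; QueryDecodingDemo and its decode helper are hypothetical names, and only java.net.URLDecoder from the standard library is used, not any container code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Sketch: the same percent-encoded bytes decoded with two different charsets.
class QueryDecodingDemo {
    // Hypothetical helper that wraps the checked exception of URLDecoder.
    static String decode(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Two Hangul syllables (U+D55C U+AE00) percent-encoded as UTF-8.
        String raw = "%ED%95%9C%EA%B8%80";

        String asUtf8 = decode(raw, "UTF-8");        // the intended text
        String asLatin1 = decode(raw, "ISO-8859-1"); // one char per byte

        System.out.println("UTF-8 length:      " + asUtf8.length());
        System.out.println("ISO-8859-1 length: " + asLatin1.length());
    }
}
```

Decoded as UTF-8 the value is the two intended characters; decoded as ISO-8859-1 each of the six bytes becomes its own character, so a search for the original term cannot match.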
Received on Friday, 16 July 2004 17:58:11 GMT
