- From: Chris Haynes <chris@harvington.org.uk>
- Date: Mon, 28 Jul 2003 19:19:16 +0100
- To: "Shigemichi Yazawa" <yazawa@globalsight.com>, "Jungshik Shin" <jshin@i18nl10n.com>
- Cc: <www-international@w3.org>
"Shigemichi Yazawa" wrote at Monday, July 28, 2003 5:41 PM

> At Fri, 25 Jul 2003 23:39:13 -0400 (EDT),
> Jungshik Shin wrote:
> >
> > Have you tried Mozilla 1.4? Mozilla 1.0 is pretty outdated and a lot
> > of features have been added since.
> ...
> > Have you tried entering some non-ASCII characters? The default (in MIME)
> > content-type is 'Content-Type: text/plain; charset=US-ASCII' and it can
> > be omitted. If it still does not add 'C-T' header with charset
> > parameter for non-ASCII chars, I'll file a bug
> > and hopefully fix it.
>
> I entered a Japanese text and got the same result. No content-type
> header. Mozilla 1.4 (for Windows) doesn't put it either.
>
> I set up a JSP here. Feel free to try this out yourself.
>
> http://www.runout.org/html-form-test/multi-part-form.jsp
>
> The input data in this page is written out in ISO-8859-1. So any
> non-ASCII string will be shown as garbage.
>
> I also set up another JSP that adds accept-charset="UTF-8" in FORM
> element as Chris suggested.
>
> http://www.runout.org/html-form-test/accept-charset.jsp
>
> It seems to work fine even if you change the character encoding in
> your browser. This seems to be an effective solution for immediate
> needs.
>
> Even using this technique, you still have to do this old trick.
>
> new String(request.getParameter("param").getBytes("ISO8859_1"), "UTF-8");
>
> That's because the character encoding is specified only in the page
> source and not in the HTTP request.

This should not be necessary. Your Servlet container should implement
the Sun Servlet Specification (version 2.3, section 4.9), which says
that if you call request.setCharacterEncoding(String) before accessing
any of the parameters, the specified encoding is used to decode the
request's parameters.

So you can just call

    request.setCharacterEncoding("UTF-8");
    request.getParameter("param");

You know the request was encoded in UTF-8 because you mandated it in
the form's 'accept-charset' attribute - and my conjecture is that this
can be trusted.

This all works fine with the Jetty container I use
http://jetty.mortbay.org .

> It seems that enctype="text/plain" was proposed at some point. If this
> scheme is employed and browsers add charset parameter in HTTP request,
> the server side API can reliably convert the character encoding. Just
> a thought...
>
> -------------------
> Shigemichi Yazawa
> yazawa@globalsight.com

Chris Haynes
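
The "old trick" quoted in this message, and the reason it becomes unnecessary once the container is told the request encoding, can be illustrated without a servlet container. The following is only a sketch: the class and method names are hypothetical, and it assumes the browser sent UTF-8 bytes and that the container fell back to ISO-8859-1 when no encoding was set. It uses plain `String` conversions to stand in for what the container does internally.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone illustration; no servlet API is involved.
class FormEncodingSketch {

    // Stands in for a container that decodes request bytes as ISO-8859-1
    // because no request encoding was supplied (assumed default).
    public static String decodeAsLatin1(byte[] requestBytes) {
        return new String(requestBytes, StandardCharsets.ISO_8859_1);
    }

    // The workaround from the message: turn the mis-decoded characters back
    // into the original bytes, then decode those bytes as UTF-8.
    // ISO-8859-1 maps every byte to exactly one character, so no data is lost.
    public static String recoverUtf8(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    // Stands in for a container that was told the encoding up front,
    // as request.setCharacterEncoding("UTF-8") is meant to do.
    public static String decodeAsUtf8(byte[] requestBytes) {
        return new String(requestBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "\u65e5\u672c\u8a9e"; // three Japanese characters
        byte[] onTheWire = typed.getBytes(StandardCharsets.UTF_8);

        String garbled = decodeAsLatin1(onTheWire);
        System.out.println(garbled.equals(typed));               // false
        System.out.println(recoverUtf8(garbled).equals(typed));  // true
        System.out.println(decodeAsUtf8(onTheWire).equals(typed)); // true
    }
}
```

The workaround and the up-front decoding produce the same string here. The difference is that the second needs no per-parameter fix-up.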
Received on Monday, 28 July 2003 14:28:29 UTC