- From: Jungshik Shin <jshin@i18nl10n.com>
- Date: Sat, 26 Jul 2003 06:31:38 -0400 (EDT)
- To: Chris Haynes <chris@harvington.org.uk>
- cc: <www-international@w3.org>
On Sat, 26 Jul 2003, Chris Haynes wrote:

> find that the advice I gave was wrong - I had taken it on trust from
> someone else that the Content-Encoding field is set by user agents
> when sending a POST which includes encoded content (why isn't it?).

Content-Encoding doesn't seem to be a good place for specifying charset. It's not mentioned for _that purpose_ in any standard (or what purported to be a standard) document. It's for gzip, compress and things like that. It's the Content-Type header that should have the charset parameter (not just in HTTP but also in MIME). However, as you wrote, most server-side tools/applications have to be updated to handle that.

> MSIE 6.0 does support this feature (i.e. the query string includes the
> name-value pair "_charset_=UTF-8")
> Netscape Navigator 6.2.2 does not
> Opera 7.11 does not.

Gee. Netscape 6.2.2 (based on pre-1.0 Mozilla) sounds like an ancient relic. To be fair, you may wish to try Mozilla 1.4 or the latest Netscape based on Mozilla 1.4.

> I conclude that the '_charset_' mechanism, although ingenious, is a
> non-standard, proprietary distraction.

I agree.

> The only standards-based way of being _sure_ what character encoding
> has been applied to form data appears to be to use
>
> <form action=... method='post' accept-charset='UTF-8'>
>
> (or whatever the page author's chosen character encoding is).

I don't know what Lynx and w3m-m17n do in this case. They're important for some people with accessibility problems. I'll check them out.

> Interestingly, changing from 'post' to 'get' in MSIE6 re-enables the
> user's control over the encoding used (i.e. over the values now
> transmitted in the URI query string). I have not tested this with the
> other two browsers.

It also depends on whether or not you set 'send URLs always in UTF-8' in Tools|Options(?) in MS IE.

> The only other reliable transfer mechanism available would appear to
> be the ENCTYPE="multipart/form-data" method being discussed in this
> same thread, but this format is not decoded by standard Servlet
> containers, so the convenient HttpRequest.getParameter() Servlet API
> could not be used with this mechanism.

Can this API be updated to decipher charset parameter present in C-T header fields of subparts of multipart/form-data? HTML 4.x was released in 1999(?) and .....

Jungshik
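
For illustration only, here is a minimal sketch of what "deciphering the charset parameter" of a Content-Type header value could look like. It applies equally to the header of a whole request or of one multipart/form-data subpart. The class name ContentTypeCharset and the method charsetOf are hypothetical and are not part of the Servlet API; the fallback argument is also an assumption, since the right default depends on the application.

```java
import java.util.Locale;

// Hypothetical helper, not part of any Servlet API.
public class ContentTypeCharset {

    /**
     * Returns the value of the charset parameter found in a Content-Type
     * header value, or the given fallback when the parameter is absent.
     * Simplification: splits on ';' and so does not handle a quoted
     * parameter value that itself contains a semicolon.
     */
    public static String charsetOf(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue; // not a name=value pair
            }
            // Parameter names are case-insensitive.
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (!name.equals("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip optional surrounding double quotes.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            return value.isEmpty() ? fallback : value;
        }
        return fallback;
    }

    public static void main(String[] args) {
        // A subpart header that declares its charset.
        System.out.println(charsetOf("text/plain; charset=UTF-8", "ISO-8859-1"));
        // A header without the parameter falls back to the supplied default.
        System.out.println(charsetOf("text/plain", "ISO-8859-1"));
    }
}
```

A container would still have to apply the resulting charset when decoding the bytes of each part; the sketch only covers reading the parameter.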
Received on Saturday, 26 July 2003 06:31:54 UTC