- From: Leila Schneberger <leila@windchill.com>
- Date: Wed, 03 Dec 1997 10:06:32 -0600
- To: "www-international@w3.org" <www-international@w3.org>
We are building a Java HTTP gateway. We intend to support currently available web servers (Netscape 3 and Microsoft IIS 3) and clients (Netscape 4.0.3 and Explorer 4.0). Our gateway is going to provide file upload/download capability. These files will be stored as Unicode.

I have some questions regarding what is currently supported for character encoding. (Sometimes it is difficult to track the difference between the direction the standards are going and current reality ;^)

When reading and writing local text files from Java, it is simple to use Java character streams that perform conversions between internal Unicode characters and the external native OS character encoding. However, can anyone tell me what character encoding conventions are used by Netscape and Microsoft Web servers when passing HTTP request and response bodies to servlets or CGI programs?

Specifically, some questions I have are:

1. Do HTTP clients (browsers) post request body content in their native encoding and specify the content encoding in one of the request headers?

2. If so, what is the standard header and its format?

3. If a client posts a request body with a character encoding that is different from the Web server's native encoding, is the body passed to CGI or Servlet processing as-is (that is, as an unaltered byte stream), or does the Web server convert it to its own native encoding?

4. If passed as-is, is the standard HTTP header that specifies the encoding of the request's body made available to the processing CGI or Servlet by the Web server?

5. If so, what CGI variable will it be in?

6. Do HTTP servers send response body content in their native encoding and specify the content encoding in one of the response headers? If so, what is the standard header and its format? Or...

7. Do HTTP servers send response body content in a client (browser) specified encoding? If so, what is the request header that specifies the possible response encodings?

8. If the HTTP server does conversion to send responses in the requested encoding, what encoding is it expecting for CGI and Servlet produced output? Does it assume CGI and Servlet output is in the server's native encoding and convert it on the fly? Or...

9. Does the HTTP server expect the CGI or Servlet to read the request headers itself and produce a byte stream containing the correct encoding?

10. If so, what CGI variable will the acceptable encodings be specified in?

11. Is the CGI or Servlet that is generating the response responsible for setting a content encoding header?

12. Are text content types the only ones that undergo conversion? If not, what are some other MIME types that may require additional character encoding headers?

Any information or pointers to where I could get some of these questions answered would be greatly appreciated.

Thanks,
Leila Schneberger
leila@windchill.com
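For reference, the Java character-stream conversion mentioned in the message can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `CharsetRoundTrip` and its `encode`/`decode` helpers are hypothetical, in-memory byte streams stand in for local files, and the two charset names are chosen purely as examples rather than as what any particular server or browser uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

// Hypothetical helper class, not part of the gateway described in the message.
public class CharsetRoundTrip {

    // Convert internal Unicode characters to bytes in the named external encoding.
    static byte[] encode(String text, String charsetName) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Writer writer = new OutputStreamWriter(bytes, charsetName);
        writer.write(text);
        writer.close(); // closing flushes any buffered characters
        return bytes.toByteArray();
    }

    // Convert bytes in the named external encoding back to Unicode characters.
    static String decode(byte[] data, String charsetName) throws IOException {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charsetName);
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        int count;
        while ((count = reader.read(buffer)) != -1) {
            text.append(buffer, 0, count);
        }
        reader.close();
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "caf\u00e9"; // "café", written with an escape to keep the source ASCII

        byte[] latin1 = encode(text, "ISO-8859-1");
        byte[] utf8 = encode(text, "UTF-8");

        // The same characters occupy a different number of bytes per encoding.
        System.out.println("ISO-8859-1 bytes: " + latin1.length);
        System.out.println("UTF-8 bytes: " + utf8.length);

        // Decoding with the matching encoding restores the original string.
        System.out.println("round trip ok: " + decode(utf8, "UTF-8").equals(text));
    }
}
```

The point of the sketch is that the receiver must know which encoding the bytes are in; decoding the UTF-8 bytes as ISO-8859-1 would not return the original string, which is why the questions about which headers and CGI variables carry that information matter.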
Received on Wednesday, 3 December 1997 11:06:59 UTC