- From: Sinha, Raj (Raj) <rajsinha@avaya.com>
- Date: Tue, 22 Jul 2003 11:22:46 -0400
- To: <www-international@w3.org>
- Message-ID: <8CA1128D59AD27429985B397118CEDDF0120895D@nj7460avexu1.global.avaya.com>
Posting a response to this question that was sent only to me (I don't know why), so I am forwarding it on behalf of Chris.

Raj Sinha

Raj,

The only response encoding you can rely on is from browser HTTP POST requests, where one of the headers tells the server what encoding has been used.

The encoding used in HTTP GET is undefined in the standards for characters outside of 7-bit ASCII. Anyone who says it is the encoding of the page is correct but misleading, because the browser's user can manually decide what that encoding is (changing whatever was declared in the transmitted page). A web server can therefore have no certainty about the encoding used in the %hh escapes in a GET, which is how non-ASCII characters are sent.

You might find the following helpful: http://jetty.mortbay.com/jetty/doc/international.html

My advice: never use GET for sending a form containing international characters, unless it's absolutely unavoidable. When using POST, use the header to find out what encoding was used.

If you are using Sun Servlets, the servlet container does all the decoding for you and delivers the 16-bit Unicode characters to your program.

Chris Haynes

----- Original Message -----
From: Sinha, Raj (Raj) <mailto:rajsinha@avaya.com>
To: www-international@w3.org
Sent: Monday, July 21, 2003 6:31 PM
Subject: what should the charset be in the response to the server

OK, this is a very basic question, but I cannot seem to find a clear answer anywhere.

I wrote a simple web browser which is on its way to becoming internationalized (UTF-8 support etc.). What I fail to understand is this:

The web browser receives a page in, let's say, UTF-8. It then converts everything to UTF-16 (which is its internal choice of data representation). What charset should the response to the server be in? I would guess the response should be in the original charset, i.e. UTF-8.

Consider this scenario:

- The browser requests a page, indicating its preferences through the Accept-Charset header.
- The server sends back a page whose content type declares the charset as UTF-8.
- The browser parses the page and converts everything into UTF-16.
- If there is a form, the user can enter data into it, which is again converted to the internal choice of UTF-16.
- The browser is ready to send the form results back to the server. What should the encoding be here?

Thanks for any help,
raj
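
The ambiguity described in the reply can be illustrated with a small sketch. It assumes Python's standard library, takes the header mentioned in the reply to be Content-Type with a charset parameter, and uses hypothetical helper names (encode_form_value, decode_form_value, charset_from_content_type) and a made-up sample value. It is an illustration of the problem, not how any particular browser or servlet container is implemented.

```python
from email.message import Message
from typing import Optional
from urllib.parse import quote, unquote


def encode_form_value(text: str, charset: str) -> str:
    """Percent-encode text as %hh escapes, using the given charset (hypothetical helper)."""
    return quote(text, safe="", encoding=charset)


def decode_form_value(escaped: str, assumed_charset: str) -> str:
    """Decode %hh escapes under the charset the server assumes was used."""
    return unquote(escaped, encoding=assumed_charset, errors="replace")


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


if __name__ == "__main__":
    value = "café"  # made-up form input

    # The same text yields different escapes depending on the charset used.
    print(encode_form_value(value, "utf-8"))       # caf%C3%A9
    print(encode_form_value(value, "iso-8859-1"))  # caf%E9

    # A server that guesses the wrong charset silently gets the wrong text.
    print(decode_form_value("caf%C3%A9", "iso-8859-1"))  # cafÃ©

    # When the request carries a charset, the server no longer has to guess.
    header = "application/x-www-form-urlencoded; charset=UTF-8"
    print(charset_from_content_type(header))  # utf-8
```

The escapes themselves carry no indication of which charset produced them, which is why the reply recommends relying on a declared encoding rather than on the escapes in a GET.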
Received on Tuesday, 22 July 2003 11:22:47 UTC