- From: Michael Jansson <mjan@em2-solutions.com>
- Date: Wed, 23 Jul 2003 08:49:41 +0200
- To: "'Sinha, Raj (Raj)'" <rajsinha@avaya.com>, www-international@w3.org
- Message-ID: <CFDB95B7A60B714698C8E065A0759B0D1B68@gateway.em2-solutions.com>
I think Chris meant to say HTTP *POST* (HTTP PUT is something else). The remark is important, though: HTTP POST is to be preferred over HTTP GET. HTTP GET produces URL-encoded data that can be ambiguous for certain browsers and certain charsets.

Regards,
em2 Solutions
Michael Jansson

-----Original Message-----
From: Sinha, Raj (Raj) [mailto:rajsinha@avaya.com]
Sent: Tuesday, July 22, 2003 5:23 PM
To: www-international@w3.org
Subject: FW: what should the charset be in the response to the server

Posting a response to this question that was sent only to me (I don't know why), so I am forwarding it on behalf of Chris.

Raj Sinha

Raj,

The only response encoding you can rely on is from browser HTTP PUT commands, where one of the headers tells the server what encoding has been used. The encoding used in HTTP GET is undefined in the standards for characters outside of 7-bit ASCII.

Anyone who says it is the encoding of the page is correct but misleading, as the browser's user can manually decide what that encoding is (changing whatever was declared in the transmitted page), so a web server can have no certainty about the encoding used in the %hh escapes in a GET, which is how non-ASCII is sent.

You might find the following helpful:
http://jetty.mortbay.com/jetty/doc/international.html

My advice: never use GET for sending a form containing international characters, unless it's absolutely unavoidable. When using PUT, use the header to find out what encoding was used.

If you are using Sun Servlets, the servlet container does all the decoding for you and delivers the 16-bit Unicode characters to your program.
Chris Haynes

----- Original Message -----
From: Sinha, Raj (Raj) <rajsinha@avaya.com>
To: www-international@w3.org
Sent: Monday, July 21, 2003 6:31 PM
Subject: what should the charset be in the response to the server

OK, this is a very basic question, but I cannot seem to find a clear answer anywhere.

I wrote a simple web browser which is on its way to becoming internationalized (UTF-8 support etc.). What I fail to understand is this:

The web browser receives a page in, let's say, UTF-8. It then converts everything to UTF-16 (which is its internal choice of data representation). What charset should the response to the server be in? I would guess the response should be in the original charset, i.e. UTF-8.

Consider this scenario:

- Browser requests a page, indicating its preferences through the Accept-Charset header.
- Server sends back a page with content type = UTF-8.
- Browser parses the page and converts everything into UTF-16.
- If there is a form, the user can enter data into it, which is again converted to the internal choice of UTF-16.
- The browser is ready to send the form results back to the server. What should the encoding be here?

Thanks for any help,
raj
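
The ambiguity of %hh escapes discussed in this thread can be illustrated with a minimal Python sketch. It is not taken from any of the messages: the helper name `charset_from_content_type` and the sample Content-Type value are assumptions made for illustration, and real browsers may not send a charset parameter at all, in which case the server still has to guess.

```python
from email.message import Message
from urllib.parse import quote, unquote


def charset_from_content_type(header_value, default=None):
    """Hypothetical helper: return the charset parameter of a
    Content-Type header value, or `default` if none is declared."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset(default)


# The same character yields different %hh escapes under different charsets.
text = "é"
utf8_escaped = quote(text, encoding="utf-8")         # "%C3%A9"
latin1_escaped = quote(text, encoding="iso-8859-1")  # "%E9"

# A server that guesses the wrong charset gets the wrong text, with no error.
wrong_guess = unquote(utf8_escaped, encoding="iso-8859-1")  # "Ã©"

# When the request declares a charset (assumed header value below),
# the server can decode without guessing.
declared = charset_from_content_type(
    "application/x-www-form-urlencoded; charset=UTF-8"
)
decoded = unquote(utf8_escaped, encoding=declared)  # "é"
```

The escapes themselves carry no charset information, so `%C3%A9` decodes without complaint under either assumption; only an out-of-band declaration tells the server which reading is the intended one.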
Received on Wednesday, 23 July 2003 02:49:42 UTC