- From: Paul Deuter <Paul.Deuter@plumtree.com>
- Date: Mon, 1 Oct 2001 12:47:44 -0700
- To: "Timothy Greenwood" <tgreenwood@openmarket.com>, "souravm" <souravm@infy.com>, <www-international@w3.org>
Numeric character references? What are those? Every encoding can be viewed as numbers. A sniffer will just show you the octets that went over the wire. It is up to you to interpret those octets.

Therefore, if you run the test below, you must pay careful attention to which characters you enter and know how those characters are encoded in the various encodings that might be used. For example, if you type in a Japanese character, you might want to know how that character is encoded in Shift-JIS, in UTF-8, and in UCS-2. Then when you look at the sniffer trace and see a certain sequence of octets, you can tell right away which encoding was used.

That is important because in JSP you must know the encoding in order to re-interpret the bytes properly when calling getParameter.

-Paul

Paul Deuter
Internationalization Manager
Plumtree Software
paul.deuter@plumtree.com

-----Original Message-----
From: Timothy Greenwood [mailto:tgreenwood@openmarket.com]
Sent: Monday, October 01, 2001 12:31 PM
To: 'souravm'; www-international@w3.org
Subject: RE: ISO-8859-1

In testing our product I found that with Internet Explorer I could enter characters outside the declared charset. IE translated them into numeric character references. So everything is legal: the output characters are all Latin1 (ASCII even), but are correctly translated by the browser.

Does a view source of the resulting page show NCRs?

- Tim

-----Original Message-----
From: souravm [mailto:souravm@infy.com]
Sent: Monday, October 01, 2001 8:42 AM
To: www-international@w3.org
Subject: ISO-8859-1

Hi,

Here is a small JSP page which I used as a proof of concept for a multilingual project. The interesting observation is that even if I put ISO-8859-1 as the charset in the meta tag, it works for all languages. I tested it for Japanese, Korean, Arabic and French (using an IME on Windows 2000). As far as I know, ISO-8859-1 is supposed to cover only Western European languages. I'm surprised to find that it even supports the Asian languages.
Can anyone please explain to me how it can support the Asian languages?

Regards,
Sourav

----------------------------------------------------------------------

The JSP file name is i18na.jsp

<%@ page import="java.util.*"%>
<%@ page import="java.io.*"%>
<%
String ucStr = request.getParameter("jap");
%>
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
<TITLE></TITLE>
</HEAD>
<BODY topmargin="0" marginheight="0" leftmargin="0" marginwidth="0">
String = <%= ucStr%>
<FORM name="frmText" action="http://192.168.119.15:5052/NASApp/fortune/i18na.jsp" method="post">
<TABLE border="0" cellspacing="0" cellpadding="5" width="200">
<TR>
<TD><INPUT TYPE="text" NAME="jap" SIZE="30" value=""></TD>
<TD><INPUT TYPE="submit" NAME="Submit" VALUE="button"></TD>
</TR>
<TR>
</TR>
</TABLE>
</FORM>
</BODY>
</HTML>
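
The points raised in this thread can be tried out with a small stand-alone Java sketch. It is only an illustration, not code from any of the posters: the class name EncodingProbe and its three methods are hypothetical, U+3042 (HIRAGANA LETTER A) is an arbitrary sample character, UTF-16BE stands in for UCS-2, and the re-interpretation step assumes the container decoded the request parameter as ISO-8859-1. Shift_JIS is an optional charset in some Java runtimes, so the demo checks for it first.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EncodingProbe {

    /** Hex dump of the octets that text produces in the named charset,
     *  i.e. what a sniffer would show for that encoding. */
    static String hexOctets(String text, String charsetName) {
        byte[] octets = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : octets) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Recover a string that was decoded as ISO-8859-1 (assumed container
     *  behaviour) although the browser actually sent actualCharset. */
    static String reinterpret(String misdecoded, String actualCharset) {
        // ISO-8859-1 maps every byte to one char, so this restores the octets.
        byte[] original = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, Charset.forName(actualCharset));
    }

    /** Expand decimal numeric character references such as &#12354;
     *  Invalid or oversized references are left untouched. */
    static String decodeNcrs(String s) {
        Matcher m = Pattern.compile("&#(\\d{1,7});").matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int cp = Integer.parseInt(m.group(1));
            String repl = Character.isValidCodePoint(cp)
                    ? new String(Character.toChars(cp))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(repl));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String sample = "\u3042"; // HIRAGANA LETTER A

        System.out.println("UTF-8:    " + hexOctets(sample, "UTF-8"));    // E3 81 82
        System.out.println("UTF-16BE: " + hexOctets(sample, "UTF-16BE")); // 30 42
        if (Charset.isSupported("Shift_JIS")) {
            System.out.println("Shift_JIS: " + hexOctets(sample, "Shift_JIS")); // 82 A0
        }

        // Simulate a parameter that arrived as UTF-8 but was decoded as Latin-1.
        String misdecoded = new String(sample.getBytes(StandardCharsets.UTF_8),
                                       StandardCharsets.ISO_8859_1);
        System.out.println(reinterpret(misdecoded, "UTF-8").equals(sample)); // true

        // What a browser may submit from a Latin-1 form: pure ASCII.
        System.out.println(decodeNcrs("&#12354;").equals(sample)); // true
    }
}
```

If the sniffer trace for the form above shows only ASCII octets of the form &#12354;, then the browser sent numeric character references and no byte re-interpretation is needed; if it shows E3 81 82 or 82 A0, the parameter has to be re-decoded with the matching charset.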
Received on Monday, 1 October 2001 15:47:07 UTC