> Entering as input the character Unicode u9b75, the browser (IE6) encodes
> it to EUC-JP as FCE4 (%FC%E4 in the URL). Hmpff.. Here I have a problem:
> The server transcodes it to uFFFD ("REPLACEMENT CHARACTER"). I checked
> it manually and got the same results: Java (JDK 1.4.2) does not
> recognize this character, but IE6 does. I manually wrote an XML file
> encoded in EUC-JP with those characters: IE6 transcoded it to u9b75 and
> Java transcoded it to uFFFD.

\u9b75 is not a character that can be found in EUC-JP proper.
0xFC 0xE4 is not assigned in EUC-JP, and many implementations use this
and other unassigned code points as a user-defined character (UDC) area.
I guess IE (or Windows in general?) somehow tries to preserve this
non-EUC character using the UDC area.

-- 
KUROSAKA ("Kuro") Teruhiko
San Francisco, California, USA
http://www.sonic.net/~kuro/

Received on Saturday, 3 April 2004 23:49:35 GMT
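
For anyone who wants to reproduce the Java side of this, here is a rough
sketch of how the two bytes could be probed. The class and method names
(EucJpProbe, decodeLenient, decodeStrict, codeUnits) are made up for
illustration and are not from the original report. It assumes the runtime
ships an "EUC-JP" charset. The thread reports uFFFD on JDK 1.4.2; other
runtimes may behave differently, so no expected output is shown.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

class EucJpProbe {
    // Lenient decode: input the decoder cannot handle is replaced
    // (U+FFFD by default), which is what String's constructor does.
    static String decodeLenient(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    // Strict decode: returns null instead of silently substituting,
    // so an unassigned byte sequence is easy to tell apart from real text.
    static String decodeStrict(byte[] bytes, String charsetName) {
        try {
            return Charset.forName(charsetName).newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    // Formats each UTF-16 code unit as uXXXX, the notation used in this thread.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("u%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two bytes IE6 sent as %FC%E4.
        byte[] suspect = {(byte) 0xFC, (byte) 0xE4};

        System.out.println("lenient: " + codeUnits(decodeLenient(suspect, "EUC-JP")));

        String strict = decodeStrict(suspect, "EUC-JP");
        System.out.println("strict:  "
                + (strict == null ? "rejected by decoder" : codeUnits(strict)));
    }
}
```

If the strict decode is rejected while the lenient one prints uFFFD, that
would be consistent with 0xFC 0xE4 being unassigned in the EUC-JP table
that the Java runtime uses.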