Re: EUC-JP encoding - Bug in IE6?

> When the Unicode character U+9B75 is entered as input, the browser (IE6)
> encodes it in EUC-JP as FCE4 (%FC%E4 in the URL). Here is my problem:
> the server transcodes it to U+FFFD ("REPLACEMENT CHARACTER"). I checked
> it manually and got the same results: Java (JDK 1.4.2) does not
> recognize this character, but IE6 does. I manually wrote an XML file
> encoded in EUC-JP with those characters: IE6 transcoded it to U+9B75 and
> Java transcoded it to U+FFFD.

U+9B75 is not a character found in EUC-JP proper.
0xFC 0xE4 is not assigned in EUC-JP, and many implementations use this
and other unassigned code points as a user-defined character (UDC) area.
I guess IE (or Windows in general?) somehow tries to preserve this
non-EUC character by placing it in the UDC area.
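
The gap can be checked with a small script. This is only a sketch: it
uses Python's built-in "euc_jp" codec as a stand-in for a decoder that
follows the standard mapping (it is not the JDK 1.4.2 converter), and
the helper names euc_row_cell, is_assigned and decode_lenient are made
up for illustration.

```python
# The byte pair IE6 sent for U+9B75, as reported in the quoted message.
RAW = bytes([0xFC, 0xE4])


def euc_row_cell(raw: bytes) -> tuple:
    """Return the (row, cell) position of a two-byte EUC-JP sequence.

    EUC-JP stores a JIS X 0208 position as (0xA0 + row, 0xA0 + cell).
    """
    return raw[0] - 0xA0, raw[1] - 0xA0


def is_assigned(raw: bytes, codec: str = "euc_jp") -> bool:
    """True if a strict decoder for `codec` accepts the byte sequence."""
    try:
        raw.decode(codec)
    except UnicodeDecodeError:
        return False
    return True


def decode_lenient(raw: bytes, codec: str = "euc_jp") -> str:
    """Decode, substituting U+FFFD for anything the codec cannot map."""
    return raw.decode(codec, errors="replace")


if __name__ == "__main__":
    print("row/cell:", euc_row_cell(RAW))
    print("assigned in euc_jp:", is_assigned(RAW))
    print("lenient decode:", [hex(ord(c)) for c in decode_lenient(RAW)])
```

The byte pair lands in row 92, which the standard table leaves empty,
so a strict decoder rejects it and a lenient one substitutes U+FFFD,
matching what the server did.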

-- 
KUROSAKA ("Kuro") Teruhiko
San Francisco, California, USA
http://www.sonic.net/~kuro/

Received on Saturday, 3 April 2004 23:49:35 UTC