[whatwg] iso-2022-jp and octets over 0x7E from Anne van Kesteren on 2012-01-08 (public-whatwg-archive@w3.org from January 2012)

From: Anne van Kesteren <annevk@opera.com>
Date: Sun, 08 Jan 2012 20:49:56 +0100
Message-ID: <op.v7r6pit064w2qv@annevk-macbookpro.local>

On Sun, 08 Jan 2012 15:32:47 +0100, Anne van Kesteren <annevk at opera.com>  
wrote:
> On Sun, 08 Jan 2012 01:37:14 +0100, NARUSE, Yui <naruse at airemix.jp>  
> wrote:
>> == iso-2022-jp
>> === The to Unicode algorithm
>> ==== Based on iso-2022-jp state
>> ===== ASCII state
>> ====== Based on octet:
>> ======= Otherwise
>>> If the fatal flag is set, return failure.
>>> Otherwise, emit the fallback code point.
>>
>> Just FYI, IE and Opera show these bytes as Katakana.
>> If octet is greater than 0xA0 and less than 0xE0, value is octet +  
>> 0xFEC0.
>>
>> Moreover IE shows any shift_jis characters here.
>> It seems that IE uses the same converter for both iso-2022-jp and shift_jis.
>
> I have filed a bug on Opera to become more strict like WebKit/Gecko. If  
> there is some evidence that approach is wrong though, we can turn it  
> around.

So just to be sure, I checked again, and in Opera you can only get the  
"special" single-octet behavior if you activate a particular state first. If  
you are in the ASCII state, Opera will simply emit the octet unless it is  
0x1B (ESC), so maybe there is a system font that does something special for  
those characters? Or maybe you meant something else?
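
To make the two behaviors under discussion concrete, here is a rough sketch
of how a single octet in the ASCII state could be handled. It is not taken
from any browser or from the spec text; the function name, the
`lenient_katakana` and `fatal` parameters, the use of U+FFFD as the fallback
code point, and the pass-through of everything up to 0x7F are all
assumptions made for illustration. The only rule taken from the thread is
the "octet + 0xFEC0" mapping for octets greater than 0xA0 and less than 0xE0.

```python
FALLBACK = "\uFFFD"  # assumed fallback code point (REPLACEMENT CHARACTER)


def decode_ascii_state_octet(octet, lenient_katakana=False, fatal=False):
    """Decode one octet seen in the iso-2022-jp ASCII state.

    Hypothetical helper: returns a one-character string, or None to
    signal failure when `fatal` is set.
    """
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0x00-0xFF")
    if octet == 0x1B:
        # ESC starts an escape sequence; state switching is out of scope here.
        raise ValueError("ESC is not handled in this sketch")
    if octet <= 0x7F:
        # Assumption: plain ASCII octets are emitted unchanged.
        return chr(octet)
    if lenient_katakana and 0xA0 < octet < 0xE0:
        # Mapping quoted in the thread: 0xA1..0xDF -> U+FF61..U+FF9F
        # (halfwidth Katakana block).
        return chr(octet + 0xFEC0)
    # Strict path from the quoted algorithm: failure or fallback.
    if fatal:
        return None
    return FALLBACK
```

With `lenient_katakana=True`, 0xB1 comes out as U+FF71 (halfwidth Katakana
A); with the default strict behavior the same octet yields the fallback code
point, or a failure if `fatal` is set.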


-- 
Anne van Kesteren
http://annevankesteren.nl/

Received on Sunday, 8 January 2012 11:49:56 UTC