[whatwg] A comment to character encoding declaration

From: Alexey Proskuryakov <ap@webkit.org>
Date: Fri, 7 Mar 2008 10:57:42 +0300
Message-ID: <2D279601-84A2-4A85-BEB0-91C663074AC0@webkit.org>

On Mar 3, 2008, at 6:11 PM, Jjgod Jiang wrote:

> in their header, yet they might use characters in GBK but
> not in GB2312. So I think we can suggest that clients simply
> treat encodings like these as their largest superset, for
> instance, treat GB2312 as GB18030.
>
> BTW, browsers like Firefox already seem to handle such cases
> well, but Safari/WebKit seems not to.


   In my testing, it appears that IE 7 and Firefox 2 do treat GBK as
an equivalent of GB2312, but the same cannot be said of GB18030. In
particular, the byte 0x80 and the two-byte sequence 0xA2E3 are
decoded differently under GB18030 than under GBK.

   See:
<http://nypop.com/~ap/webkit/gbk.html>
<http://nypop.com/~ap/webkit/gb18030.html>
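
   For a quick local check of the same two byte sequences outside a
browser, here is a minimal sketch using Python's built-in codecs. It
assumes Python's `gb2312`, `gbk`, and `gb18030` codecs as stand-ins;
they need not match what IE 7, Firefox 2, or WebKit actually do, and
`probe` is a hypothetical helper name, not part of any browser or
library.

```python
# Hypothetical helper: decode the same bytes under several codecs and
# record either the decoded text or None when the codec rejects them.
# Assumption: Python's codecs stand in for the browsers' decoders.
ENCODINGS = ("gb2312", "gbk", "gb18030")


def probe(data, encodings=ENCODINGS):
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # The codec considers this byte sequence invalid.
            results[name] = None
    return results


if __name__ == "__main__":
    # The two sequences singled out in the message above.
    for sample in (b"\x80", b"\xa2\xe3"):
        for name, text in probe(sample).items():
            if text is None:
                shown = "rejected"
            else:
                shown = " ".join(f"U+{ord(ch):04X}" for ch in text)
            print(f"0x{sample.hex().upper()} as {name}: {shown}")
```

   Any row where the three codecs disagree marks a sequence for which
"just treat it as GB18030" would change the result.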

   What differences are you seeing between Firefox and WebKit? It
seems that the behavior may be somewhat trickier than just treating
all encodings from the GBK family as GB18030.

- WBR, Alexey Proskuryakov
Received on Thursday, 6 March 2008 23:57:42 UTC