[Bug 28156] Separate GBK and GB18030 even for decoding (toUnicode)

From: <bugzilla@jessica.w3.org>
Date: Thu, 12 Mar 2015 11:50:29 +0000
To: www-international@w3.org
Message-ID: <bug-28156-4285-imjldCHsX9@http.www.w3.org/Bugs/Public/>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=28156

Anne <annevk@annevk.nl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hsivonen@hsivonen.fi,
                   |                            |smontagu@smontagu.org

--- Comment #1 from Anne <annevk@annevk.nl> ---
I would have expected that treating them identically for decoding saves you a
decoding table. Or would you reuse that anyway?

They're treated identically because gbk is effectively a subset of gb18030,
and for the other encodings we've found that supersets leak: content in the
superset encoding turns up labeled as the subset. I think there might be some
anecdotal evidence of that here too, but I'm not sure.
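
The subset relationship can be illustrated with a minimal sketch. It assumes
Python's built-in "gbk" and "gb18030" codecs, which are not the decoders
defined by the Encoding Standard and may differ from them in edge cases. The
helper name decode_either is hypothetical and exists only for this example.

```python
# Illustration only: Python's built-in codecs stand in for the real decoders.

def decode_either(data: bytes, label: str) -> str:
    """Hypothetical helper: serve both labels with the gb18030 decoder."""
    if label.lower() not in ("gbk", "gb18030"):
        raise ValueError(f"unsupported label: {label}")
    return data.decode("gb18030")

# Two common Han characters (U+4E2D, U+6587), encoded as two-byte GBK.
text = "\u4e2d\u6587"
gbk_bytes = text.encode("gbk")

# Bytes that are valid GBK decode to the same text with either decoder.
same_result = gbk_bytes.decode("gbk") == decode_either(gbk_bytes, "gbk")

# A four-byte GB18030 sequence (here U+20000) is outside GBK proper.
four_byte = "\U00020000".encode("gb18030")

# If such bytes "leak" into a page labeled gbk, a strict GBK-only decoder
# rejects them, while the shared decoder still recovers the character.
try:
    four_byte.decode("gbk")
    gbk_only_accepts = True
except UnicodeDecodeError:
    gbk_only_accepts = False

leaked = decode_either(four_byte, "gbk")
```

Under these assumptions, decoding both labels with one decoder costs nothing
for well-formed GBK input and tolerates gb18030 content mislabeled as gbk.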

-- 
You are receiving this mail because:
You are on the CC list for the bug.
Received on Thursday, 12 March 2015 11:50:30 UTC
