- From: <bugzilla@jessica.w3.org>
- Date: Tue, 12 May 2015 18:59:41 +0000
- To: www-international@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=28156

--- Comment #3 from Jungshik Shin <jshin@chromium.org> ---
(In reply to Anne from comment #1)
> I would have expected that treating them identically for decoding saves you
> a decoding table. Or would you reuse that anyway?

It does not save us anything. Both tables (GBK and GB18030) would have to be shipped (unlike Mozilla, ICU does not have two separate tables for encoding and decoding). Actually, we need additional code in Blink [1] to treat encoding and decoding differently for GBK and GB18030 (toUnicode: identical; fromUnicode: distinct), which we'd like to avoid if possible.

> They're treated identically because gbk is effectively a subset and for the
> other encodings we've found that supersets leak. I think there might be some
> anecdotal evidence here too, but not sure.

As I wrote in the previous comment, I suspect that the extent of "leak" (if any) is much smaller in gbk-gb18030 than in other cases.

[1] It might be possible to do this in ICU as well, but I don't want to make a patch to ICU (it would be hard to upstream because I don't have a good justification).

-- 
You are receiving this mail because:
You are on the CC list for the bug.
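The asymmetry discussed in the comment (decoding identical for both labels, encoding distinct) can be illustrated with a rough sketch. This is not Blink or ICU code: it uses Python's built-in `gbk` and `gb18030` codecs as stand-ins, and the helper names `decode_label` and `encode_label` are hypothetical. Python's tables may differ in detail from ICU's, so treat it only as a model of the behavior.

```python
# Sketch only: models "toUnicode identical, fromUnicode distinct"
# using Python's stdlib codecs as stand-ins for the ICU converters.

def decode_label(label: str, data: bytes) -> str:
    """Decode bytes for either label with the superset (gb18030) decoder."""
    if label not in ("gbk", "gb18030"):
        raise ValueError(f"unsupported label: {label}")
    # Both labels share one decoder, so four-byte sequences are accepted
    # even when the page is labelled gbk.
    return data.decode("gb18030")

def encode_label(label: str, text: str) -> bytes:
    """Encode text with the encoder that matches the label."""
    if label not in ("gbk", "gb18030"):
        raise ValueError(f"unsupported label: {label}")
    # The encoders stay distinct: the gbk encoder has no four-byte range
    # and fails on characters outside its repertoire.
    return text.encode(label)

two_byte = "中文".encode("gbk")          # representable in both encodings
four_byte = "😀".encode("gb18030")       # only representable in gb18030

print(decode_label("gbk", two_byte) == decode_label("gb18030", two_byte))
print(decode_label("gbk", four_byte))    # decodes despite the gbk label
try:
    encode_label("gbk", "😀")
except UnicodeEncodeError:
    print("gbk encoder rejects a character that gb18030 can encode")
```

In this model the extra branching lives in the two helpers, which is roughly the kind of per-direction special-casing the comment says Blink would prefer to avoid.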
Received on Tuesday, 12 May 2015 18:59:42 UTC