- From: <bugzilla@jessica.w3.org>
- Date: Sun, 20 Apr 2014 04:56:43 +0000
- To: www-international@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=25396

            Bug ID: 25396
           Summary: Incorrect mapping in index18030.txt
           Product: WHATWG
           Version: unspecified
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Encoding
          Assignee: annevk@annevk.nl
          Reporter: ashtuchkin@gmail.com
        QA Contact: sideshowbarker+encodingspec@gmail.com
                CC: mike@w3.org, www-international@w3.org

Input sequence A3 A0 in GB18030 is decoded as U+E5E5 by iconv and ICU. For example:

> printf "\xA3\xA0" | iconv -f gb18030 -t utf-16le | hexdump
0000000 e5 e5

ICU table:
http://source.icu-project.org/repos/icu/data/trunk/charset/data/xml/gb-18030-2000.xml

Using the algorithm given in http://encoding.spec.whatwg.org/#gb18030-encoder, A3 A0 results in pointer 6555, which is mapped to U+3000 IDEOGRAPHIC SPACE in index18030.txt.

I believe this mapping is incorrect and should be replaced with U+E5E5.

--
You are receiving this mail because:
You are on the CC list for the bug.
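
For reference, the pointer value 6555 quoted in the report can be reproduced with a minimal sketch of the two-byte pointer arithmetic. This assumes the formula from the Encoding Standard's gb18030 decoder (190 trail positions per lead byte, with 0x7F skipped). The function name `gb18030_two_byte_pointer` is hypothetical, and the sketch does not look anything up in the index file itself.

```python
def gb18030_two_byte_pointer(lead: int, trail: int) -> int:
    """Return the index pointer for a two-byte GB18030 sequence.

    Assumption: follows the two-byte step of the Encoding Standard's
    gb18030 decoder; four-byte sequences are out of scope here.
    """
    if not 0x81 <= lead <= 0xFE:
        raise ValueError(f"lead byte out of range: {lead:#04x}")
    if not (0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFE):
        raise ValueError(f"trail byte out of range: {trail:#04x}")
    # Trail bytes skip 0x7F, so bytes above it are shifted by one extra.
    offset = 0x40 if trail < 0x7F else 0x41
    return (lead - 0x81) * 190 + (trail - offset)


if __name__ == "__main__":
    # The sequence from the report: A3 A0.
    print(gb18030_two_byte_pointer(0xA3, 0xA0))  # 6555
```

Worked by hand: (0xA3 - 0x81) * 190 = 34 * 190 = 6460, and 0xA0 - 0x41 = 95, giving 6460 + 95 = 6555, which matches the pointer cited above.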
Received on Sunday, 20 April 2014 04:56:44 UTC