[Bug 28740] GB18030-2000 and GB18030-2005 : Decide what to do about their differences, especially PUA codepoints in GB18030-2000

https://www.w3.org/Bugs/Public/show_bug.cgi?id=28740

--- Comment #5 from Masatoshi Kimura <VYV03354@nifty.ne.jp> ---
(In reply to Jungshik Shin from comment #1)
> Created attachment 1612 [details]
> GB18030-2000 vs GB18030-2005 : PUA =>regular
> 
> The attachment lists all the PUA code points for which Simsun (a font on
> Windows) has glyphs.
> 
> The first column is the GB18030 byte sequence (2-byte). The second is the
> GB18030-2000 Unicode mapping (PUA) and the third is the GB18030-2005 Unicode
> mapping (non-PUA), presumably, if glibc's iconv is correct [1].
> 
> Simsun has glyphs for the PUA code points, but it does not cover the regular
> non-PUA code points (3rd column).
> 
> A new Simplified Chinese font on Windows (Microsoft Yahei) does cover the
> non-PUA code points (3rd column), while it does not cover the PUA code points
> (2nd column).
> 
> 
> [1] At least for U+FE10 .. U+FE19, it's very likely that it's correct. Those
> characters were added to Unicode 4.1 in March 2005 (see
> http://unicode.org/cldr/utility/list-unicodeset.jsp?a=\p{subhead=Glyphs%20for%20vertical%20variants} ).

No, the GB 18030-2005 standard did NOT change those mappings. glibc is wrong.
The only change from GB 18030-2000 to GB 18030-2005 is that the mappings of two
byte sequences were swapped, so that LATIN SMALL LETTER M WITH ACUTE (U+1E3F)
is now mapped from the two-byte sequence instead of the four-byte one. Here is
Table E.2, taken and translated from the standard:
> GB 18030   -2005  -2000
> 0xA8BC     U+1E3F U+E7C7
> 0x8135F437 U+E7C7 U+1E3F
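
To make the swap concrete, here is a minimal Python sketch of the two rows
quoted above. The names TABLE_E2 and lookup are made up for illustration; they
are not from the standard or from any library, and the sketch covers only
these two byte sequences, not a full GB 18030 decoder.

```python
import unicodedata

# The two rows of Table E.2 as quoted above.
# Key: GB 18030 byte sequence.
# Value: (code point in the 2005 edition, code point in the 2000 edition).
TABLE_E2 = {
    bytes.fromhex("A8BC"): (0x1E3F, 0xE7C7),
    bytes.fromhex("8135F437"): (0xE7C7, 0x1E3F),
}


def lookup(seq: bytes, edition: int) -> int:
    """Return the code point for one of the two swapped byte sequences.

    `edition` must be 2000 or 2005. Hypothetical helper for illustration only.
    """
    if edition not in (2000, 2005):
        raise ValueError("edition must be 2000 or 2005")
    cp_2005, cp_2000 = TABLE_E2[seq]
    return cp_2005 if edition == 2005 else cp_2000


def is_pua(cp: int) -> bool:
    """True if the code point is a private-use character (category Co)."""
    return unicodedata.category(chr(cp)) == "Co"


if __name__ == "__main__":
    for seq in TABLE_E2:
        for edition in (2000, 2005):
            cp = lookup(seq, edition)
            kind = "PUA" if is_pua(cp) else "non-PUA"
            print(f"0x{seq.hex().upper()} in GB 18030-{edition}: U+{cp:04X} ({kind})")
```

In other words, the same two code points appear in both editions; only the
byte sequences that reach them are exchanged.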

Received on Wednesday, 19 August 2015 14:51:19 UTC