Re: [whatwg/encoding] If gb18030 is revised, consider aligning the Encoding Standard (#27)

@hsivonen: It is a bit premature to know exactly what changes to the legacy encoding will change in the forthcoming GB 18030 update.

Consider a couple prototypical examples from the 24 characters that currently map to PUA code points:

0xA6D9 currently maps to U+E78D, but the non-PUA equivalent is U+FE10. The GB 18030-2005 standard indicates that U+FE10 corresponds to 0x84318236.

0xFE51 currently maps to U+E816, but the non-PUA equivalent is U+20087. The GB 18030-2005 standard indicates that U+20087 corresponds to 0x95329031.

The mapping for one of the characters in GB 18030-2000 was changed in the 2005 update, which gives us a glimpse about what is likely to change in the forthcoming update:

0xA8BC originally mapped to U+E7C7, but the 2005 update changed the mapping to U+1E3F, which originally mapped from 0x8135F437. 0x8135F437 now maps to U+E7C7. Following this precedent, I would expect the two examples to be changed change to the following:

0xA6D9 → U+FE10
0x84318236 → U+E78D

0xFE51 → U+20087
0x95329031 → U+E816

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/27#issuecomment-287869910

Received on Monday, 20 March 2017 19:24:18 UTC