- From: Dr. Ken Lunde <notifications@github.com>
- Date: Mon, 20 Mar 2017 12:23:30 -0700
- To: whatwg/encoding <encoding@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
Received on Monday, 20 March 2017 19:24:18 UTC
@hsivonen: It is a bit premature to know exactly how the legacy encoding will change in the forthcoming GB 18030 update. Consider a couple of prototypical examples from the 24 characters that currently map to PUA code points:

0xA6D9 currently maps to U+E78D, but the non-PUA equivalent is U+FE10. The GB 18030-2005 standard indicates that U+FE10 corresponds to 0x84318236.

0xFE51 currently maps to U+E816, but the non-PUA equivalent is U+20087. The GB 18030-2005 standard indicates that U+20087 corresponds to 0x95329031.

The mapping for one of the characters in GB 18030-2000 was changed in the 2005 update, which gives us a glimpse of what is likely to change in the forthcoming update:

0xA8BC originally mapped to U+E7C7, but the 2005 update changed the mapping to U+1E3F, which originally mapped from 0x8135F437. 0x8135F437 now maps to U+E7C7.

Following this precedent, I would expect the two examples to be changed to the following:

0xA6D9 → U+FE10
0x84318236 → U+E78D
0xFE51 → U+20087
0x95329031 → U+E816

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/27#issuecomment-287869910
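The swap pattern described in the message can be sketched in a few lines of Python. This is only an illustration of the expected change, not the actual update: the `swap_mappings` and `four_byte_pointer` helpers and the dictionary layout are hypothetical names chosen here. The four-byte arithmetic assumes the usual GB 18030 linear ordering of four-byte sequences, with the supplementary planes starting at linear index 189000 (U+10000).

```python
def swap_mappings(table, two_byte, four_byte):
    """Return a copy of `table` with the code points of the two sequences exchanged.

    Mirrors the 2005 precedent: the two-byte sequence takes the non-PUA
    code point and the four-byte sequence takes over the PUA code point.
    """
    updated = dict(table)
    updated[two_byte], updated[four_byte] = table[four_byte], table[two_byte]
    return updated


def four_byte_pointer(seq):
    """Linear index of a GB 18030 four-byte sequence given as an integer."""
    b1, b2, b3, b4 = seq.to_bytes(4, "big")
    return (b1 - 0x81) * 12600 + (b2 - 0x30) * 1260 + (b3 - 0x81) * 10 + (b4 - 0x30)


# Current mappings quoted in the message (byte sequence -> code point).
current = {
    0xA6D9: 0xE78D,
    0x84318236: 0xFE10,
    0xFE51: 0xE816,
    0x95329031: 0x20087,
}

# Apply the expected swap to both example pairs.
expected = swap_mappings(current, 0xA6D9, 0x84318236)
expected = swap_mappings(expected, 0xFE51, 0x95329031)

# Cross-check the supplementary-plane example: four-byte sequences from
# linear index 189000 upward map algorithmically starting at U+10000.
supplementary = 0x10000 + four_byte_pointer(0x95329031) - 189000

for seq, cp in expected.items():
    print(f"0x{seq:X} -> U+{cp:04X}")
```

The loop prints the four expected mappings in the same form as the list in the message, and `supplementary` reproduces the U+20087 correspondence for 0x95329031.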