Re: [whatwg/encoding] Reflect changes in GB 18030-2022 (Issue #312)

These are the Unicode recommendations:
```
# GB 18030 & Unicode Standard Transcoding Recommendations
#
# Transcoding Between GB 18030 & Unicode Standard
0xA6D9 <-> U+FE10
0xA6DA <-> U+FE12
0xA6DB <-> U+FE11
0xA6DC <-> U+FE13
0xA6DD <-> U+FE14
0xA6DE <-> U+FE15
0xA6DF <-> U+FE16
0xA6EC <-> U+FE17
0xA6ED <-> U+FE18
0xA6F3 <-> U+FE19
0xFE59 <-> U+9FB4
0xFE61 <-> U+9FB5
0xFE66 <-> U+9FB6
0xFE67 <-> U+9FB7
0xFE6D <-> U+9FB8
0xFE7E <-> U+9FB9
0xFE90 <-> U+9FBA
0xFEA0 <-> U+9FBB
#
# Transcoding From Unicode Standard to GB 18030
U+E78D -> 0xA6D9
U+E78E -> 0xA6DA
U+E78F -> 0xA6DB
U+E790 -> 0xA6DC
U+E791 -> 0xA6DD
U+E792 -> 0xA6DE
U+E793 -> 0xA6DF
U+E794 -> 0xA6EC
U+E795 -> 0xA6ED
U+E796 -> 0xA6F3
U+E81E -> 0xFE59
U+E826 -> 0xFE61
U+E82B -> 0xFE66
U+E82C -> 0xFE67
U+E832 -> 0xFE6D
U+E843 -> 0xFE7E
U+E854 -> 0xFE90
U+E864 -> 0xFEA0
#
# Transcoding From GB 18030 to Unicode Standard
0x82359037 -> U+9FB4
0x82359038 -> U+9FB5
0x82359039 -> U+9FB6
0x82359130 -> U+9FB7
0x82359131 -> U+9FB8
0x82359132 -> U+9FB9
0x82359133 -> U+9FBA
0x82359134 -> U+9FBB
0x84318236 -> U+FE10
0x84318237 -> U+FE11
0x84318238 -> U+FE12
0x84318239 -> U+FE13
0x84318330 -> U+FE14
0x84318331 -> U+FE15
0x84318332 -> U+FE16
0x84318333 -> U+FE17
0x84318334 -> U+FE18
0x84318335 -> U+FE19
#
# EOF
```
Notably, these only affect gb18030, not GBK. Since index-gb18030 is shared by both encodings, I don't think we want to change it in the Encoding Standard, although we might want to update the note saying that it reflects GB18030-2005 in some way.

Instead, we would have to patch the gb18030 encoder and decoder directly. That is likely somewhat ugly, but it does not seem too bad. I think I would also duplicate the code points that can be transcoded in both directions for simplicity, although we could also create a mini index for them.
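
To make the patching idea concrete, here is a rough sketch of the "mini index" variant in Python. It is not proposed spec text: the names `TWO_BYTE`, `PUA_ENCODE`, `FOUR_BYTE_DECODE`, `encode_char`, and `decode_unit` are made up for illustration, and Python's built-in `gb18030` codec is only a stand-in for the unpatched encoder and decoder (its own mappings for these byte sequences may differ). The table entries are copied from the recommendations above.

```python
# Two-byte sequences that transcode in both directions.
TWO_BYTE = {
    0xA6D9: 0xFE10, 0xA6DA: 0xFE12, 0xA6DB: 0xFE11, 0xA6DC: 0xFE13,
    0xA6DD: 0xFE14, 0xA6DE: 0xFE15, 0xA6DF: 0xFE16, 0xA6EC: 0xFE17,
    0xA6ED: 0xFE18, 0xA6F3: 0xFE19, 0xFE59: 0x9FB4, 0xFE61: 0x9FB5,
    0xFE66: 0x9FB6, 0xFE67: 0x9FB7, 0xFE6D: 0x9FB8, 0xFE7E: 0x9FB9,
    0xFE90: 0x9FBA, 0xFEA0: 0x9FBB,
}

# Private Use Area code points: Unicode -> GB 18030 only.
PUA_ENCODE = {
    0xE78D: 0xA6D9, 0xE78E: 0xA6DA, 0xE78F: 0xA6DB, 0xE790: 0xA6DC,
    0xE791: 0xA6DD, 0xE792: 0xA6DE, 0xE793: 0xA6DF, 0xE794: 0xA6EC,
    0xE795: 0xA6ED, 0xE796: 0xA6F3, 0xE81E: 0xFE59, 0xE826: 0xFE61,
    0xE82B: 0xFE66, 0xE82C: 0xFE67, 0xE832: 0xFE6D, 0xE843: 0xFE7E,
    0xE854: 0xFE90, 0xE864: 0xFEA0,
}

# Four-byte sequences: GB 18030 -> Unicode only.
FOUR_BYTE_DECODE = {
    0x82359037: 0x9FB4, 0x82359038: 0x9FB5, 0x82359039: 0x9FB6,
    0x82359130: 0x9FB7, 0x82359131: 0x9FB8, 0x82359132: 0x9FB9,
    0x82359133: 0x9FBA, 0x82359134: 0x9FBB, 0x84318236: 0xFE10,
    0x84318237: 0xFE11, 0x84318238: 0xFE12, 0x84318239: 0xFE13,
    0x84318330: 0xFE14, 0x84318331: 0xFE15, 0x84318332: 0xFE16,
    0x84318333: 0xFE17, 0x84318334: 0xFE18, 0x84318335: 0xFE19,
}

# Derive both lookup directions from the tables above, so the
# bidirectional entries are written down only once.
ENCODE_OVERRIDES = {chr(cp): b.to_bytes(2, "big") for b, cp in TWO_BYTE.items()}
ENCODE_OVERRIDES.update(
    {chr(cp): b.to_bytes(2, "big") for cp, b in PUA_ENCODE.items()}
)
DECODE_OVERRIDES = {b.to_bytes(2, "big"): chr(cp) for b, cp in TWO_BYTE.items()}
DECODE_OVERRIDES.update(
    {b.to_bytes(4, "big"): chr(cp) for b, cp in FOUR_BYTE_DECODE.items()}
)


def encode_char(ch: str) -> bytes:
    """Encode one code point, consulting the overrides first."""
    patched = ENCODE_OVERRIDES.get(ch)
    if patched is not None:
        return patched
    # Stand-in for the unpatched gb18030 encoder.
    return ch.encode("gb18030")


def decode_unit(unit: bytes) -> str:
    """Decode one complete 1-, 2-, or 4-byte sequence, overrides first."""
    patched = DECODE_OVERRIDES.get(bytes(unit))
    if patched is not None:
        return patched
    # Stand-in for the unpatched gb18030 decoder.
    return unit.decode("gb18030")
```

With these tables, both U+FE10 and U+E78D encode to 0xA6D9, while 0xA6D9 and 0x84318236 both decode to U+FE10. Duplicating the bidirectional entries in two hand-written tables, as suggested above, would behave the same way; deriving them from one table just avoids the two copies drifting apart.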

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/312#issuecomment-2354764499

Message ID: <whatwg/encoding/issues/312/2354764499@github.com>

Received on Tuesday, 17 September 2024 07:34:56 UTC