Re: [whatwg/encoding] Support GB18030-2022 (PR #335)

@annevk commented on this pull request.



> +  <p>If <a for="gb18030 encoder">is GBK</a> is false and there is a row in the table below whose
+  first column is <var>code point</var>, then return the two bytes on the same row listed in the
+  second column:

As long as we layer this on top of the existing gb18030 index, we can remove all the U+EXXX PUA entries below as they are already in that table and map to the correct bytes.

An implementation that updates that table itself would still need these U+EXXX mappings though, as evidenced by WebKit (which I assume has a completely separate implementation for GBK).

With those U+EXXX mappings removed, however, it probably makes the most sense to create a small "gb18030 2022 index" that can be reused across the decoder and encoder as a special case when GBK is not in use. It's a bit more work, but overall it would present the data more neatly.
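
A rough sketch of what such a shared index could look like, in Python rather than spec prose. All names here (`GB18030_2022_PAIRS`, `encode_override`, `decode_override`) are hypothetical. The two entries are assumed examples of the 2022 changes and would need checking against the table in this PR. Applying the "GBK is not in use" check on the decoder side is also an assumption.

```python
# Hypothetical "gb18030 2022 index": one list of (two bytes, code point)
# pairs from which both directions are derived. The entries below are
# assumed examples only; verify them against the table in the PR.
GB18030_2022_PAIRS = [
    ((0xA6, 0xD9), 0xFE10),
    ((0xFE, 0x59), 0x9FB4),
]

# Build both lookup directions from the single list so they cannot drift.
DECODE_2022 = {bytes(b): cp for b, cp in GB18030_2022_PAIRS}
ENCODE_2022 = {cp: bytes(b) for b, cp in GB18030_2022_PAIRS}


def encode_override(code_point, is_gbk):
    """Return the two-byte override, or None to fall through to the
    existing gb18030 index lookup."""
    if is_gbk:
        return None
    return ENCODE_2022.get(code_point)


def decode_override(lead, trail, is_gbk):
    """Return the overriding code point, or None to fall through to the
    existing gb18030 index lookup."""
    if is_gbk:
        return None
    return DECODE_2022.get(bytes((lead, trail)))
```

Both functions return `None` when the special case does not apply, so the existing index lookup stays untouched and the 2022 data lives in one place.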

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/pull/335#pullrequestreview-2310066157

Message ID: <whatwg/encoding/pull/335/review/2310066157@github.com>

Received on Tuesday, 17 September 2024 15:13:05 UTC