- From: Christopher Foo <notifications@github.com>
- Date: Wed, 13 Jun 2018 21:05:20 -0700
- To: whatwg/encoding <encoding@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
Received on Thursday, 14 June 2018 04:05:42 UTC
https://encoding.spec.whatwg.org/commit-snapshots/b04091a5f079a7bdcab5aa8c7adead554326a96c/#gb18030-decoder

> If gb18030 third is not 0x00, then:
>
> 1. If byte is not in the range 0x30 to 0x39, inclusive, then:
>    1. Prepend gb18030 second, gb18030 third, and byte to stream.
>    2. Set gb18030 first, gb18030 second, and gb18030 third to 0x00.
>    3. Return error.
> 2. Let code point be the index gb18030 ranges code point for ((gb18030 first − 0x81) × (10 × 126 × 10)) + ((gb18030 second − 0x30) × (10 × 126)) + ((gb18030 third − 0x81) × 10) + byte − 0x30.
> 3. If code point is null, return error.
> 4. Return a code point whose value is code point.

I'm having trouble understanding how, after the last step above, the decoder will accept the next byte correctly. Because `gb18030 first`, `gb18030 second`, and `gb18030 third` are not 0x00 after that step, the decoder enters the wrong branch for subsequent bytes.

For example, decoding the hex byte sequence `20 81 40 84 31 83 30` results in ` 丂︔�` (an error at the end), but the expected output is ` 丂︔`.

I think a "set `gb18030 first`, `gb18030 second`, and `gb18030 third` to 0x00" step before returning the error or the code point is missing?

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/146
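The behavior described above can be sketched in Python. This is a minimal model covering only ASCII bytes and the four-byte path (the two-byte path and the full ranges table are not modeled); the `RANGES_STUB` table is a hypothetical stand-in holding only the pointer from the example, and the sketch includes the state reset that this report argues is missing:

```python
# Pointer -> code point stub: 0x84 0x31 0x83 0x30 yields pointer 39080,
# which the example decodes to U+FE14 (a real decoder would consult the
# full "index gb18030 ranges" table instead).
RANGES_STUB = {39080: 0xFE14}

def decode_gb18030_subset(data: bytes) -> str:
    """Sketch of the gb18030 decoder's four-byte path only."""
    first = second = third = 0x00
    out = []
    for byte in data:
        if third != 0x00:
            # Fourth byte of a four-byte sequence.
            if not 0x30 <= byte <= 0x39:
                first = second = third = 0x00
                raise ValueError("invalid fourth byte")
            pointer = ((first - 0x81) * (10 * 126 * 10)
                       + (second - 0x30) * (10 * 126)
                       + (third - 0x81) * 10
                       + (byte - 0x30))
            cp = RANGES_STUB.get(pointer)
            # The reset the issue proposes: clear the state before
            # returning, so the next byte starts a fresh sequence.
            first = second = third = 0x00
            if cp is None:
                raise ValueError("pointer not in ranges")
            out.append(chr(cp))
        elif second != 0x00:
            # Third byte must be another lead-range byte.
            if 0x81 <= byte <= 0xFE:
                third = byte
            else:
                raise ValueError("invalid third byte")
        elif first != 0x00:
            # A digit selects the four-byte path; the two-byte path
            # (e.g. 0x81 0x40 -> U+4E02) is not modeled in this sketch.
            if 0x30 <= byte <= 0x39:
                second = byte
            else:
                raise ValueError("two-byte path not modeled here")
        elif byte <= 0x7F:
            out.append(chr(byte))
        elif 0x81 <= byte <= 0xFE:
            first = byte
        else:
            raise ValueError("invalid byte")
    return "".join(out)
```

With the reset in place, `decode_gb18030_subset(b"\x84\x31\x83\x30\x20")` returns U+FE14 followed by a space; if the three reset assignments after the pointer lookup are removed, the trailing 0x20 is wrongly treated as a fourth byte and decoding fails, which is exactly the behavior the byte sequence in this report exhibits.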