Re: [whatwg/encoding] Should not prepend gb18030 second and third when range decode fails upon fourth (#110)

@jungshik any thoughts with regards to ICU?

For four-byte sequences we're dealing with this byte pattern:

1. Non-ASCII
2. ASCII digit
3. Non-ASCII
4. ASCII digit

Now, if 4 is not an ASCII digit, it sounds like @hsivonen is suggesting it's reasonable to unwind: emit U+FFFD for 1 and reprocess from 2.
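
To make the unwinding concrete, here is a rough sketch of that case. The function name and the (emitted text, bytes to reprocess) return shape are made up for illustration; this is not how the spec or any decoder expresses it.

```python
REPLACEMENT = "\uFFFD"


def is_ascii_digit(b: int) -> bool:
    """True for the bytes 0x30-0x39 ('0'-'9')."""
    return 0x30 <= b <= 0x39


def unwind_bad_fourth(seq: bytes) -> tuple[str, bytes]:
    """Hypothetical helper: handle a four-byte candidate whose fourth
    byte is not an ASCII digit.

    Returns the text to emit now plus the bytes to push back onto the
    input so that decoding restarts at byte 2.
    """
    if len(seq) != 4:
        raise ValueError("expected exactly four bytes")
    b1, b2, b3, b4 = seq
    if is_ascii_digit(b4):
        raise ValueError("fourth byte is an ASCII digit; not this case")
    # Only byte 1 is replaced; bytes 2, 3 and 4 are decoded again.
    return REPLACEMENT, bytes([b2, b3, b4])
```

For example, `unwind_bad_fourth(bytes([0x81, 0x30, 0x81, 0x41]))` gives one U+FFFD and hands `30 81 41` back for reprocessing, so the digit in position 2 survives.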

If 4 is an ASCII digit, but the sequence maps to no code point, the question is what to do then:

1. U+FFFD 1-2-3 and create a theoretical issue for formats where ASCII digits are delimiters of sorts.
2. U+FFFD 1 and 3, and emit 2 and 4 (in their original order).
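
Roughly, the two options compare as in the sketch below. `handle_unmapped` is a made-up name, and the sketch does not check the mapping itself; it only shows what each option would output for a well-formed four-byte sequence with no code point. Option 1 as written above does not say what happens to byte 4, so the sketch assumes it is passed through as the digit it is; that part is a guess.

```python
REPLACEMENT = "\uFFFD"


def handle_unmapped(seq: bytes, option: int) -> str:
    """Hypothetical helper: output for a four-byte sequence of the right
    shape (non-ASCII, digit, non-ASCII, digit) that maps to no code point.
    """
    if len(seq) != 4:
        raise ValueError("expected exactly four bytes")
    b1, b2, b3, b4 = seq
    if option == 1:
        # One U+FFFD covers bytes 1-3, so the digit in position 2 is lost.
        # ASSUMPTION: byte 4 is emitted unchanged; the text does not say.
        return REPLACEMENT + chr(b4)
    if option == 2:
        # Bytes 1 and 3 each become U+FFFD; digits 2 and 4 are kept
        # in their original positions.
        return REPLACEMENT + chr(b2) + REPLACEMENT + chr(b4)
    raise ValueError("option must be 1 or 2")
```

Taking `84 32 81 30` as the input (which I believe falls in an unmapped gap, but treat that as an assumption), option 1 keeps only the trailing `0`, while option 2 keeps both `2` and `0` with a U+FFFD before each.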

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/110#issuecomment-300745721

Received on Thursday, 11 May 2017 10:11:44 UTC