- From: aphillips <notifications@github.com>
- Date: Tue, 06 Sep 2016 10:11:05 -0700
- To: whatwg/encoding <encoding@noreply.github.com>
Received on Tuesday, 6 September 2016 17:33:55 UTC
The GB18030 mapping is naturally fungible wrt PUA characters, since Unicode continues to encode Chinese code points. I think this should be recognized by Encoding.

I agree that we should not remove the mapping of Unicode PUA -> GB18030 (for compatibility). But the problem here is round-tripping of real Unicode code points through GB18030. If I have a U+20087, convert it to GB18030, and then later reserialize the GB data as UTF-8, I will get back U+E816 rather than the original (and correct) code point. That's undesirable and a loss of information. The fact that existing implementations haven't caught up with standardization doesn't mean that we shouldn't make this change.

@annevk Under what circumstances would we change? One of the problems with establishing a standard is that implementations are trying hard to be compliant with it...

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/27#issuecomment-245020644
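To make the round-trip concern in the comment concrete, here is a minimal sketch using toy one-entry tables. It does not call any real GB18030 codec. The byte sequence 0xFE 0x51 and the idea that the encoder and decoder follow different table versions are assumptions made only for illustration; the code points U+20087 and U+E816 come from the comment itself.

```python
# Toy tables with a single entry each; NOT the real GB18030 index.
# Assumption: the encoder maps the real code point to a two-byte sequence,
# while the decoder still maps that same sequence to a PUA code point.
ENCODE_TABLE = {0x20087: b"\xfe\x51"}  # hypothetical encoder entry
DECODE_TABLE = {b"\xfe\x51": 0xE816}   # hypothetical decoder entry (PUA)


def round_trip(code_point: int) -> int:
    """Encode a code point to GB bytes, then decode those bytes back."""
    gb_bytes = ENCODE_TABLE[code_point]
    return DECODE_TABLE[gb_bytes]


original = 0x20087
result = round_trip(original)

# Reserializing as UTF-8 now carries the PUA code point, not the original.
utf8_out = chr(result).encode("utf-8")

print(f"in:  U+{original:04X}")
print(f"out: U+{result:04X}")
print("lossless:", original == result)
```

With tables shaped like this, the value that comes back differs from the value that went in, which is the information loss described above. A mapping in which both directions agree on the same code point would round-trip cleanly.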