- From: Pauan <notifications@github.com>
- Date: Mon, 08 Apr 2019 10:00:20 -0700
- To: whatwg/encoding <encoding@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/encoding/issues/174/480915431@github.com>
> JavaScript doesn't claim UTF-16 compatibility though, so it's not really a bug, but rather part of the language

I am very aware of that. That doesn't change the fact that breaking UTF-16 is a *really* bad idea, which is why I said that it is (in my opinion) a bug.

And as @hsivonen has said, based on Firefox's experience the web generally doesn't generate or interact with unpaired surrogates (since pretty much every JS API never produces unpaired surrogates).

So even though *technically* JS isn't UTF-16, in practice it is, because nobody actually generates unpaired surrogates.

> Here the input is perfectly valid UTF-8 / UTF-16

I think that's debatable. Sure, from the perspective of the consumer, before they run `JSON.parse` it appears to be valid UTF-16.

But from the perspective of the producer, they had a string which was invalid UTF-16, and then they called `JSON.stringify` (or similar) on it, and sent it to the consumer.

So I would say that that is a bug in the producer, since they should have never created an invalid string in the first place.

Basically, except in contrived examples, *somebody* messed up and generated an invalid string. And so it's their responsibility to fix that. So I'm asking for non-contrived examples of where unpaired surrogates were generated.

> Hence my suggestion to enhance TextEncoder to check for lone surrogates automatically and throw the error before even crossing the boundary or losing the data.

Like I said earlier, I think that's a good idea, but it's orthogonal to `TextEncoder.containsLoneSurrogate`, so I think you should advocate for that in a new issue.

--
View it on GitHub:
https://github.com/whatwg/encoding/issues/174#issuecomment-480915431
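For reference, the round trip discussed in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: `containsLoneSurrogate` here is a hypothetical standalone helper written for this example, not the proposed `TextEncoder.containsLoneSurrogate` API, and the `wire` string stands in for whatever a producer might have sent.

```javascript
// Hypothetical helper for illustration only; it is not part of the
// Encoding Standard or of any shipped TextEncoder implementation.
function containsLoneSurrogate(str) {
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: it must be followed by a low surrogate.
      const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the second half of a valid pair
      } else {
        return true;
      }
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate that was not preceded by a high surrogate.
      return true;
    }
  }
  return false;
}

// Assumed payload: the JSON text is plain ASCII (a \ud800 escape), so it
// looks perfectly valid to the consumer before parsing.
const wire = '"\\ud800"';

// Parsing recreates the unpaired surrogate on the consumer side.
const parsed = JSON.parse(wire);

// TextEncoder does not throw; it encodes U+FFFD in place of the
// unpaired surrogate, so the original code unit is lost.
const bytes = Array.from(new TextEncoder().encode(parsed));
```

With a check like this, a consumer could detect the problem after `JSON.parse` and before encoding, instead of silently ending up with U+FFFD.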
Received on Monday, 8 April 2019 17:00:42 UTC