Re: [whatwg/encoding] Consider adding TextEncoder.containsLoneSurrogates() static (#174)

> @RReverser I was thinking of a different suggestion instead: adding fatal option to TextEncoder.

I think that would be a good thing to have, but that's orthogonal to this, because a fatal option will not help us.

We want to completely *ignore* strings which contain unpaired surrogates, so we definitely don't want runtime exceptions!
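
For illustration, here is roughly what the userland check looks like today. This is only a sketch: `containsLoneSurrogates` here is a plain function standing in for the proposed static, and `encodeValidOnly` is a hypothetical helper name, not anything from the Encoding Standard.

```typescript
// Userland stand-in for the proposed check (assumption: same semantics as
// the suggested TextEncoder.containsLoneSurrogates(), which does not exist).
function containsLoneSurrogates(input: string): boolean {
  for (let i = 0; i < input.length; i++) {
    const unit = input.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: only valid if a low surrogate follows immediately.
      const next = i + 1 < input.length ? input.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) return true;
      i++; // skip the low half of the valid pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate with no preceding high surrogate.
      return true;
    }
  }
  return false;
}

// Hypothetical helper: silently skip invalid strings instead of throwing.
function encodeValidOnly(strings: string[]): Uint8Array[] {
  const encoder = new TextEncoder();
  return strings
    .filter((s) => !containsLoneSurrogates(s))
    .map((s) => encoder.encode(s));
}
```

Note that no exception is involved anywhere; invalid strings are simply dropped.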

> Otherwise with something like containsLoneSurrogates one would have to go through the string twice, which might have undesirable performance effects on large strings.

Indeed, I agree that is very unfortunate.

But for our use case we still wouldn't want `fatal`. Instead, we would want something else, such as an API that returns the encoded `Uint8Array` if the string is valid, or `null` if it contains unpaired surrogates. That would avoid the double iteration.
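
A minimal sketch of the shape I have in mind, written as a hand-rolled single-pass encoder. The name `encodeOrNull` is made up for illustration and is not a proposed spelling; a native implementation would presumably do the same thing inside `TextEncoder` itself.

```typescript
// Hypothetical API shape: UTF-8 bytes for a valid string, null otherwise.
// The string is walked exactly once; validation and encoding are fused.
function encodeOrNull(input: string): Uint8Array | null {
  // Worst case is 3 bytes per UTF-16 code unit (a surrogate pair is
  // 2 units and encodes to 4 bytes, so it stays under the bound).
  const buffer = new Uint8Array(input.length * 3);
  let written = 0;
  for (let i = 0; i < input.length; i++) {
    let codePoint = input.charCodeAt(i);
    if (codePoint >= 0xd800 && codePoint <= 0xdbff) {
      // High surrogate: must be followed by a low surrogate.
      const next = i + 1 < input.length ? input.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) return null;
      codePoint = 0x10000 + ((codePoint - 0xd800) << 10) + (next - 0xdc00);
      i++;
    } else if (codePoint >= 0xdc00 && codePoint <= 0xdfff) {
      return null; // low surrogate without a preceding high surrogate
    }
    if (codePoint < 0x80) {
      buffer[written++] = codePoint;
    } else if (codePoint < 0x800) {
      buffer[written++] = 0xc0 | (codePoint >> 6);
      buffer[written++] = 0x80 | (codePoint & 0x3f);
    } else if (codePoint < 0x10000) {
      buffer[written++] = 0xe0 | (codePoint >> 12);
      buffer[written++] = 0x80 | ((codePoint >> 6) & 0x3f);
      buffer[written++] = 0x80 | (codePoint & 0x3f);
    } else {
      buffer[written++] = 0xf0 | (codePoint >> 18);
      buffer[written++] = 0x80 | ((codePoint >> 12) & 0x3f);
      buffer[written++] = 0x80 | ((codePoint >> 6) & 0x3f);
      buffer[written++] = 0x80 | (codePoint & 0x3f);
    }
  }
  return buffer.slice(0, written);
}
```

The caller then only needs a `null` check, with no `try`/`catch` and no separate validation pass.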

----

> @hsivonen So far, experience with the Servo style engine in particular suggests that the Web, despite Hyrum's Law generally ruling everything, is remarkably free of relying on unpaired surrogates.

Web code may not rely on unpaired surrogates per se, but it definitely relies on the ability of invalid JS strings to round-trip correctly (e.g. the double `input` event bug).

Unfortunately, when encoding to UTF-8 you lose the ability to round-trip, because `TextEncoder` replaces each unpaired surrogate with U+FFFD.
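
A quick way to see the loss, using only the existing `TextEncoder` and `TextDecoder`. The helper name `roundTripsThroughUtf8` is mine, purely for illustration.

```typescript
// Returns true when encoding to UTF-8 and decoding back yields the
// original string. ignoreBOM keeps a leading U+FEFF from being stripped,
// so the only mismatches left are the ones caused by lone surrogates.
function roundTripsThroughUtf8(input: string): boolean {
  const bytes = new TextEncoder().encode(input);
  const decoded = new TextDecoder("utf-8", { ignoreBOM: true }).decode(bytes);
  return decoded === input;
}

// A lone surrogate is encoded as U+FFFD, so the original is gone:
const lossy = new TextDecoder().decode(new TextEncoder().encode("\ud800"));
```

Here `lossy` ends up as the replacement character, and nothing on the decoded side lets you recover which surrogate was there originally.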

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/174#issuecomment-479223616

Received on Tuesday, 2 April 2019 22:01:11 UTC