Re: [whatwg/encoding] Make TextEncoder and TextDecoder be transform streams (#127)

> Since it's a common enough request, would it be possible to include an example in the spec itself showing how to use streams to encode into an existing ArrayBuffer?

Sadly, bring-your-own-buffer style usage isn't supported by TransformStream yet. There's an open question about what to do when there isn't enough space in the supplied ArrayBuffer to fit the transformed input.

Since this is basically an optimisation feature, I don't plan to support it in the first release. I would rather take my time to get it right.

> Can the tests be upgraded to use async/await?

Certainly. I'm never quite sure what version of ES to target in tests. For the Streams Standard tests I've been avoiding async/await, but I'm happy to use it here.

> In the tests, stream-properties.html should probably just be an idlharness test. (I don't know how well that plays with p(r)ollyfills, though.)

I haven't tried it. I will give it a go.

> What's the behavior for split surrogate pairs when encoding? We basically pretend that's not a problem in the non-streaming API, by making the input a USVString and thus not needing a {stream} flag. In other words, we assume that the input is always a complete text, not just a randomly partitioned sequence of code units. In this patch, the text reads "Let input be the result of converting chunk to a USVString...", which sounds like it might introduce problems when chunk boundaries fall in the middle of a surrogate pair. This worries me. Also, needs test cases.

Yes, it's broken. I've been planning to write a test for this at some point. What I expect to see is that if you encode "😻" in a single chunk, it works as intended. If you split it between its two surrogate code units and write it as two chunks, you get two output chunks, each containing U+FFFD encoded as UTF-8.
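To make the expected breakage concrete, here is a minimal sketch rather than the patch's actual code. It assumes that calling the non-streaming `TextEncoder.prototype.encode()` once per chunk is a fair stand-in for "converting chunk to a USVString" per chunk; the variable names and the `toHex` helper are made up for illustration.

```javascript
// Hypothetical illustration: encode() converts its argument to a USVString,
// so calling it once per chunk mimics a per-chunk USVString conversion.
const perChunkEncoder = new TextEncoder();

// "😻" is U+1F63B, i.e. the surrogate pair U+D83D U+DE3B in UTF-16.
const whole = perChunkEncoder.encode("\uD83D\uDE3B"); // one chunk
const firstHalf = perChunkEncoder.encode("\uD83D");   // lead surrogate alone
const secondHalf = perChunkEncoder.encode("\uDE3B");  // trail surrogate alone

// Helper (not part of any API) to print bytes as hex.
const toHex = (bytes) =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");

console.log(toHex(whole));      // f0 9f 98 bb
console.log(toHex(firstHalf));  // ef bf bd (U+FFFD)
console.log(toHex(secondHalf)); // ef bf bd (U+FFFD)
```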

If there is consensus that this should be fixed, I think it can be done by converting to DOMString rather than USVString and special-casing the final code unit when it is the first half of a surrogate pair.
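A rough sketch of what that special-casing could look like, assuming the held-back code unit is simply prepended to the next chunk. This is not spec text; the class name `SurrogateBufferingEncoder` and its methods are hypothetical, and only `TextEncoder` is a real API here.

```javascript
// Hypothetical sketch: treat each chunk as a DOMString (raw code units) and
// hold back a trailing lead surrogate until the next chunk arrives.
class SurrogateBufferingEncoder {
  constructor() {
    this.encoder = new TextEncoder();
    this.pending = ""; // at most one lead surrogate carried between chunks
  }

  // Encode one chunk, deferring a trailing lead surrogate if present.
  encodeChunk(chunk) {
    let input = this.pending + String(chunk);
    this.pending = "";
    if (input.length > 0) {
      const last = input.charCodeAt(input.length - 1);
      if (last >= 0xd800 && last <= 0xdbff) {
        // Final code unit is the first half of a surrogate pair: keep it.
        this.pending = input.slice(-1);
        input = input.slice(0, -1);
      }
    }
    // Any other unpaired surrogate is replaced with U+FFFD by encode().
    return this.encoder.encode(input);
  }

  // At end of stream, a still-pending lead surrogate becomes U+FFFD.
  flush() {
    const out = this.encoder.encode(this.pending);
    this.pending = "";
    return out;
  }
}

const bufferingEncoder = new SurrogateBufferingEncoder();
const out1 = bufferingEncoder.encodeChunk("\uD83D"); // empty: held back
const out2 = bufferingEncoder.encodeChunk("\uDE3B"); // f0 9f 98 bb
const tail = bufferingEncoder.flush();               // empty: nothing pending
```

In a transform stream, `encodeChunk()` and `flush()` would presumably map onto the transformer's `transform()` and `flush()` hooks, though whether to enqueue empty chunks is a separate question.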


https://github.com/whatwg/encoding/pull/127#issuecomment-350994136

Received on Tuesday, 12 December 2017 09:27:39 UTC