Re: [whatwg/encoding] TextDecoderStream: empty Uint8Array should result in an empty string (Issue #283)

I think this is intentional. Readable byte streams also don't allow enqueuing empty chunks, and it's likely that we'll want to make `TextDecoderStream.readable` a proper byte stream in the future:
```javascript
const rs = new ReadableStream({
  type: "bytes",
  start(controller) {
    controller.enqueue(new Uint8Array(0)); // throws a TypeError
  }
});
```
I'm a bit surprised by Deno's `LineStream` design. I would expect a transform stream that splits text by line delimiters to accept *strings* as input and produce *strings* as output. Instead, it looks like it uses raw byte chunks as both input and output?

That means that `LineStream` is making an assumption about the text encoding, right? How exactly is that supposed to deal with multi-byte text encodings like `utf-16`? For example:
```javascript
new TextDecoder("utf-16").decode(new Uint8Array([0x41, 0x00, 0x0A, 0x00, 0x42, 0x00]));
// -> "A\nB"
```
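To make the concern concrete, here is a minimal sketch of what could go wrong if a splitter looked for a raw `0x0A` byte in that UTF-16 input. The helper `splitOnLfByte` is hypothetical and written only for illustration; it is not Deno's actual `LineStream` implementation, just the naive byte-level approach that an ASCII-compatible assumption would suggest:
```javascript
// Hypothetical helper (an assumption for illustration, not Deno's code):
// split a byte chunk at every 0x0A byte, dropping the delimiter byte.
function splitOnLfByte(bytes) {
  const parts = [];
  let start = 0;
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] === 0x0a) {
      parts.push(bytes.subarray(start, i));
      start = i + 1;
    }
  }
  parts.push(bytes.subarray(start));
  return parts;
}

// "A\nB" encoded as UTF-16LE: every code unit is two bytes.
const utf16le = new Uint8Array([0x41, 0x00, 0x0a, 0x00, 0x42, 0x00]);
const parts = splitOnLfByte(utf16le);

// Only the low byte of the "\n" code unit was removed, so the second part
// starts with the leftover 0x00 and has an odd length.
const partLengths = parts.map((part) => part.length);
const secondDecoded = new TextDecoder("utf-16").decode(parts[1]);
console.log(partLengths, secondDecoded === "B");
```
The second part is no longer aligned to two-byte code units, so decoding it afterwards does not give back `"B"`.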
I would expect you to *first* run these chunks through a `TextDecoderStream`, and *then* split by line delimiters:
```javascript
const readable = new ReadableStream({
  start(controller) {
    controller.enqueue(new Uint8Array([0x41, 0x00, 0x0A, 0x00, 0x42, 0x00]));
    controller.close();
  }
});

readable
  .pipeThrough(new TextDecoderStream("utf-16"))
  .pipeThrough(new LineStream());
// -> stream with chunks "A" and "B"
```
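For illustration, a string-in, string-out line splitter along those lines could look roughly like this. The names `LineBuffer` and `createStringLineStream` are hypothetical, and the sketch assumes `"\n"` is the only delimiter (no `"\r\n"` handling):
```javascript
// Hypothetical sketch: buffers decoded text and emits complete lines.
// Assumes "\n" is the only line delimiter.
class LineBuffer {
  #pending = "";

  // Accept a string chunk and return every line completed by it.
  push(chunk) {
    const lines = (this.#pending + chunk).split("\n");
    // The last element is an unterminated line; keep it for the next chunk.
    this.#pending = lines.pop();
    return lines;
  }

  // Return the trailing unterminated line, if any, at end of input.
  flush() {
    const rest = this.#pending;
    this.#pending = "";
    return rest === "" ? [] : [rest];
  }
}

// Wrap the buffer in a TransformStream that takes and produces strings.
function createStringLineStream() {
  const buffer = new LineBuffer();
  return new TransformStream({
    transform(chunk, controller) {
      for (const line of buffer.push(chunk)) controller.enqueue(line);
    },
    flush(controller) {
      for (const line of buffer.flush()) controller.enqueue(line);
    },
  });
}
```
Because it only ever sees decoded strings, a stream like this makes no assumption about the byte encoding; it would be placed after the `TextDecoderStream` in the pipeline above, for example `readable.pipeThrough(new TextDecoderStream("utf-16")).pipeThrough(createStringLineStream())`.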

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/283#issuecomment-1056616763

Message ID: <whatwg/encoding/issues/283/1056616763@github.com>

Received on Wednesday, 2 March 2022 09:00:39 UTC