- From: Mattias Buelens <notifications@github.com>
- Date: Wed, 02 Mar 2022 01:00:26 -0800
- To: whatwg/encoding <encoding@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/encoding/issues/283/1056616763@github.com>
I think this is intentional. Readable byte streams also don't allow enqueuing empty chunks, and it's likely that we'll want to make `TextDecoderStream.readable` a proper byte stream in the future:

```javascript
const rs = new ReadableStream({
  type: "bytes",
  start(controller) {
    controller.enqueue(new Uint8Array(0)); // throws
  }
});
```

I'm a bit surprised by Deno's `LineStream` design. I would expect that a transform stream that splits text by line delimiters would accept *strings* as input and produce *strings* as output. Instead, it looks like it uses raw byte chunks as both input and output? That means that `LineStream` is making an assumption about the text encoding, right? How exactly is that supposed to deal with multi-byte text encodings like `utf-16`? For example:

```javascript
new TextDecoder("utf-16").decode(new Uint8Array([0x41, 0x00, 0x0A, 0x00, 0x42, 0x00]));
// -> "A\nB"
```

I would expect you to *first* run these chunks through a `TextDecoderStream`, and *then* split by line delimiters:

```javascript
const readable = new ReadableStream({
  start(controller) {
    controller.enqueue(new Uint8Array([0x41, 0x00, 0x0A, 0x00, 0x42, 0x00]));
    controller.close();
  }
});
readable
  .pipeThrough(new TextDecoderStream("utf-16"))
  .pipeThrough(new LineStream());
// -> stream with chunks "A" and "B"
```
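For illustration, here is a minimal sketch of what a strings-in, strings-out line splitter could look like. The names `LineSplitter` and `createStringLineStream` are hypothetical and are not Deno's `LineStream`; the sketch assumes `\n` or `\r\n` delimiters and a runtime that provides a global `TransformStream`.

```javascript
// Hypothetical helper (not Deno's LineStream): buffers already-decoded text
// and returns complete lines, keeping any trailing partial line for later.
class LineSplitter {
  constructor() {
    this.buffer = "";
  }

  // Accepts a string chunk and returns the lines that it completes.
  push(chunk) {
    this.buffer += chunk;
    const parts = this.buffer.split("\n");
    // The last element is an unfinished line (possibly ""); keep it buffered.
    this.buffer = parts.pop();
    // Strip a trailing "\r" so that "\r\n" delimiters are handled too.
    return parts.map((line) => (line.endsWith("\r") ? line.slice(0, -1) : line));
  }

  // Returns the final unterminated line, if any, and resets the buffer.
  flush() {
    const rest = this.buffer;
    this.buffer = "";
    return rest.length > 0 ? [rest] : [];
  }
}

// Wraps the splitter in a TransformStream from strings to strings.
function createStringLineStream() {
  const splitter = new LineSplitter();
  return new TransformStream({
    transform(chunk, controller) {
      for (const line of splitter.push(chunk)) controller.enqueue(line);
    },
    flush(controller) {
      for (const line of splitter.flush()) controller.enqueue(line);
    },
  });
}
```

Because this splitter only ever sees strings, it makes no assumption about the byte encoding: something like `readable.pipeThrough(new TextDecoderStream("utf-16")).pipeThrough(createStringLineStream())` would leave decoding entirely to the `TextDecoderStream`.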
Received on Wednesday, 2 March 2022 09:00:39 UTC