Re: [whatwg/encoding] Add TextEncoderStream and TextDecoderStream transform streams (#149)

MattiasBuelens requested changes on this pull request.

I think I found a few issues. Quick overview:
* The documentation for `TextDecoderStream.writable` claims that you can enqueue `BufferSource` chunks, but the spec text only allows `ArrayBuffer`s. I believe the spec text should be updated.
* We should avoid enqueuing empty strings (in `TextDecoderStream`) or empty `Uint8Array`s (in `TextEncoderStream`).

> + <dd><p>Returns true if <a for=TextDecoderAttributes>error mode</a> is "<code>fatal</code>", and
+ false otherwise.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderAttributes>ignoreBOM</a></code>
+ <dd><p>Returns true if <a for=TextDecoderAttributes>ignore BOM flag</a> is set, and false
+ otherwise.
+
+ <dt><code><var>decoder</var> . <a attribute for=GenericTransformStream>readable</a></code>
+ <dd>
+  <p>Returns a <a>readable stream</a> whose <a>chunks</a> are strings resulting from running
+  <a for=TextDecoderAttributes>encoding</a>'s <a for=/>decoder</a> on the chunks written to
+  {{GenericTransformStream/writable}}.
+
+ <dt><code><var>decoder</var> . <a attribute for=GenericTransformStream>writable</a></code>
+ <dd>
+  <p>Returns a <a>writable stream</a> which accepts {{BufferSource}} chunks and runs them through

A [`BufferSource`](https://heycam.github.io/webidl/#BufferSource) is a `ArrayBufferView` or an `ArrayBuffer`. However, step 1 of [_decode and enqueue a chunk_](https://whatpr.org/encoding/149.html#concept-tds-decode-and-enqueue) only accepts non-detached non-shared `ArrayBuffer`s:
> 1. If Type(chunk) is not Object, or chunk does not have an [[ArrayBufferData]] internal slot, or IsDetachedBuffer(chunk) is true, or IsSharedArrayBuffer(chunk) is true, then return a new promise rejected with a TypeError exception.

I believe this step should also accept an `ArrayBufferView` chunk that has a non-detached non-shared backing `ArrayBuffer`.

> + <li><p>Let <var>output</var> be a new <a for=/>stream</a>.
+
+ <li><p>Let <var>result</var> be the result of <a>processing</a> <a>end-of-stream</a> for
+ <var>dec</var>'s <a for=TextDecoderStream>decoder</a> and <var>dec</var>'s
+ <a for=TextDecoderStream>stream</a>, <var>output</var>, and <var>dec</var>'s
+ <a for=TextDecoderAttributes>error mode</a>.
+
+ <li><p>If <var>result</var> is <a>finished</a>, then run these steps:
+ <ol>
+  <li><p>Let <var>outputChunk</var> be <var>output</var>,
+  <a lt="serialize stream" for=TextDecoderStream>serialized</a>.
+
+  <li><p>Let <var>controller</var> be <var>dec</var>'s
+  <a for=GenericTransformStream>transform</a>.\[[transformStreamController]].
+
+  <li><p>Call <a abstract-op>TransformStreamDefaultControllerEnqueue</a>(<var>controller</var>,

Same as above, skip this step if `outputChunk` is the empty string. In this case, [the prollyfill does handle it correctly](https://github.com/GoogleChromeLabs/text-encode-transform-prollyfill/blob/f5fbf8caa1f9d8979ef2bd6d060f004dd5435247/text-encode-transform.js#L128).

> + <li><p>Let <var>output</var> be a new <a for=/>stream</a>.
+
+ <li>
+  <p>While true, run these steps:
+
+  <ol>
+   <li><p>Let <var>token</var> be the result of <a>reading</a> from <var>dec</var>'s
+   <a for=TextDecoderStream>stream</a>.
+
+   <li>
+    <p>If <var>token</var> is <a>end-of-stream</a>, then run these steps:
+    <ol>
+     <li><p>Let <var>outputChunk</var> be <var>output</var>,
+     <a lt="serialize stream" for=TextDecoderStream>serialized</a>.
+
+     <li><p>Call <a abstract-op>TransformStreamDefaultControllerEnqueue</a>(<var>controller</var>,

We should skip this step if `outputChunk` is the empty string. For example, this could happen if `chunk` is empty, or if the decoder is in the middle of a multi-byte character.

> + <li><p>Let <var>controller</var> be <var>enc</var>'s
+ <a for=GenericTransformStream>transform</a>.\[[transformStreamController]].
+
+ <li>
+  <p>While true, run these steps:
+
+  <ol>
+   <li><p>Let <var>token</var> be the result of <a>reading</a> from <var>input</var>.
+
+   <li>
+    <p>If <var>token</var> is <a>end-of-stream</a>, then run these steps:
+
+    <ol>
+     <li><p>Convert <var>output</var> into a byte sequence.
+
+     <li><p>Let <var>chunk</var> be a {{Uint8Array}} object wrapping an {{ArrayBuffer}} containing

Once again, skip this step and the following enqueue when `output` is an empty byte sequence. For example, this could happen if `chunk` is empty, but also if `chunk` consists of one single high surrogate.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/pull/149#pullrequestreview-143380155

Received on Saturday, 4 August 2018 12:33:36 UTC