- From: Domenic Denicola <notifications@github.com>
- Date: Tue, 31 Jul 2018 09:19:35 -0700
- To: whatwg/encoding <encoding@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/encoding/pull/149/review/142027900@github.com>
domenic commented on this pull request. So sorry on the delay on this. Will be more responsive in the future. I left editorial comments about spec organization. The algorithms seem solid to me from the streams perspective, but I am not an expert on the encoding side of things. Overall this looks pretty good. Hopefully @annevk can review as well. > @@ -15,6 +15,19 @@ Translate IDs: dictdef-textdecoderoptions textdecoderoptions,dictdef-textdecodeo spec:infra; type:dfn; text:code point text:ascii case-insensitive +spec:streams; + type:interface; text:ReadableStream + type:dfn; text:chunk We should export these in streams so that this is not necessary, for the dfns at least. For the interface, that's a separate issue: https://github.com/whatwg/fetch/issues/780 so we can leave it here for now. > + +<pre class=idl> +interface mixin TextDecoderAttributes { + readonly attribute DOMString encoding; + readonly attribute boolean fatal; + readonly attribute boolean ignoreBOM; +}; +</pre> + +<p>The {{TextDecoderAttributes}} mixin defines common attributes that are shared between +{{TextDecoder}} and {{TextDecoderStream}}. These objects have an associated +<dfn for=TextDecoderAttributes>encoding</dfn>, <dfn for=TextDecoderAttributes>ignore BOM flag</dfn>, +and <dfn for=TextDecoderAttributes>error mode</dfn>. + +<p>The <dfn attribute for=TextDecoderAttributes><code>encoding</code></dfn> attribute's getter must +return <a for=TextDecoderAttributes>encoding</a>'s <a for=encoding>name</a> in <a>ASCII While moving things around maybe use the modern phrasing "this object's encoding's" or "this TextDecoderAttribute object's encoding's". Similarly for the rest. > @@ -1038,6 +1051,33 @@ function decodeArrayOfStrings(buffer, encoding) { </div> +<h3 id=interface-mixin-textdecoderattributes>Interface Mixin {{TextDecoderAttributes}}</h3> Nit: lowercase "mixin" in headings, here and below. (Headings are sentence case.) > @@ -1305,6 +1350,404 @@ must run these steps: </ol> +<h3 id=interface-mixin-generictransformstream>Interface Mixin {{GenericTransformStream}}</h3> + +<p>{{GenericTransformStream}} represents the concept of a <a>transform stream</a> in IDL. It is not +a {{TransformStream}}, though it has the same interface and it delegates to one. + +<pre class=idl> +interface mixin GenericTransformStream { + readonly attribute ReadableStream readable; + readonly attribute WritableStream writable; +}; +</pre> + +<p>An object that includes GenericTransformStream always has an associated +<dfn for=GenericTransformStream>transform</dfn>. Probably worth stating its type. > @@ -1305,6 +1350,404 @@ must run these steps: </ol> +<h3 id=interface-mixin-generictransformstream>Interface Mixin {{GenericTransformStream}}</h3> + +<p>{{GenericTransformStream}} represents the concept of a <a>transform stream</a> in IDL. It is not +a {{TransformStream}}, though it has the same interface and it delegates to one. + +<pre class=idl> +interface mixin GenericTransformStream { + readonly attribute ReadableStream readable; + readonly attribute WritableStream writable; +}; +</pre> + +<p>An object that includes GenericTransformStream always has an associated `{{GenericTransformStream}}` > + Exposed=(Window,Worker)] +interface TextDecoderStream { + USVString decode(optional BufferSource input, optional TextDecodeOptions options); +}; + +TextDecoderStream includes TextDecoderAttributes; +TextDecoderStream includes GenericTransformStream; +</pre> + +<p>A {{TextDecoderStream}} object has an associated <dfn for=TextDecoderStream>encoding</dfn>, +<dfn for=TextDecoderStream>decoder</dfn>, <dfn for=TextDecoderStream>stream</dfn>, +<dfn for=TextDecoderStream>ignore BOM flag</dfn> (initially unset), +<dfn for=TextDecoderStream>BOM seen flag</dfn> (initially unset), +<dfn for=TextDecoderStream>error mode</dfn> (initially "<code>replacement</code>"), +<dfn for=TextDecoderStream>do not flush flag</dfn> (initially unset), and +<dfn for=TextDecoderStream>transform</dfn>. For those things provided by the mixins, you should use the mixin `dfn`s, and for=TextDecoderAttributes on references to them. No need to have separate dfns/internal slots for the TextDecoder and TextDecoderStream themselves. > + +TextDecoderStream includes TextDecoderAttributes; +TextDecoderStream includes GenericTransformStream; +</pre> + +<p>A {{TextDecoderStream}} object has an associated <dfn for=TextDecoderStream>encoding</dfn>, +<dfn for=TextDecoderStream>decoder</dfn>, <dfn for=TextDecoderStream>stream</dfn>, +<dfn for=TextDecoderStream>ignore BOM flag</dfn> (initially unset), +<dfn for=TextDecoderStream>BOM seen flag</dfn> (initially unset), +<dfn for=TextDecoderStream>error mode</dfn> (initially "<code>replacement</code>"), +<dfn for=TextDecoderStream>do not flush flag</dfn> (initially unset), and +<dfn for=TextDecoderStream>transform</dfn>. + +<p>A {{TextDecoderStream}} object also has an associated <dfn id=concept-tds-serialize +for=TextDecoderStream>serialize stream</dfn> algorithm, that is identical to the <a +for=TextDecoder>serialize stream</a> algorithm for the equivalent {{TextDecoder}}. Can this move to TextDecoderAttributes since it's shared? Not sure myself. > + <dt><code><var>decoder</var> . <a attribute for=GenericTransformStream>readable</a></code> + <dd> + <p>Returns a <a>readable stream</a> whose <a>chunks</a> are strings resulting from running + <a for=TextDecoderStream>encoding</a>'s <a for=/>decoder</a> on the chunks written to + {{GenericTransformStream/writable}}. + + <dt><code><var>decoder</var> . <a attribute for=GenericTransformStream>writable</a></code> + <dd> + <p>Returns a <a>writable stream</a> which accepts {{BufferSource}} chunks and runs them through + <a for=TextDecoderStream>encoding</a>'s <a for=/>decoder</a> before making them available to + {{GenericTransformStream/readable}}. + + <p>Typically this will be used via the {{ReadableStream/pipeThrough()}} method on a + {{ReadableStream}} source. + + <pre class=example id=example-textdecoderstream-writable><code class=lang-javascript> FWIW in Bikeshed you can use any indentation/whitespacing you want. > + <li><p>Set <var>dec</var>'s <a for=TextDecoderStream>transform</a> to <var>transform</var>. + + <li><p>Return <var>dec</var>. +</ol> + +<p>The <dfn id=concept-tds-decode-and-enqueue>decode and enqueue a chunk</dfn> algorithm, given a +{{TextDecoderStream}} <var>dec</var> and a <var>chunk</var>, runs these steps: + +<ol> + <li><p>If <a>Type</a>(<var>chunk</var>) is not Object, or <var>chunk</var> does not have an + \[[ArrayBufferData]] internal slot, or <a abstract-op>IsDetachedBuffer</a>(<var>chunk</var>) is + true, or <a abstract-op>IsSharedArrayBuffer</a>(<var>chunk</var>) is true, then return a new + promise rejected with a {{TypeError}} exception. + + <li><p><a>Push</a> a <a lt="get a copy of the buffer source">copy of</a> <var>chunk</var> to + <var>dec</var>'s <a for=TextDecoderStream>stream</a>. Hmm one day we should do a transfer variant, of this and all other web platform APIs... > + Exposed=(Window,Worker)] +interface TextEncoderStream { +}; + +TextEncoderStream includes TextEncoderAttributes; +TextEncoderStream includes GenericTransformStream; +</pre> + +<p>A {{TextEncoderStream}} object has an associated <dfn for=TextEncoderStream>encoder</dfn>, +<dfn for=TextEncoderStream>transform</dfn> and <dfn for=TextEncoderStream>pending high +surrogate</dfn> (initially null). + +<p class="note no-backref">A {{TextEncoderStream}} object offers no <var>label</var> argument as it +only supports <a>UTF-8</a>. + +<hr> Not so sure about this `<hr>` > + <li><p><a>Prepend</a> <var>token</var> to <var>input</var>. + + <li><p>Return U+FFFD. + </ol> + + <li><p>If <var>token</var> is in the range U+D800 to U+DBFF, inclusive, then set <a>pending high + surrogate</a> to <var>token</var> and return <a>continue</a>. + + <li><p>If <var>token</var> is in the range U+DC00 to U+DFFF, inclusive, then return U+FFFD. + + <li><p>Return <var>token</var>. +</ol> + +<p class=note>This is equivalent to the "<a>convert</a> a <a>JavaScript string</a> into a <a>scalar +value string</a>" algorithm from the Infra Standard, but allows for surrogate pairs that are split +between strings. Add `[[INFRA]]` for a nice stylistic flourish -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/whatwg/encoding/pull/149#pullrequestreview-142027900
Received on Tuesday, 31 July 2018 16:20:02 UTC