[whatwg/encoding] TextDecoder's serialize stream algorithm seems unnecessarily convoluted. (#154)

It appears to me that some algorithms in the spec are unnecessarily convoluted. One in particular is [TextDecoder's serialize stream algorithm](https://encoding.spec.whatwg.org/commit-snapshots/6f9a41f3d9dbc7ba6d88f65f7ef1c139fb08d4be/#concept-td-serialize):

> 1. Let output be the empty string.
> 2. While true:
>    1. Let token be the result of reading from stream.
>    2. If encoding is UTF-8, UTF-16BE, or UTF-16LE, and ignore BOM flag and BOM seen flag are unset, then:
>       1. If token is U+FEFF, then set BOM seen flag.
>       2. Otherwise, if token is not end-of-stream, then set BOM seen flag and append token to output.
>       3. Otherwise, return output.
>    3. Otherwise, if token is not end-of-stream, then append token to output.
>    4. Otherwise, return output.

Unless I'm mistaken, it could be simplified to:

> 1. Let output be the empty string.
> 2. While true:
>    1. Let token be the result of reading from stream.
>    2. If token is end-of-stream, return output.
>    3. If encoding is UTF-8, UTF-16BE, or UTF-16LE, and ignore BOM flag and BOM seen flag are unset, then:
>       1. Set BOM seen flag.
>       2. If token is U+FEFF, then continue.
>    4. Append token to output.

...because the "BOM seen flag" will only ever be set if the token is *not* "end-of-stream", so the end-of-stream check can be hoisted above the BOM check without changing behavior.
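
To sanity-check the equivalence, here is a rough Python sketch of both versions, compared exhaustively over short token streams. The `Decoder` class, the `read` helper, and the use of `None` for end-of-stream are my own stand-ins for illustration, not anything defined by the spec.

```python
from itertools import product

BOM = "\ufeff"
END_OF_STREAM = None  # assumption: None stands in for the spec's end-of-stream
BOM_ENCODINGS = {"UTF-8", "UTF-16BE", "UTF-16LE"}


class Decoder:
    """Hypothetical holder for the state the algorithm reads and writes."""

    def __init__(self, encoding, ignore_bom=False, bom_seen=False):
        self.encoding = encoding
        self.ignore_bom = ignore_bom
        self.bom_seen = bom_seen

    def should_check_bom(self):
        return (self.encoding in BOM_ENCODINGS
                and not self.ignore_bom and not self.bom_seen)


def read(stream):
    # Reading from an exhausted stream yields end-of-stream.
    return stream.pop(0) if stream else END_OF_STREAM


def serialize_current(dec, stream):
    # Steps as currently written in the spec snapshot.
    output = ""
    while True:
        token = read(stream)
        if dec.should_check_bom():
            if token == BOM:
                dec.bom_seen = True
            elif token is not END_OF_STREAM:
                dec.bom_seen = True
                output += token
            else:
                return output
        elif token is not END_OF_STREAM:
            output += token
        else:
            return output


def serialize_simplified(dec, stream):
    # Proposed steps: end-of-stream is handled once, up front.
    output = ""
    while True:
        token = read(stream)
        if token is END_OF_STREAM:
            return output
        if dec.should_check_bom():
            dec.bom_seen = True
            if token == BOM:
                continue
        output += token


def equivalent(max_len=4):
    # Compare output and final flag state for every combination of
    # encoding, flags, and token stream up to max_len tokens.
    encodings = ["UTF-8", "UTF-16LE", "windows-1252"]
    for enc, ignore, seen in product(encodings, [False, True], [False, True]):
        for n in range(max_len + 1):
            for tokens in product([BOM, "a"], repeat=n):
                a = Decoder(enc, ignore, seen)
                b = Decoder(enc, ignore, seen)
                out_a = serialize_current(a, list(tokens))
                out_b = serialize_simplified(b, list(tokens))
                if (out_a, a.bom_seen) != (out_b, b.bom_seen):
                    return False
    return True
```

Calling `equivalent()` compares both the returned string and the final "BOM seen flag" state, since the flag persists across calls when streaming. This is only a brute-force check over a small alphabet, not a proof.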

-- 
https://github.com/whatwg/encoding/issues/154

Received on Thursday, 30 August 2018 10:23:18 UTC