[whatwg/encoding] Add a BOM sniffing hook for better integration with HTML (#203)

This change moves the BOM-splitting part of the decode hook into a separate hook that does not consume any bytes of the token stream.

This will allow fixing a long-standing issue in the HTML encoding sniffing algorithm, where the document's character encoding is set to the wrong value when a BOM is present (see whatwg/html#1077).
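
As a rough illustration of a sniffing hook that does not consume input, here is a minimal Python sketch. It is not the spec text from this PR: the function name `bom_sniff`, the use of `io.BufferedReader`, and the returned labels are assumptions made for the example. It peeks at up to three leading bytes and reports which encoding a BOM would imply, leaving the stream position untouched.

```python
import io
from typing import Optional

# Byte order marks checked by this sketch (hypothetical table, not spec text).
_BOMS = (
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xfe\xff", "UTF-16BE"),
    (b"\xff\xfe", "UTF-16LE"),
)


def bom_sniff(stream: io.BufferedReader) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None.

    Hypothetical helper: it only peeks, so no bytes are consumed.
    Assumes it is called at the start of the stream, before any reads.
    """
    # peek() returns buffered bytes without advancing the position;
    # it may return more or fewer than requested, so slice to 3.
    prefix = stream.peek(3)[:3]
    for bom, name in _BOMS:
        if prefix.startswith(bom):
            return name
    return None
```

Because nothing is consumed, a caller (for example, an HTML-style encoding sniffer) could record the sniffed encoding first and still hand the complete byte stream, BOM included, to a separate decode step.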

Closes #128.


You can view, comment on, or merge this pull request online at:

  https://github.com/whatwg/encoding/pull/203

-- Commit Summary --

  * Add a BOM sniffing hook for better integration with HTML

-- File Changes --

    M encoding.bs (67)

-- Patch Links --

https://github.com/whatwg/encoding/pull/203.patch
https://github.com/whatwg/encoding/pull/203.diff


Received on Monday, 16 March 2020 10:26:27 UTC