Re: [whatwg/encoding] Amount of bytes to sniff for encoding detection (#102)

> We reparse if needed with the detected encoding for this and never run the detector on any further bytes.

Earlier in the sentence you said you wouldn't feed things into the parser, so this wouldn't apply, right?

> I think you meant to say UTF-7?

No, UTF-7 must obviously not be detected. It's not an encoding to begin with per the Encoding Standard.
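
As a rough illustration of "not an encoding to begin with", here is a minimal sketch. It assumes a runtime whose `TextDecoder` follows the Encoding Standard (a current browser or Node.js, for example); `isSupportedLabel` is a hypothetical helper name, not something from the standard.

```typescript
// Hypothetical helper: reports whether TextDecoder accepts a given label.
// Per the Encoding Standard, the TextDecoder constructor throws a
// RangeError when the label does not map to an encoding.
function isSupportedLabel(label: string): boolean {
  try {
    new TextDecoder(label);
    return true;
  } catch (e) {
    if (e instanceof RangeError) {
      return false; // label is unknown to the Encoding Standard
    }
    throw e; // anything else is unexpected, so surface it
  }
}

// "utf-7" is not a label in the Encoding Standard; "utf-8" is.
const utf7Known = isSupportedLabel("utf-7");
const utf8Known = isSupportedLabel("utf-8");
```

A detector that returned UTF-7 would therefore name something the rest of the platform cannot decode.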


It's still unclear to me what motivated the CED detector, as other browsers don't have anything similarly complex. Also, looking through the code of the CED detector, it doesn't seem aligned with the Encoding Standard at all: https://github.com/google/compact_enc_det/blob/master/util/encodings/encodings.cc lists very different encodings and labels.

-- 
https://github.com/whatwg/encoding/issues/102#issuecomment-302852325

Received on Saturday, 20 May 2017 05:33:43 UTC