[XHR] UTF-16 - do content sniffing or not?

Hi,
I've just added a test loading UTF-16 data with XHR, and it exposes an
implementation difference that should probably be discussed:

Given a server which sends UTF-16 data with a UTF-16 BOM but does *not*
send "charset=UTF-16" in the Content-Type header - should the browser
detect the encoding, or just assume UTF-8 and return mojibake-ish data?

Per my test, Chrome detects the UTF-16 encoding while Gecko doesn't. I
think the spec currently says one should assume UTF-8 encoding in this
scenario. Are WebKit/Blink - developers OK with changing their
implementation?

(The test currently asserts detecting UTF-16 is correct, pending discussion
and clarification.)

-Hallvord

Received on Sunday, 22 March 2015 22:13:54 UTC