- From: Simon Pieters <simonp@opera.com>
- Date: Mon, 23 Mar 2015 15:14:39 +0100
- To: "Hallvord Reiar Michaelsen Steen" <hsteen@mozilla.com>
- Cc: "WebApps WG" <public-webapps@w3.org>
On Mon, 23 Mar 2015 14:32:27 +0100, Hallvord Reiar Michaelsen Steen <hsteen@mozilla.com> wrote:

> On Mon, Mar 23, 2015 at 1:45 PM, Simon Pieters <simonp@opera.com> wrote:
>
>> On Sun, 22 Mar 2015 23:13:20 +0100, Hallvord Reiar Michaelsen Steen
>> <hsteen@mozilla.com> wrote:
>>
>>> Given a server which sends UTF-16 data with a UTF-16 BOM but does *not*
>>> send "charset=UTF-16" in the Content-Type header - should the browser
>>> detect the encoding, or just assume UTF-8 and return mojibake-ish data?
>>
>> What is your test doing? From what I understand of the spec, the result
>> is different between e.g. responseText (honors utf-16 BOM) and JSON
>> response (always decodes as utf-8).
>
> It tests responseText.

OK.

>>> I think the spec currently says one should assume UTF-8 encoding in
>>> this scenario.

My understanding of the spec is different from yours. Let's step through the spec.

https://xhr.spec.whatwg.org/#text-response

[[
Let bytes be response's body.
If bytes is null, return the empty string.
Let charset be the final charset.
]]

final charset is null.

[[
If responseType is the empty string, charset is null, and final MIME type is either null, text/xml, application/xml or ends in +xml, use the rules set forth in the XML specifications to determine the encoding. Let charset be the determined encoding. [XML] [XMLNS]
]]

Which MIME type did you use in the response? BOM sniffing in XML is non-normative IIRC. For other types, see below.

[[
If charset is null, set charset to utf-8.
Return the result of running decode on byte stream bytes using fallback encoding charset.
]]

-> https://encoding.spec.whatwg.org/#decode

[[
For each of the rows in the table below, starting with the first one and going down, if the first bytes of buffer match all the bytes given in the first column, then set encoding to the encoding given in the cell in the second column of that row and set BOM seen flag.
]]

This step honors the BOM. The fallback encoding is ignored.
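
For illustration, here is a rough Python sketch of the decode step quoted above. It is not taken from either spec: the function name `decode_with_bom` and the BOM table are my own rendering (the quoted text only refers to "the table below", so the three rows here are an assumption about its contents), and Python's codecs only approximate the Encoding Standard's decoders.

```python
# Assumed BOM table for the "decode" step: (leading bytes, encoding).
# Checked in order, first match wins.
BOM_TABLE = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]


def decode_with_bom(data: bytes, fallback: str = "utf-8") -> str:
    """Decode bytes, letting a leading BOM override the fallback encoding."""
    encoding = fallback
    for bom, name in BOM_TABLE:
        if data.startswith(bom):
            # A BOM was seen: it decides the encoding and is not
            # part of the decoded text.
            encoding = name
            data = data[len(bom):]
            break
    # Malformed sequences become U+FFFD instead of raising.
    return data.decode(encoding, errors="replace")


# UTF-16LE body with a BOM and no charset parameter: the fallback
# (utf-8) is passed in, but the BOM overrides it.
body = b"\xff\xfe" + "hi".encode("utf-16-le")
text = decode_with_bom(body, fallback="utf-8")
```

With this reading, `text` is the intended string rather than mojibake, which is the behaviour I would expect from responseText in the scenario you describe.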
-- 
Simon Pieters
Opera Software
Received on Monday, 23 March 2015 14:15:12 UTC