[XHR2] Avoiding charset dependencies on user settings

"If final MIME type is text/html let document be Document object that
represents the response entity body parsed following the rules set
forth in the HTML specification for an HTML parser with scripting
disabled. [HTML]"

Since there's presumably no legacy content that uses XHR to read
responseXML for text/html (and expects HTML parsing), and since
responseText for non-XML (in Gecko at least) tries the HTTP charset and
falls back on UTF-8, it doesn't seem to make sense to implement
full-blown legacy charset craziness for text/html in XHR.

Specifically, it seems to make sense to skip heuristic detection and
to use UTF-8 (as opposed to Windows-1252 or a locale-dependent value)
as the fallback encoding if there's neither a <meta> declaration nor an
HTTP charset, since UTF-8 is the pre-existing fallback for responseText
and responseText is already used with text/html.
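
To make the proposed rule concrete, here is a rough sketch of the
decision order: HTTP charset first, then a <meta> declaration, then
UTF-8, with no heuristic detection and no locale-dependent fallback.
All function names are hypothetical, and the <meta> lookup is a
simplified regular expression standing in for the HTML spec's prescan,
so treat this as an illustration under those assumptions rather than
as what any browser actually does.

```typescript
// Hypothetical helper names; none of these come from the XHR2 or HTML specs.

// Extract the charset parameter from a Content-Type header value, if any.
function charsetFromContentType(contentType: string | null): string | null {
  if (contentType === null) return null;
  const m = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType);
  return m?.[1]?.toLowerCase() ?? null;
}

// Simplified stand-in for the HTML <meta> prescan (assumption: the start
// of the document has already been decoded as ASCII-compatible text).
function charsetFromMeta(head: string): string | null {
  const m = /<meta[^>]*charset\s*=\s*["']?([A-Za-z0-9_.:-]+)/i.exec(head);
  return m?.[1]?.toLowerCase() ?? null;
}

// Proposed order for text/html in XHR: HTTP charset, then <meta>, then
// UTF-8. No heuristic detection and no user-setting-dependent fallback.
function pickHtmlEncodingForXhr(contentType: string | null, head: string): string {
  return charsetFromContentType(contentType) ?? charsetFromMeta(head) ?? "utf-8";
}
```

With this sketch, a response labeled only "text/html" and carrying no
<meta> declaration resolves to UTF-8 regardless of the user's locale,
which matches the existing responseText fallback.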

As it stands, the XHR2 spec defers to a part of HTML that has
legacy-oriented optional features. It seems to make sense to clamp
down on those options for XHR.

Henri Sivonen

Received on Thursday, 22 September 2011 13:34:03 UTC