[XHR2] Avoiding charset dependencies on user settings

From: Henri Sivonen <hsivonen@iki.fi>
Date: Thu, 22 Sep 2011 16:33:25 +0300
To: public-webapps@w3.org
Message-ID: <CAJQvAufBz8ScDW1yUmeEbczzJUGw=ANWfTQ5-w=tajDXYaMT5Q@mail.gmail.com>

http://dev.w3.org/2006/webapi/XMLHttpRequest-2/#document-response-entity-body
says:
"If final MIME type is text/html let document be Document object that
represents the response entity body parsed following the rules set
forth in the HTML specification for an HTML parser with scripting
disabled. [HTML]"

Since there's presumably no legacy content that uses XHR to read
responseXML for text/html (and expects HTML parsing), and since (in
Gecko at least) responseText for non-XML tries the HTTP charset and
falls back on UTF-8, it doesn't seem to make sense to implement the
full-blown legacy charset craziness for text/html in XHR.

Specifically, it seems that it makes sense to skip heuristic detection
and to use UTF-8 (as opposed to Windows-1252 or a locale-dependent
value) as the fallback encoding if there's neither <meta> nor HTTP
charset, since UTF-8 is the pre-existing fallback for responseText and
responseText is already used with text/html.
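
To make the proposed fallback order concrete, here is a minimal sketch,
not spec text. The function name pick_encoding, the 1024-byte window and
the regex-based <meta> scan are assumptions made for illustration; the
regex is a crude stand-in for the HTML spec's real prescan. The ordering
(HTTP charset first, then <meta>) is likewise an assumption of the sketch.

```python
import re
from typing import Optional

# Simplified stand-in for the HTML <meta> prescan (an assumption for
# illustration; the real prescan algorithm is more involved).
_META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)",
    re.IGNORECASE,
)


def pick_encoding(http_charset: Optional[str], body: bytes) -> str:
    """Pick an encoding label for a text/html XHR response (hypothetical helper).

    No heuristic detection and no locale-dependent default: if neither the
    HTTP charset nor a <meta> declaration is present, fall back on UTF-8.
    """
    # 1. A charset from the HTTP Content-Type header wins if present.
    if http_charset:
        return http_charset.strip().lower()

    # 2. Otherwise look for a <meta> declaration near the start of the body.
    match = _META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()

    # 3. Fixed fallback, matching the existing responseText behaviour.
    return "utf-8"
```

With this sketch, a response that has no charset in HTTP and no <meta>
always yields "utf-8", regardless of the user's locale or browser
settings, which is the point of the proposal.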

As it stands, the XHR2 spec defers to a part of HTML that has
legacy-oriented optional features. It seems to make sense to clamp
down on those options for XHR.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Thursday, 22 September 2011 13:34:03 UTC