- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Mon, 26 Sep 2011 17:50:27 +0300
- To: public-webapps@w3.org
On Mon, Sep 26, 2011 at 12:46 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> On Fri, Sep 23, 2011 at 1:26 AM, Henri Sivonen <hsivonen@iki.fi> wrote:
>> On Thu, Sep 22, 2011 at 9:54 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>> I agree that there are no legacy requirements on XHR here, however I
>>> don't think that is the only thing that we should look at. We
>>> should also look at what makes the feature the most useful. An extreme
>>> counter-example would be that we could let XHR refuse to parse any
>>> HTML page that didn't pass a validator. While this wouldn't break any
>>> existing content, it would make HTML-in-XHR significantly less useful.
>>
>> Applying all the legacy text/html craziness to XHR could break current
>> use of XHR to retrieve responseText of text/html resources (assuming
>> that we want responseText for text/html to work like responseText for
>> XML in the sense that the same character encoding is used for
>> responseText and responseXML).
>
> This doesn't seem to only be a problem when using "crazy" parts of
> text/html charset detection. Simply looking for <meta charset> in the
> first 1024 characters will change behavior and could cause page
> breakage.
>
> Or am I missing something?

Yes: WebKit already performs the <meta> prescan for text/html when
retrieving responseText via XHR even though it doesn't support full
HTML parsing in XHR (so responseXML is still null).
http://hsivonen.iki.fi/test/moz/xhr/charset-xhr.html

Thus, apps broken by the meta prescan would already be broken in WebKit
(unless, of course, they browser sniff in a very strange way). And apps
that wouldn't be OK with using UTF-8 as the fallback encoding when
there's no HTTP-level charset, no BOM and no <meta> in the first 1024
bytes would already be broken in Gecko.

>> Applying all the legacy text/html craziness to XHR would make data
>> loading in programs fail in subtle and hard-to-debug ways depending on
>> the browser localization and user settings. At least when loading into
>> a browsing context, there's visual feedback of character misdecoding
>> and the feedback can be attributed back to a given file. If
>> setting-dependent misdecoding happens in the XHR data loading
>> machinery of an app, it's much harder to figure out what part of the
>> system the problem should be attributed to.
>
> Could you provide more detail here? How are you imagining this data
> being used such that it's not being displayed to the user?
>
> I.e. can you describe an application that would break in a non-visual
> way and where it would be harder to detect where the data originated
> from compared to, for example, <iframe> usage.

If a piece of text came from XHR and got injected into a visible DOM,
it's not immediately obvious which HTTP response it came from.

--
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
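As a rough illustration of the kind of check being discussed (a minimal sketch, not the actual test at the URL above): assume a hypothetical resource check.html served as text/html with no HTTP-level charset and no BOM, declaring <meta charset=windows-1251> within its first 1024 bytes and containing a known Cyrillic marker string. Whether responseText contains the marker correctly decoded reveals whether the browser applied the <meta> prescan (or a matching fallback encoding) when decoding responseText.

```js
// Sketch only: check.html and the marker string are assumptions for
// illustration, not part of the test page referenced in the message.
var xhr = new XMLHttpRequest();
xhr.open("GET", "check.html", true);
xhr.onload = function () {
  // "привет" is encoded in windows-1251 in the resource; it only round-trips
  // into responseText if the decoder used windows-1251 (e.g. via the
  // <meta> prescan), not if it fell back to UTF-8 or a locale default.
  if (xhr.responseText.indexOf("\u043F\u0440\u0438\u0432\u0435\u0442") !== -1) {
    console.log("<meta> prescan (or matching fallback) applied to responseText");
  } else {
    console.log("responseText decoded with a different fallback encoding");
  }
};
xhr.send();
```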
Received on Monday, 26 September 2011 14:50:58 UTC