Re: [XHR2] Avoiding charset dependencies on user settings

On Fri, Sep 23, 2011 at 1:26 AM, Henri Sivonen <hsivonen@iki.fi> wrote:
> On Thu, Sep 22, 2011 at 9:54 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>> I agree that there are no legacy requirements on XHR here, however I
>> don't think that that is the only thing that we should look at. We
>> should also look at what makes the feature the most useful. An extreme
>> counter-example would be that we could let XHR refuse to parse any
>> HTML page that didn't pass a validator. While this wouldn't break any
>> existing content, it would make HTML-in-XHR significantly less useful.
>
> Applying all the legacy text/html craziness to XHR could break current
> use of XHR to retrieve responseText of text/html resources (assuming
> that we want responseText for text/html to work like responseText for XML
> in the sense that the same character encoding is used for responseText
> and responseXML).

This doesn't seem to be a problem only when using the "crazy" parts of
text/html charset detection. Simply looking for <meta charset> in the
first 1024 bytes will change behavior and could cause page breakage.

Or am I missing something?

In fact, it seems more likely to me that we would now get the correct
charset for many XHR loads, and thus fix more pages than we break.
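
To make the point concrete, here is a rough sketch of what honoring a
<meta charset> near the start of a response could look like. It is
written in Python purely for illustration and is not the HTML spec's
prescan algorithm: the regex is a simplification, and the names
sniff_meta_charset and decode_response, as well as the UTF-8 fallback,
are assumptions made up for this example.

```python
import re

# Simplified stand-in for a <meta charset> prescan. A real prescan also
# has to deal with comments, attribute ordering, http-equiv forms, etc.
META_CHARSET = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)

def sniff_meta_charset(body: bytes, limit: int = 1024):
    """Return the charset label declared in the first `limit` bytes, or None."""
    match = META_CHARSET.search(body[:limit])
    return match.group(1).decode("ascii").lower() if match else None

def decode_response(body: bytes, fallback: str = "utf-8") -> str:
    """Decode with the declared charset if one is found, else the fallback."""
    charset = sniff_meta_charset(body) or fallback
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # Unknown label: behave as if nothing had been declared.
        return body.decode(fallback, errors="replace")

# A page that declares windows-1252 and contains a non-ASCII character.
page = '<meta charset="windows-1252"><p>caf\u00e9</p>'.encode("windows-1252")

with_sniffing = decode_response(page)
without_sniffing = page.decode("utf-8", errors="replace")
print(with_sniffing == without_sniffing)
```

The two decodings differ for this input, which is the behavior change
in question: a page that relied on the old result could break, while a
page that declared its charset correctly would start working.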

> Applying all the legacy text/html craziness to XHR would make data
> loading in programs fail in subtle and hard-to-debug ways depending on
> the browser localization and user settings. At least when loading into
> a browsing context, there's visual feedback of character misdecoding
> and the feedback can be attributed back to a given file. If
> setting-dependent misdecoding happens in the XHR data loading
> machinery of an app, it's much harder to figure out what part of the
> system the problem should be attributed to.

Could you provide more detail here? How are you imagining this data
being used such that it's not being displayed to the user?

I.e., can you describe an application that would break in a non-visual
way, and where it would be harder to detect where the data originated
from compared to, for example, <iframe> usage?

/ Jonas

Received on Monday, 26 September 2011 09:47:46 UTC