[XHR2] overrideMimeType behavior from James Robinson on 2010-10-05 (public-webapps@w3.org from October to December 2010)

From: James Robinson <jamesr@google.com>
Date: Mon, 4 Oct 2010 23:23:33 -0700
To: Web Applications Working Group WG <public-webapps@w3.org>
Message-ID: <AANLkTim7wqWzHSSMbNh-WEa4FJOaV0viPknejUERYK+c@mail.gmail.com>

One issue raised briefly when discussing ArrayBuffer integration but not
resolved was how to handle overrideMimeType().  The issue is whether calling
overrideMimeType() can cause already downloaded data to be re-interpreted
with a different charset.  From my reading of the spec, this is the case.
Calling overrideMimeType() with a specified charset sets the current
override charset, which overrides the final charset, which in turn is used
by the text response entity body algorithm to decode the response entity
body (i.e. bytes from the network) into a DOMString.
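
A rough sketch of that reading, for illustration only.  SpecModelResponse
is a hypothetical stand-in, not XMLHttpRequest itself, and it assumes a
runtime with a global TextDecoder (a modern browser or Node.js).  It keeps
the raw bytes and decodes them again on every read, so a later override
changes text that was already received:

```javascript
// Hypothetical model of the spec as read above; not a real XMLHttpRequest.
class SpecModelResponse {
  constructor(headerCharset) {
    this.headerCharset = headerCharset; // charset taken from Content-Type
    this.overrideCharset = null;        // set by overrideMimeType()
    this.chunks = [];                   // raw network bytes, kept for re-decoding
  }

  overrideMimeType(mime) {
    // Simplified parsing (an assumption): pick out a charset parameter if present.
    const match = /charset=([^;\s]+)/i.exec(mime);
    if (match) this.overrideCharset = match[1].toLowerCase();
  }

  receive(bytes) {
    this.chunks.push([...bytes]);
  }

  get responseText() {
    // The override charset, when set, wins over the charset from the headers.
    const charset = this.overrideCharset || this.headerCharset;
    const allBytes = new Uint8Array(this.chunks.flat());
    return new TextDecoder(charset).decode(allBytes);
  }
}

const res = new SpecModelResponse("utf-8");
res.receive([0x68, 0x00, 0x69, 0x00]);  // "hi" encoded as UTF-16LE
const before = res.responseText;        // decoded as UTF-8: "h\u0000i\u0000"
res.overrideMimeType("text/plain; charset=utf-16le");
const after = res.responseText;         // same bytes, now decoded as "hi"
```

The point of the sketch is the chunks array: this reading only works if
the undecoded bytes are still available when the override arrives.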

However, WebKit and Gecko currently do not behave in this way and, while I
can't speak for the rest of the WebKit community, I would be reluctant to
change WebKit to what the spec currently states.  In both of these
implementations the override MIME type is checked once, when the HTTP headers
are received from the network, in order to determine how to decode the data.
From that point on, calling overrideMimeType() is a no-op.  In addition, in
the current WebKit implementation we do not preserve the raw bytes from the
network after decoding them to UTF-16 in order to produce the .responseText
DOMString.  Since conversion from an arbitrary charset to UTF-16 is not
always invertible, this makes the semantics currently in the spec impossible
to implement without keeping an extra copy of the data around.  I would strongly prefer
not to keep an extra copy if possible since this will only be memory bloat
for an extremely rare use case.
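
One way the conversion can lose information is malformed input.  The
snippet below is a small illustration, not WebKit code, and assumes a
runtime with global TextDecoder and TextEncoder.  Two different byte
sequences that are not valid UTF-8 decode to the same string, so the
original bytes cannot be recovered from the decoded text:

```javascript
// A non-fatal UTF-8 decoder replaces malformed input with U+FFFD.
const decoder = new TextDecoder("utf-8");
const encoder = new TextEncoder(); // always encodes to UTF-8

// The same word with two different final bytes; each is valid ISO-8859-1
// but ends in a truncated (malformed) UTF-8 sequence.
const bytesA = Uint8Array.of(0x63, 0x61, 0x66, 0xe9);
const bytesB = Uint8Array.of(0x63, 0x61, 0x66, 0xe8);

const textA = decoder.decode(bytesA);
const textB = decoder.decode(bytesB);

// Both inputs collapse to "caf\uFFFD", so the distinction is gone.
const sameText = textA === textB;

// Encoding the text again yields the bytes of U+FFFD, not the original byte.
const roundTrip = encoder.encode(textA); // 63 61 66 EF BF BD
```

Re-decoding such a response under a different charset would therefore
need the original bytes, not the decoded string.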

I propose that overrideMimeType() throw INVALID_STATE_ERR if called when the
send() flag is true.  This should still allow authors to declare a MIME type
and optionally a charset on requests without requiring an arbitrary
re-decoding of data after it has been received.
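
A minimal sketch of the proposed behavior.  SketchRequest is a
hypothetical stand-in that models only the send() flag and the override,
not a real XMLHttpRequest, and the error object is a plain Error dressed
up to resemble a DOMException (INVALID_STATE_ERR has code 11):

```javascript
// Hypothetical stand-in for XMLHttpRequest; models only what is discussed here.
const INVALID_STATE_ERR = 11; // numeric DOMException code for INVALID_STATE_ERR

class SketchRequest {
  constructor() {
    this.sendFlag = false;    // becomes true once send() has been called
    this.overrideMime = null; // MIME type (and optional charset) declared by the author
  }

  overrideMimeType(mime) {
    if (this.sendFlag) {
      // Proposed rule: reject the override once the request has been sent.
      const err = new Error("overrideMimeType() called after send()");
      err.name = "INVALID_STATE_ERR";
      err.code = INVALID_STATE_ERR;
      throw err;
    }
    this.overrideMime = mime;
  }

  send() {
    this.sendFlag = true;
  }
}

const req = new SketchRequest();
req.overrideMimeType("text/plain; charset=x-user-defined"); // allowed before send()
req.send();

let thrown = null;
try {
  req.overrideMimeType("text/plain; charset=utf-8"); // rejected after send()
} catch (e) {
  thrown = e;
}
```

Under this rule the charset is fixed before any bytes arrive, so an
implementation can decode incrementally and discard the raw bytes.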

- James

PS: There's a related discussion about how to handle encoding semantics and
the .responseArrayBuffer property, but that's for another thread.

Received on Tuesday, 5 October 2010 06:24:03 UTC