
Re: [XHR2] overrideMimeType

From: Jonas Sicking <jonas@sicking.cc>
Date: Sat, 28 Jul 2007 23:38:41 -0700
Message-ID: <46AC35F1.7030806@sicking.cc>
To: Maciej Stachowiak <mjs@apple.com>, Web APIs WG <public-webapi@w3.org>

Maciej Stachowiak wrote:
> 
> On Jul 27, 2007, at 12:09 PM, Jonas Sicking wrote:
> 
>>
>> Anne van Kesteren wrote:
>>> I've been looking at overrideMimeType implementations in Gecko and 
>>> WebKit and it seems like they differ a bit. In Gecko it has to be 
>>> invoked before send(), but in WebKit it would work if you invoke it 
>>> just before getting responseXML or responseText. Neither 
>>> implementation seems to do any input checks.
>>> If you have any opinion on how it should be specified I suppose now 
>>> would be the time to air your thoughts.
>>
>> Of course I prefer the mozilla way :)
>>
>> It does seem fairly complicated to allow it to be set after the 
>> download is finished though. You do have the stream stored in 
>> .responseBody, but at that point all encoding information has been 
>> lost. For HTML parsing (which I hope the spec will support in the 
>> future) there are a pile of rules used to guess the encoding, all of 
>> which would be useful to use, but can't be used if all you have access 
>> to is the unencoded responseBody.
> 
> Why would the encoding information be lost? The only sources of encoding 
> info are the responseText itself and http headers, both of which the 
> XMLHttpRequest needs to provide anyway.

responseText is not the raw byte stream received off the wire; it has 
already been decoded into UTF-16 using whatever algorithm we define for 
determining the encoding. HTML decoding is a lot more complicated: you 
first have to guess an encoding and start parsing the document, but if 
you then find a

<meta http-equiv="Content-Type" content="text/html; charset=?">

where the charset differs from what you guessed, you have to restart 
from the beginning using the charset declared in the meta tag.
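
The guess-then-restart behavior described above can be sketched roughly 
as follows. This is a minimal illustration, not the algorithm any browser 
actually uses: decodeHtml and its return shape are hypothetical names, the 
regular expression is a crude stand-in for a real meta prescan, and it 
assumes a runtime whose TextDecoder supports the charsets involved.

```javascript
// Hypothetical helper (not from any spec or browser): decode HTML bytes with
// a guessed charset, then restart if a <meta> declaration disagrees.
function decodeHtml(bytes, guessedCharset) {
  // Normalize the guess to its canonical encoding name.
  let charset = new TextDecoder(guessedCharset).encoding;
  let text = new TextDecoder(charset).decode(bytes);
  let restarted = false;

  // Crude stand-in for a real prescan: look for charset=... inside a <meta>.
  const match = /<meta[^>]*charset\s*=\s*["']?([\w-]+)/i.exec(text);
  if (match) {
    let declared = null;
    try {
      declared = new TextDecoder(match[1]).encoding;
    } catch (e) {
      declared = null; // unknown label: keep the original guess
    }
    if (declared !== null && declared !== charset) {
      // The guess was wrong: throw away the decoded text and start over
      // from the raw bytes with the declared charset.
      charset = declared;
      text = new TextDecoder(charset).decode(bytes);
      restarted = true;
    }
  }
  return { text, charset, restarted };
}
```

Note that the restart is only possible because the raw bytes are still 
available; given only the already-decoded text, there is nothing to 
re-decode.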

Yes, it would definitely be possible for the implementation to keep 
around the raw byte stream and either lazily decode responseText, or 
keep both the UTF-16 responseText and the raw byte stream around.

That is somewhat quirky behavior, though, since calling overrideMimeType 
could then change the encoding, and therefore both responseXML and 
responseText.
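
To make the trade-off concrete, here is a rough sketch of the "keep the 
raw bytes and decode lazily" option. LazyResponse and overrideCharset are 
hypothetical names invented for this illustration (overrideCharset stands 
in for only the charset part of overrideMimeType); this is not how any 
existing implementation is structured.

```javascript
// Hypothetical response holder; all names here are illustrative only.
class LazyResponse {
  constructor(rawBytes, headerCharset) {
    this.rawBytes = rawBytes;     // bytes as received, never discarded
    this.charset = headerCharset; // charset currently in effect
    this.cachedText = null;       // filled in on first access
  }

  // Stand-in for the charset part of overrideMimeType().
  overrideCharset(charset) {
    this.charset = charset;
    this.cachedText = null; // invalidate any text decoded earlier
  }

  // Decode on demand, so an override made after the download still applies.
  get responseText() {
    if (this.cachedText === null) {
      this.cachedText = new TextDecoder(this.charset).decode(this.rawBytes);
    }
    return this.cachedText;
  }
}
```

With this shape, reading responseText, calling overrideCharset with a 
different charset, and reading responseText again can yield two different 
strings for the same response, which is the quirk in question.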

/ Jonas
Received on Sunday, 29 July 2007 06:39:38 GMT
