Re: XHR LC comment: header encoding

Anne van Kesteren wrote:
> On Mon, 07 Dec 2009 16:42:31 +0100, Julian Reschke 
> <julian.reschke@gmx.de> wrote:
>> I think XHR needs to elaborate on how non-ASCII characters in request 
>> headers are put on the wire, and how non-ASCII characters in response 
>> headers are transformed back to JavaScript characters.
> 
> Hmm yeah. I somehow assumed this was easy because everything was 
> restricted to the ASCII range. It appears octets higher than 7E can 
> occur as well per HTTP.
> 
> 
>> For request headers, I would assume that the character encoding is 
>> ISO-8859-1, and if a character can't be encoded using ISO-8859-1, some 
>> kind of error handling occurs (ignore the character/ignore the 
>> header/throw?).
> 
> From my limited testing it seems Firefox, Chrome, and Internet Explorer 
> use UTF-8 octets. E.g. "\xFF" in ECMAScript gets transmitted as C3 BF 
> (in octets). Opera sends "\xFF" as FF.
> 
> 
>> For response headers, I'd expect that the octet sequence is decoded 
>> using ISO-8859-1; so no specific error handling would be needed 
>> (although the result may be funny when the intended encoding was
> 
> Firefox, Opera, and Internet Explorer indeed do this. Chrome decodes as 
> UTF-8 as far as I can tell.
> 
> 
> I'd love some implementor feedback on the matter.
> ...

Thanks for doing the testing. The discrepancy between setting and 
getting worries me a lot :-).
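
To make the discrepancy concrete, here is a minimal sketch. It only models the behaviour reported in the testing above (UTF-8 on send, ISO-8859-1 on receive); it is not what any browser actually runs, and Python is used purely to show the octets.

```python
# Sketch only: models the reported behaviour, not actual browser code.
value = "\xff"  # U+00FF, i.e. the ECMAScript string "\xFF"

# Octets reportedly sent by Firefox, Chrome, and Internet Explorer.
utf8_octets = value.encode("utf-8")
# Octets reportedly sent by Opera.
latin1_octets = value.encode("iso-8859-1")

print(utf8_octets.hex(" "))    # c3 bf
print(latin1_octets.hex(" "))  # ff

# A receiver that decodes the UTF-8 octets as ISO-8859-1 gets two
# characters (U+00C3 U+00BF) instead of the original one.
round_trip = utf8_octets.decode("iso-8859-1")
print(len(round_trip))      # 2
print(round_trip == value)  # False
```

So a value set with UTF-8 and read back with ISO-8859-1 does not survive the round trip.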

From HTTP's point of view, the header field value really is opaque, so 
you can put anything there, as long as it fits into the header field ABNF.

Of course that only helps if senders and receivers agree on the 
encoding. In my experience, server frameworks (servlet API, for 
instance) assume ISO-8859-1 here (but that probably should be tested).

For XHR 1 I think the resolution should be to leave this 
implementation-specific, and advise users not to rely on anything non-ASCII.
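
As a rough sketch of what "not relying on anything non-ASCII" could mean for a script author, a check along these lines might be enough. The function name is made up for illustration, the allowed range (printable ASCII plus SP and HTAB) is my assumption rather than anything the draft says, and Python is used only to keep it short.

```python
def is_ascii_header_value(value: str) -> bool:
    """Hypothetical helper: True if value contains only printable ASCII,
    SP, or HTAB (an assumed 'safe' range, not taken from any spec text)."""
    return all(c == "\t" or 0x20 <= ord(c) <= 0x7E for c in value)


print(is_ascii_header_value("text/plain"))  # True
print(is_ascii_header_value("caf\xe9"))     # False
```

Values that pass such a check come out as the same octets whether the implementation uses UTF-8 or ISO-8859-1, so the differences between browsers do not matter for them.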

Best regards, Julian

Received on Monday, 4 January 2010 16:18:02 UTC