Re: non CHAR characters in headers

On 2009/06/12 14:30, Adrien de Croy wrote:
>
> I'm seeing a response from a server where the Content-Type value is not
> ASCII
>
> the bytes are 0xFF 0xD0 0xFE etc.
>
> This is illegal, correct? I thought that if Unicode is being put into
> headers, it needs to be encoded according to one of several standards,
> e.g. RFC 2047. There's no "=?" in there anywhere.

In general, I'm rather sympathetic to putting Unicode in headers where it 
makes sense. But for Content-Type, it doesn't make sense at all (unless 
it's part of a parameter to some exotic content type, but I think even 
most parameters for content types are ASCII-only by design, because they 
are simple predefined tokens or numbers).

Also, the right way to use Unicode in HTTP headers, if at all, is to use 
UTF-8. Neither 0xFF nor 0xFE can occur in UTF-8. So this is doubly suspect.
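For illustration, here is a rough sketch (in Python; names are purely
illustrative, not from any spec) of the checks I have in mind, assuming
the raw header value is available as bytes:

  def classify_header_value(value: bytes) -> str:
      # Plain ASCII is always acceptable; RFC 2047 encoded-words, if any,
      # would show up as ASCII text containing "=?".
      if all(b < 0x80 for b in value):
          return "ASCII (possibly RFC 2047 encoded-words if '=?' appears)"
      # 0xFE and 0xFF can never occur in well-formed UTF-8.
      if 0xFE in value or 0xFF in value:
          return "not ASCII and cannot be UTF-8"
      try:
          value.decode("utf-8")
          return "non-ASCII, but valid UTF-8"
      except UnicodeDecodeError:
          return "non-ASCII and not valid UTF-8"

  print(classify_header_value(b"\xff\xd0\xfe"))
  # -> not ASCII and cannot be UTF-8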

> What is a proxy supposed to do in such cases? There's no information
> upon which to assume the proper format of the value. I see 4 options.
>
> 1. pass bytes through as is and let the client deal with it.
> 2. strip the header
> 3. block the response with an error
> 4. try and convert.

What do you mean by 4? It looks like you're getting garbage; what would 
you want to convert that to?

> I'm currently using option 1. Is this deemed problematic?

I don't know what the spec says, but for headers that the proxy is not 
concerned with, it's probably the right thing to do. The client will 
have to deal with (which may just mean being robust against) such cases 
anyway. Where the proxy itself is concerned with the header, the situation 
may be a bit different (in this case, e.g., if there is a Vary on 
Content-Type). See the sketch below.
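As a rough sketch of what I mean by option 1 with that caveat (again
purely illustrative, hypothetical names): the proxy forwards the raw
header bytes untouched, but only trusts a value for its own purposes
(e.g. as part of a cache key driven by Vary) when it is plain ASCII:

  def forward_headers(headers, vary_fields):
      # headers: list of (name, value) byte-string pairs.
      # vary_fields: set of lower-cased header names the proxy actually
      # needs for its own processing (e.g. listed in Vary).
      usable = {}
      for name, value in headers:
          if name.lower() in vary_fields:
              # Only use the value internally if it is plain ASCII;
              # otherwise treat the response as not cacheable by this key.
              if all(b < 0x80 for b in value):
                  usable[name.lower()] = value
      # Either way, the raw header bytes are forwarded unchanged (option 1).
      return headers, usable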

Regards,    Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp

Received on Friday, 12 June 2009 06:18:22 UTC