
Re: non CHAR characters in headers

From: Julian Reschke <julian.reschke@gmx.de>
Date: Fri, 12 Jun 2009 12:11:03 +0200
Message-ID: <4A3229B7.702@gmx.de>
To: Adrien de Croy <adrien@qbik.com>
CC: HTTP Working Group <ietf-http-wg@w3.org>
Adrien de Croy wrote:
> I'm seeing a response from a server where the Content-Type value is not 

As in "Content-Type: abc/xyz", and "abc/xyz" containing non-ASCII 

> the bytes are 0xFF 0xD0 0xFE etc.
> This is illegal, correct?  I thought if Unicode were being put into 
> headers, it needs to be encoded according to one of several standards 
> e.g. RFC 2047.  There's no "=?" in there anywhere.

This was sort-of allowed in RFC2616 for headers that used the TEXT ABNF 
rule, and Content-Type doesn't do that.
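
To illustrate why such a value falls outside the grammar: the type and subtype of a media type are "token"s, which cannot contain octets above 0x7E. The following is only a rough sketch of that check; the function names are made up for this example, and it deliberately ignores parameters after the ";".

```python
# Separators from the RFC 2616 "token" definition (section 2.2).
SEPARATORS = frozenset(b'()<>@,;:\\"/[]?={} \t')


def is_token(value: bytes) -> bool:
    """True if value is a non-empty run of visible US-ASCII octets with no separators."""
    return len(value) > 0 and all(
        0x20 < b < 0x7F and b not in SEPARATORS for b in value
    )


def has_valid_type_and_subtype(field_value: bytes) -> bool:
    """Check only the type "/" subtype part; parameters are not validated here."""
    media_type = field_value.split(b";", 1)[0].strip(b" \t")
    type_, slash, subtype = media_type.partition(b"/")
    return slash == b"/" and is_token(type_) and is_token(subtype)


if __name__ == "__main__":
    print(has_valid_type_and_subtype(b"text/html; charset=utf-8"))
    print(has_valid_type_and_subtype(b"\xff\xd0\xfe/xyz"))
```

A value starting with the bytes 0xFF 0xD0 0xFE fails this check regardless of which character encoding the sender had in mind.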

BTW, in HTTPbis, we have reduced the situations where this is allowed 
(as it wasn't used anyway); I think the only place where we currently 
still have it is for the Warning header in Part 6. (I guess that's a TODO...).

> ...

> What is a proxy supposed to do in such cases?  There's no information
> upon which to assume the proper format of the value.  I see 4 options.
> 1. pass bytes through as is and let the client deal with it.
> 2. strip the header
> 3. block the response with an error
> 4. try and convert.
> I'm currently using option 1.  Is this deemed problematic?
> ...

I don't think so; see:

"Historically, HTTP has allowed field-content with text in the 
ISO-8859-1 [ISO-8859-1] character encoding (allowing other character 
sets through use of [RFC2047] encoding). In practice, most HTTP header 
field-values use only a subset of the US-ASCII charset [USASCII]. Newly 
defined header fields SHOULD constrain their field-values to US-ASCII 
characters. Recipients SHOULD treat other (obs-text) octets in 
field-content as opaque data." -- 
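
A minimal sketch of what "opaque" could mean for option 1 in a proxy, assuming field values are held as raw bytes. The helper names are hypothetical and not taken from any particular proxy; the point is only that the value is forwarded byte-for-byte, and that any decoding happens solely for logging.

```python
from typing import Tuple


def has_obs_text(field_value: bytes) -> bool:
    """True if the value contains any octet in the obs-text range (0x80-0xFF)."""
    return any(b >= 0x80 for b in field_value)


def forward_field(name: bytes, field_value: bytes) -> Tuple[bytes, bytes, bool]:
    """Option 1: pass the bytes through unchanged, and report whether obs-text was seen."""
    return name, field_value, has_obs_text(field_value)


def for_log(field_value: bytes) -> str:
    """Render the value for a log line without guessing a character encoding."""
    return field_value.decode("ascii", errors="backslashreplace")


if __name__ == "__main__":
    name, value, flagged = forward_field(b"Content-Type", b"\xff\xd0\xfe/xyz")
    print(name, value, flagged)
    print(for_log(value))
```

Keeping the value as bytes avoids option 4 (conversion) entirely: nothing is decoded or re-encoded on the forwarding path, so the client receives exactly what the server sent.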

BR, Julian
Received on Friday, 12 June 2009 10:11:47 UTC
