Re: non CHAR characters in headers

Adrien de Croy wrote:
> 
> I'm seeing a response from a server where the Content-Type value is not 
> ASCII

As in "Content-Type: abc/xyz", and "abc/xyz" containing non-ASCII 
characters?

> the bytes are 0xFF 0xD0 0xFE etc.
> 
> This is illegal correct?  I thought if unicode were being put into 
> headers, it needs to be encoded according to one of several standards 
> e.g. RFC 2047.  There's no "=?" in there anywhere.

RFC 2616 sort of allowed this, but only for header fields whose values 
use the TEXT ABNF rule, and Content-Type is not one of them.
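
For illustration, the check described in the quoted question (non-ASCII 
bytes present, and no "=?" marker) could be sketched roughly as below. 
This is a minimal sketch, not a full RFC 2047 parser: the function names 
and the simplified pattern are assumptions made for this example.

```python
import re

# Simplified pattern for an RFC 2047 encoded-word: =?charset?encoding?text?=
# (an approximation for illustration, not the full grammar)
ENCODED_WORD = re.compile(rb"=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=")


def looks_rfc2047_encoded(value: bytes) -> bool:
    """Return True if the raw field value contains something shaped like an encoded-word."""
    return ENCODED_WORD.search(value) is not None


def is_ascii_only(value: bytes) -> bool:
    """Return True if every octet of the raw field value is below 0x80."""
    return all(b < 0x80 for b in value)
```

A value such as the bytes 0xFF 0xD0 0xFE fails both checks: it is not 
ASCII and it carries no encoded-word.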

BTW, in HTTPbis we have reduced the number of situations where this is 
allowed (as it wasn't used anyway); I think the only place where we 
currently still have it is the Warning header in Part 6. (I guess 
that's a TODO...).

> ...

> What is a proxy supposed to do in such cases?  There's no information
> upon which to assume the proper format of the value.  I see 4 options.
> 
> 1. pass bytes through as is and let the client deal with it.
> 2. strip the header
> 3. block the response with an error
> 4. try and convert.
> 
> I'm currently using option 1.  Is this deemed problematic?
> ...

I don't think so; see:

"Historically, HTTP has allowed field-content with text in the 
ISO-8859-1 [ISO-8859-1] character encoding (allowing other character 
sets through use of [RFC2047] encoding). In practice, most HTTP header 
field-values use only a subset of the US-ASCII charset [USASCII]. Newly 
defined header fields SHOULD constrain their field-values to US-ASCII 
characters. Recipients SHOULD treat other (obs-text) octets in 
field-content as opaque data." -- 
<http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-06.html#rfc.section.4.2.p.3>
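
As a rough sketch of what "treat as opaque data" can mean for a proxy 
(option 1 above): keep the field value as raw bytes and forward it 
without decoding or re-encoding. The function names are hypothetical, 
and the Latin-1 step is only one possible way to get a lossless string 
for logging; it is not something the draft prescribes.

```python
def has_obs_text(value: bytes) -> bool:
    """Return True if the raw field value contains any octet in 0x80-0xFF."""
    return any(b >= 0x80 for b in value)


def forward_field(name: bytes, value: bytes) -> bytes:
    """Serialize a header field for forwarding, leaving the value bytes untouched."""
    return name + b": " + value + b"\r\n"


def display_value(value: bytes) -> str:
    """Map each octet to one code point (Latin-1) so the value can be logged losslessly."""
    return value.decode("latin-1")
```

Because Latin-1 maps every octet to exactly one code point, 
display_value(v).encode("latin-1") gives back the original bytes, so 
nothing is lost if the value has to pass through a string type.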

BR, Julian

Received on Friday, 12 June 2009 10:11:47 UTC