Re: PROPOSAL: i74: Encoding for non-ASCII headers from Albert Lunde on 2008-03-27 (ietf-http-wg@w3.org from January to March 2008)

From: Albert Lunde <atlunde@panix.com>
Date: Thu, 27 Mar 2008 13:58:20 -0400
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20080327175820.GA5662@panix.com>

On Thu, Mar 27, 2008 at 04:25:37PM +0000, Jamie Lokier wrote:
> Anything currently sending ISO-8859-1 would almost certainly be
> invalid UTF-8.  This is in fact useful.  It is quite common to test
> whether a byte sequence is valid UTF-8, and if not, treat it as
> ISO-8859-1, because the test is quite effective at distinguishing them
> in practice.
> 
> So, for informative text (i.e. non protocol) such as text after a
> status code, it might be appropriate to recommend that TEXT be parsed
> as UTF-8 when valid, and ISO-8859-1 otherwise.

Alternate character encodings have been fruitful ground for security
attacks in the past. So I'd worry about adding too many alternate
ways to interpret header bytes, even if there is a way to distinguish
them in most cases.
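
For concreteness, the fallback heuristic quoted above might be sketched
roughly as follows. This is only an illustration under stated assumptions:
the function name decode_header_text is hypothetical, and nothing here is
taken from the HTTP specification or from any existing implementation.

```python
from typing import Tuple


def decode_header_text(raw: bytes) -> Tuple[str, str]:
    """Hypothetical helper: decode informative header TEXT.

    Tries strict UTF-8 first and falls back to ISO-8859-1 when the
    bytes are not valid UTF-8. Returns (decoded text, encoding used).
    """
    try:
        # Strict decoding rejects malformed sequences, including
        # overlong forms and truncated multi-byte sequences.
        return raw.decode("utf-8", errors="strict"), "utf-8"
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte value to a code point,
        # so this fallback cannot fail.
        return raw.decode("iso-8859-1"), "iso-8859-1"


if __name__ == "__main__":
    # A lone 0xE9 byte is not valid UTF-8, so the fallback is used.
    print(decode_header_text(b"caf\xe9"))
    # The same word sent as UTF-8 decodes on the first attempt.
    print(decode_header_text(b"caf\xc3\xa9"))
    # The ambiguous case: a sender that meant these two bytes as
    # ISO-8859-1 would be read as UTF-8 instead.
    print(decode_header_text(b"\xc3\xa9"))
```

The last call shows where the "most cases" qualifier matters: the bytes
C3 A9 are valid in both encodings, so a sender and a receiver (or two
intermediaries) can disagree about what the header says.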

-- 
    Albert Lunde  albert-lunde@northwestern.edu
                  atlunde@panix.com  (new address for personal mail)
                  albert-lunde@nwu.edu (old address)

Received on Thursday, 27 March 2008 17:58:56 UTC