- From: Albert Lunde <atlunde@panix.com>
- Date: Thu, 27 Mar 2008 13:58:20 -0400
- To: HTTP Working Group <ietf-http-wg@w3.org>
On Thu, Mar 27, 2008 at 04:25:37PM +0000, Jamie Lokier wrote:
> Anything currently sending ISO-8859-1 would almost certainly be
> invalid UTF-8. This is in fact useful. It is quite common to test
> whether a byte sequence is valid UTF-8, and if not, treat it as
> ISO-8859-1, because the test is quite effective at distinguishing them
> in practice.
>
> So, for informative text (i.e. non protocol) such as text after a
> status code, it might be appropriate to recommend that TEXT be parsed
> as UTF-8 when valid, and ISO-8859-1 otherwise.
Alternate character encodings have been fruitful ground for security
attacks in the past. So I'd worry about adding too many alternate
ways to interpret header bytes, even if there is a way to distinguish
them in most cases.
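
For concreteness, the fallback Jamie describes might look roughly like
the sketch below. This is only an illustration in Python, not anything
from a spec or an existing implementation; the name decode_text and the
sample byte strings are made up for the example.

```python
def decode_text(raw: bytes) -> str:
    """Decode header TEXT as UTF-8 when valid, else as ISO-8859-1.

    Hypothetical helper for illustration only.
    """
    try:
        # Strict decoding rejects malformed and overlong sequences.
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a code point, so this cannot fail.
        return raw.decode("iso-8859-1")


# A lone 0xE9 is not valid UTF-8, so it is read as ISO-8859-1.
latin1_reading = decode_text(b"caf\xe9")

# The same word sent as UTF-8 decodes directly.
utf8_reading = decode_text(b"caf\xc3\xa9")

# Ambiguous case: a sender that meant the two ISO-8859-1 characters
# 0xC3 0xA9 is read as one UTF-8 character instead.
ambiguous = decode_text(b"\xc3\xa9")
```

The last case is the kind of thing I have in mind: the bytes pass the
validity test, but two parties that disagree about the intended
encoding will see different text for the same header.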
--
Albert Lunde albert-lunde@northwestern.edu
atlunde@panix.com (new address for personal mail)
albert-lunde@nwu.edu (old address)
Received on Thursday, 27 March 2008 17:58:56 UTC