- From: Albert Lunde <atlunde@panix.com>
- Date: Thu, 27 Mar 2008 13:58:20 -0400
- To: HTTP Working Group <ietf-http-wg@w3.org>
On Thu, Mar 27, 2008 at 04:25:37PM +0000, Jamie Lokier wrote:
> Anything currently sending ISO-8859-1 would almost certainly be
> invalid UTF-8. This is in fact useful. It is quite common to test
> whether a byte sequence is valid UTF-8, and if not, treat it as
> ISO-8859-1, because the test is quite effective at distinguishing them
> in practice.
>
> So, for informative text (i.e. non protocol) such as text after a
> status code, it might be appropriate to recommend that TEXT be parsed
> as UTF-8 when valid, and ISO-8859-1 otherwise.

Alternate character encodings have been fruitful ground for security
attacks in the past.

So I'd worry about adding too many alternate ways to interpret
header bytes, even if there is a way to distinguish them in most cases.

--
Albert Lunde  albert-lunde@northwestern.edu
              atlunde@panix.com  (new address for personal mail)
              albert-lunde@nwu.edu (old address)
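
For readers of the archive, the fallback described in the quoted text (try UTF-8 first, treat the bytes as ISO-8859-1 if that fails) might look roughly like the sketch below. The function name `decode_header_text` is made up for illustration and does not come from any HTTP specification or library; only Python's built-in strict decoders are used.

```python
from typing import Tuple


def decode_header_text(raw: bytes) -> Tuple[str, str]:
    """Decode informative header text, returning (text, encoding_used).

    Hypothetical helper: strict UTF-8 is tried first; any byte sequence
    that is not valid UTF-8 is decoded as ISO-8859-1, which maps every
    byte to a character and therefore never fails.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1"), "iso-8859-1"


if __name__ == "__main__":
    # The same word sent in each encoding.
    print(decode_header_text("café".encode("utf-8")))
    print(decode_header_text("café".encode("iso-8859-1")))
    # Bytes that are valid in both encodings are always read as UTF-8.
    print(decode_header_text(b"\xc3\xa9"))
```

The last call shows where the heuristic guesses: the bytes C3 A9 are valid UTF-8 for "é" but are also valid ISO-8859-1 for "Ã©", so a recipient using this rule and one assuming ISO-8859-1 would read different text from the same header. That kind of divergence is the concern raised in the reply above.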
Received on Thursday, 27 March 2008 17:58:56 UTC