- From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Date: Tue, 5 Feb 2008 17:31:58 +0100
- To: ietf-http-wg@w3.org
Henrik Nordström wrote:

> I also support removing the strict default ISO-8859-1 charset
> from HTTP text/* types, downgrading it to just a mere suggestion
> that if there is no charset information available then a good
> guess for the text/* types is ISO-8859-1 for historical reasons.

Apparently all agreed that the "strict default Latin-1" should go.

One way to handle this situation is to do whatever MIME and the
specification of the Content-Type (if given) offer, and for
historical reasons "Latin-1" can be a good guess. In theory.

But unsurprisingly the HTML5 draft tells us that it is in practice
often not good enough. Often "wannabe Latin-1" turns out to be
"windows-1252".

If you want to suggest a "best guess" in the HTTP spec for
historical reasons, please mention windows-1252. It is an important
difference for some documents: Latin-1 C1 controls may not be
permitted, and windows-1252 0x80 is the only (*) backwards-compatible
way to say €.

While that's IMO irrelevant for HTTP, if you decide to talk about
it anyway let's get it right: Latin-1 is a "historical" charset,
windows-1252 is the "real" legacy.

Frank

*: Agents knowing Latin-9 etc. would also know Unicode or €
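To illustrate the 0x80 difference mentioned above, here is a minimal Python sketch. The function name guess_decode and its fallback order are assumptions made for illustration only; nothing here is taken from the HTTP or HTML5 drafts. It relies solely on Python's built-in codecs.

```python
from typing import Optional

# Byte 0x80 is a C1 control character in ISO-8859-1 (Latin-1),
# but the euro sign in windows-1252.
raw = b"Price: 5 \x80"

as_latin1 = raw.decode("iso-8859-1")
as_cp1252 = raw.decode("windows-1252")

# Latin-9 (ISO-8859-15) places the euro sign at 0xA4 instead.
as_latin9 = b"\xa4".decode("iso-8859-15")


def guess_decode(body: bytes, declared_charset: Optional[str] = None) -> str:
    """Hypothetical helper: honor a declared charset, else guess.

    The guessing order (windows-1252 first, then ISO-8859-1) is an
    assumption for this sketch, not something a specification mandates.
    """
    if declared_charset is not None:
        # An unknown charset name raises LookupError; left unhandled here.
        return body.decode(declared_charset)
    try:
        # Python's windows-1252 codec rejects the five unassigned bytes
        # 0x81, 0x8D, 0x8F, 0x90 and 0x9D.
        return body.decode("windows-1252")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a code point, so this cannot fail.
        return body.decode("iso-8859-1")
```

With no declared charset, guess_decode(b"\x80") yields the euro sign, while a body containing a byte that windows-1252 leaves unassigned, such as 0x81, falls through to the Latin-1 reading.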
Received on Tuesday, 5 February 2008 16:31:04 UTC