- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Sat, 06 Jun 2009 00:28:30 +0200
- To: Ian Hickson <ian@hixie.ch>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
* Ian Hickson wrote:
>The original reason for this was that I did not want to sniff as a
>particular type a file that only contained a BOM, since it is more likely
>that this is an error and that the file is really some other encoding.

I do not really share that assessment, but as it is written in the draft,
you end up "sniffing" text/plain either way.

>> As I read the draft, UTF-32LE encoded text/plain documents will be
>> sniffed as text/plain because they have a UTF-16LE BOM; UTF-32BE encoded
>> text/plain documents will be sniffed as application/octet-stream. This
>> is inconsistent and confusing (there is suddenly some doubt whether you
>> treat the document as UTF-16 or UTF-32, and while browsers might not
>> support UTF-32, other applications will).
>
>We're explicitly not supporting UTF-32. For more details see HTML5.

I fail to see the relevance. The draft is unclear and misleading with
respect to the handling of UTF-32 encoded text/plain documents. There
is nothing that "HTML5" could say as a remedy, at least not until the
draft references "HTML5" to that end.

The draft has a similar problem with the iso-8859-1 cases in 3.3: if
such documents start with what appears to be a BOM, then the BOM is the
reason for "sniffing" them as text/plain, casting doubt whether you then
also treat them as in some UTF encoding or not.

-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
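
The BOM overlap described in the message can be illustrated with a short sketch. It is not the draft's algorithm: `bom_utf16_utf8_only` is a hypothetical stand-in for any check that knows only the UTF-8 and UTF-16 BOMs, and `bom_longest_first` is a hypothetical alternative that tests the four-byte UTF-32 BOMs first. The only facts relied on are the BOM byte sequences themselves, taken from Python's `codecs` module.

```python
import codecs

# BOM byte sequences (from the codecs module):
#   UTF-32LE: FF FE 00 00    UTF-16LE: FF FE
#   UTF-32BE: 00 00 FE FF    UTF-16BE: FE FF
#   UTF-8:    EF BB BF
SHORT_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]

# UTF-32 BOMs are listed first so the longer match wins over UTF-16LE.
ALL_BOMS_LONGEST_FIRST = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
] + SHORT_BOMS


def _match(data, table):
    """Return the name of the first BOM in `table` that prefixes `data`."""
    for bom, name in table:
        if data.startswith(bom):
            return name
    return None


def bom_utf16_utf8_only(data):
    """Hypothetical check that is unaware of UTF-32 (illustration only)."""
    return _match(data, SHORT_BOMS)


def bom_longest_first(data):
    """Hypothetical check that tries the UTF-32 BOMs before the shorter ones."""
    return _match(data, ALL_BOMS_LONGEST_FIRST)


utf32le_doc = codecs.BOM_UTF32_LE + "hi".encode("utf-32-le")
utf32be_doc = codecs.BOM_UTF32_BE + "hi".encode("utf-32-be")
latin1_doc = "\u00ff\u00fe rest of text".encode("iso-8859-1")

# The UTF-32LE document is reported as UTF-16LE by the short-BOM check,
# while the UTF-32BE document matches no BOM at all.
print(bom_utf16_utf8_only(utf32le_doc), bom_utf16_utf8_only(utf32be_doc))
# An iso-8859-1 document that begins with the characters U+00FF U+00FE
# has the same leading bytes as a UTF-16LE BOM.
print(bom_utf16_utf8_only(latin1_doc))
```

Under these assumptions, the asymmetry between UTF-32LE and UTF-32BE comes purely from the UTF-32LE BOM sharing its first two bytes with the UTF-16LE BOM, and the same byte prefix is what makes the iso-8859-1 case ambiguous.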
Received on Friday, 5 June 2009 22:29:05 UTC