- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Sat, 22 Oct 2011 10:53:11 -0400
On 10/22/11 6:09 AM, Daniel Glazman wrote:
>>> text/plain; charset=iso-8859-1
>>>
>>> This is wrong. Nothing in the MIME or the HTTP specs says such a
>>> whitespace is mandatory. Whitespace is explicitly forbidden between
>>> type and subtype, between parameter-name and parameter-value, but that's
>>> all. AFAIC, |text/plain;charset=iso-8859-1| is perfectly valid and
>>> |text/plain ; charset=iso-8859-1| is perfectly valid too.
>>
>> We do not want to sniff text/plain more than strictly necessary.
>
> Sorry, I don't understand that answer, what do you mean exactly ?

Normally, when a browser receives a header of the form "text/plain ...."
where ... is anything, it should treat the page as text/plain.

However, there is a known bug in old Apache installations where Apache
defaulted to sending a type of "text/plain" or
"text/plain; charset=iso-8859-1" or "text/plain; charset=ISO-8859-1" or
"text/plain; charset=UTF8" (depending on the installation) any time it
didn't know what type of data the file was. Therefore, it is fairly
common for random binary files to be served with those 4 exact header
values.

Thus, if those _exact_ strings are encountered, the UA needs to sniff to
make sure the content is not actually binary.

> If I read the document correctly, UAs are going to fall back to complex
> type detection with perf and time cost just because the content-type
> detection did not honour the potential presence of whitespace ???
> Really ?

You read it wrong. If the header value, whitespace included, doesn't
match one of the exact values in the table, the UA will just treat the
page as text/plain. It's only when the header value is exactly one of
the 4 in the table that the UA will go into
http://mimesniff.spec.whatwg.org/#text-or-binary

-Boris
Received on Saturday, 22 October 2011 07:53:11 UTC
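
The exact-match rule described in the message can be illustrated with a minimal Python sketch. It is not taken from the spec or from any browser: the names `APACHE_DEFAULT_VALUES`, `needs_text_or_binary_check`, and `handle` are hypothetical, the four strings are copied from the message as written (the table in the spec is authoritative), and the caller is assumed to have already established that the header names text/plain.

```python
# The four exact Content-Type values listed in the message.
# Copied from the email as written; the spec's table is authoritative.
APACHE_DEFAULT_VALUES = frozenset({
    "text/plain",
    "text/plain; charset=iso-8859-1",
    "text/plain; charset=ISO-8859-1",
    "text/plain; charset=UTF8",
})


def needs_text_or_binary_check(content_type_header: str) -> bool:
    """Return True only if the raw header equals one of the listed values.

    The comparison is deliberately exact: no trimming, no case folding,
    no whitespace normalisation.
    """
    return content_type_header in APACHE_DEFAULT_VALUES


def handle(content_type_header: str) -> str:
    """Hypothetical dispatcher naming the path taken for a text/plain header.

    Assumes the caller has already determined the type is text/plain.
    """
    if needs_text_or_binary_check(content_type_header):
        return "sniff"       # go to the text-or-binary check
    return "text/plain"      # trust the header, no sniffing


if __name__ == "__main__":
    for header in (
        "text/plain; charset=iso-8859-1",   # exact match
        "text/plain;charset=iso-8859-1",    # no space after ';'
        "text/plain ; charset=iso-8859-1",  # space before ';'
    ):
        print(repr(header), "->", handle(header))
```

Under these assumptions, the two whitespace variants raised in the thread are simply treated as text/plain, and only the byte-for-byte matches reach the text-or-binary step.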