- From: Jamie Lokier <jamie@shareable.org>
- Date: Sat, 13 Jun 2009 19:15:10 +0100
- To: Adam Barth <w3c@adambarth.com>
- Cc: David Morris <dwm@xpasc.com>, ietf-http-wg@w3.org
Adam Barth wrote:
> On Sat, Jun 13, 2009 at 9:56 AM, Jamie Lokier<jamie@shareable.org> wrote:
> > Does the sniffing document not apply to browsers looking at content on
> > a local disk (therefore with no Content-Type), or does this mean it
> > recommends sniffing the content without looking at the filename on the
> > local disk?
>
> I haven't investigated this question in detail, but I suspect the
> answer will vary by browser.  There is very little interoperability
> between browsers when interacting with the file system.
>
> > I'm pretty sure Firefox and the like look at the file extension when
> > looking at content found on local disk.  But surely it does sniffing
> > as well, on local disk files?
>
> Do you have evidence for this belief?  It should be fairly easy to
> determine by looking at the source code.

It's easy to determine by simply trying it.

I've just created a small file with this content (not indented):

    <html><head></head><body>
    Hello, I am <b>HTML</b>
    </body></html>

    If it's called test.html, it will display as HTML.
    If it's called test.txt, it will display as plain text.
==> If it's called test.foo, it will display as HTML.
==> If it's called just test (no extension), it will display as HTML.

But if we change the file slightly, putting a single character x in
front like this:

    x<html><head></head><body>
    Hello, I am <b>HTML</b>
    </body></html>

    If it's called test.html, it will display as HTML.
    If it's called test.txt, it will display as plain text.
==> If it's called test.foo, it will display as plain text.
==> If it's called just test (no extension), it will display as plain text.

Therefore Firefox (3.0.10) does sniff a local file to determine how to
display it, and the sniffing algorithm (or whether to apply it) _does_
depend on the file extension.

> > Does the sniffing document not apply at all in that case, or is there
> > a different sniffing algorithm used which remains undocumented?
>
> There is only one sniffing algorithm.
> The question is only whether
> it's applied in this case.  More precisely, the question is whether the
> "file" protocol handler assigns a media type using OS-specific
> functionality before handing the response off to the next layer, where
> content sniffing is performed on various media types (e.g., the empty
> media type).

It's clear from trying it that Firefox applies a sniffing algorithm to
local files, and that either the algorithm is influenced by the file's
extension, or Firefox decides whether to apply it at all depending on
the extension.

I don't know what it does with FTP, but I wouldn't be surprised if
it's the same as local files.

Now, let's get back to HTTP.  I've done the same test as above with
HTTP in the same Firefox.

If the Content-Type is text/plain or text/html, then Firefox honours
the Content-Type, independent of whether the content has "x" at the
start in these two test files.

If the Content-Type is application/octet-stream, then Firefox does
different things depending on the URL's file extension.  If it ends
with .html, Firefox shows an error dialog(!), otherwise it offers to
open the file in an application of your choice.

If the Content-Type is blank (I couldn't persuade Apache to omit it
completely), then Firefox's behaviour depends on the URL's file
extension.

    <html><head></head><body>
    Hello, I am <b>HTML</b>
    </body></html>

    If it's called http://.../test.html, it will display as HTML.
    If it's called http://.../test.txt, it will display as plain text.
    If it's called http://.../test.foo, it will display as plain text.
    If it's called http://.../test, it will display as plain text.

    x<html><head></head><body>
    Hello, I am <b>HTML</b>
    </body></html>

    If it's called http://.../test.html, it will display as plain text.
    If it's called http://.../test.txt, it will display as plain text.
    If it's called http://.../test.foo, it will display as plain text.
    If it's called http://.../test, it will display as plain text.
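
For anyone who wants to compare the two sets of results side by side,
here is a rough sketch in Python that just records the observations
above in a table.  The names OBSERVED and differences are made up for
illustration, "http" stands for the HTTP case with a blank Content-Type,
and the table describes only what Firefox 3.0.10 displayed in these
tests, not how Firefox actually decides.

```python
# Observed display mode in Firefox 3.0.10, copied from the tests above.
# Key: (source, starts_with_x, extension).  "file" is a local file,
# "http" is HTTP with a blank Content-Type header.  "" means no extension.
# These names are illustrative only; this is not Firefox's algorithm.
OBSERVED = {
    ("file", False, ".html"): "html",
    ("file", False, ".txt"): "plain",
    ("file", False, ".foo"): "html",
    ("file", False, ""): "html",
    ("file", True, ".html"): "html",
    ("file", True, ".txt"): "plain",
    ("file", True, ".foo"): "plain",
    ("file", True, ""): "plain",
    ("http", False, ".html"): "html",
    ("http", False, ".txt"): "plain",
    ("http", False, ".foo"): "plain",
    ("http", False, ""): "plain",
    ("http", True, ".html"): "plain",
    ("http", True, ".txt"): "plain",
    ("http", True, ".foo"): "plain",
    ("http", True, ""): "plain",
}


def differences(observed):
    """Return the (starts_with_x, extension) cases where the local-file
    result differs from the HTTP result."""
    cases = sorted({(x, ext) for (_, x, ext) in observed})
    return [
        (x, ext)
        for (x, ext) in cases
        if observed[("file", x, ext)] != observed[("http", x, ext)]
    ]


if __name__ == "__main__":
    for starts_with_x, ext in differences(OBSERVED):
        name = "test" + ext
        prefix = "x-prefixed" if starts_with_x else "unmodified"
        print(f"{prefix} {name}: "
              f"file={OBSERVED[('file', starts_with_x, ext)]}, "
              f"http={OBSERVED[('http', starts_with_x, ext)]}")
```

Running it lists only the cases where local files and HTTP disagree,
which is a quick way to see which extensions matter.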

As you see, Firefox applies a similar sniffing test in these examples
to decide whether to treat the resource as HTML or plain text, and it
does use the URL's file extension in making its decision.

However, it doesn't use quite the same algorithm as for local files,
as you can see from the .html and .foo extension differences.

In the bigger picture, my point is that sniffing is used in practice,
in a major browser, for local files as well as HTTP (and perhaps FTP,
which I haven't tested here), and the decision about _whether_ to use
it (at least) does depend on the file extension for HTTP as well as
for local files.

It would be good to document and standardise when the sniffing
algorithm is applied, dependent on file/URL extensions, for the same
reason that it is good to document and standardise what the sniffing
algorithm is.

I don't know from these tests whether the sniffing is simply switched
on/off depending on file extensions or whether it is influenced in a
more fine-grained way.

-- Jamie
Received on Saturday, 13 June 2009 18:15:44 UTC