- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Mon, 12 Jan 2009 10:54:15 -0500
Adam Barth wrote:
> Extensions are bad news for content sniffing because they can often be
> chosen by the attacker. For example, suppose user-uploaded content
> can be downloaded at:
>
> http://example.com/download.php
>
> In most PHP configurations, the attacker can choose whatever file
> extension he likes by directing the user's browser to:
>
> http://example.com/download.php/whatever.foo
>
> And the PHP script will happily run.

Right, I understand that.

> Yes. We do have lots of data from opt-in user metrics from Chrome.
> Here is a somewhat recent summary:
>
> https://crypto.stanford.edu/~abarth/research/html5/content-sniffing/

I'm not quite sure what to make of this, actually. Specifically, where is the "22.19%" number for "HTML Tags" coming from? 22.19% of what? The magic numbers stuff actually adds up to 100%, but of what?

> To address your particular concern, <body occurs 6899 times less often
> than <script on Web content that lacks a Content-Type (or has a bogus
> Content-Type like */*), assuming I did my arithmetic correctly.

OK, that's good to know.

> I'm sympathetic to adding more HTML tags to the list, but I'm not sure
> how far down the tail we should go. In Chrome, we went for 99.999%
> compatibility, which might be a bit far down the tail.

Doesn't seem that way to me, given the number of web pages out there.

> http://src.chromium.org/viewvc/chrome/trunk/src/net/base/mime_sniffer.cc?view=markup

Ah, ok. The relevant Gecko code is <http://hg.mozilla.org/mozilla-central/annotate/9f82199fdb9c/netwerk/streamconv/converters/nsUnknownDecoder.cpp#l477>. I'd probably be fine with trimming that list down a bit, but I'm not quite sure what the downsides of having more tags in it are here.

-Boris
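
For readers following the thread, here is a rough Python sketch of the two ideas discussed above: why a URL's extension is attacker-controlled when extra path segments are accepted, and what checking content against a list of HTML tags might look like. The function names and the short `SNIFFED_TAGS` list are hypothetical and chosen only for illustration; they are not the lists or logic in Chrome's mime_sniffer.cc or Gecko's nsUnknownDecoder.cpp.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical tag list for illustration only; the real Chrome and Gecko
# lists differ and are longer.
SNIFFED_TAGS = (b"<!doctype html", b"<html", b"<head", b"<script", b"<body")


def url_extension(url: str) -> str:
    """Return the extension of the last path segment of the URL.

    This is what an extension-based sniffer would see; the server may
    still route the request to an earlier segment such as download.php.
    """
    path = urlsplit(url).path
    return posixpath.splitext(path)[1]


def looks_like_html(data: bytes, tags=SNIFFED_TAGS) -> bool:
    """Return True if data, after leading whitespace, starts with one of
    the given tags followed by whitespace or '>' (case-insensitive)."""
    head = data.lstrip(b" \t\r\n\x0c").lower()
    for tag in tags:
        if head.startswith(tag):
            # Require a delimiter so "<bodyguard" does not match "<body".
            following = head[len(tag):len(tag) + 1]
            if following in (b" ", b"\t", b"\r", b"\n", b">"):
                return True
    return False
```

With this sketch, `url_extension("http://example.com/download.php/whatever.foo")` yields `".foo"` even though the script that runs is download.php, and dropping `<body` from the tag list means content beginning with `<body>` is no longer treated as HTML, which is the trade-off being weighed in the thread.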
Received on Monday, 12 January 2009 07:54:15 UTC