- From: Hanno Böck <hanno@hboeck.de>
- Date: Fri, 11 Jan 2019 08:38:39 +0100
- To: www-tag@w3.org
Hello,

I recently looked into a security issue that arises relatively easily from the way web servers and web applications are designed. I have multiple practical instances in popular web applications (WordPress, Joomla, Mailman).

Now one way to avoid this issue would be to let web servers send a default content-type / mimetype. However, the W3C Authoritative Metadata standard explicitly says this shall not happen.

The problem is like this:

* A web application allows uploading any kind of "unusual" file type that is not part of the server's mime.types. (The mime.types file is in no way standardized and differs significantly between distros, so there can be practically no expectation of what exactly it contains.) Let's use a fictional file format .aaa as an example.
* The web server will either guess the content type on its own or send the file without a content type, in which case the browser will guess the content type. Both are bad. (Some web browsers, notably Edge and Firefox, will guess the content type even when the "X-Content-Type-Options: nosniff" header is sent, because that header was originally designed only for .js and .css files and thus won't prevent HTML sniffing.)
* An attacker can upload a file example.aaa that contains HTML code and JavaScript.
* Calling that file will execute the JavaScript: you have an XSS.

This is a tricky issue to avoid. A web application like WordPress can hardly do anything about it (except maybe not allowing uploads of any "unusual" file types, but as written above, this is hard to define).

Now the safest way for a server to prevent this would be:

a) don't guess content types.
b) send a "safe" content type (e.g. application/octet-stream) for each file whose extension cannot be mapped to a type via mime.types.

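To make (a) and (b) concrete, here is a minimal sketch in Python of how a server might choose the header. It is only an illustration under stated assumptions: the `KNOWN_TYPES` table and the names `content_type_for` and `response_headers` are hypothetical and stand in for whatever mapping a real server builds from its mime.types. It is not taken from Apache, Nginx, or any other server.

```python
import os

# Hypothetical stand-in for the server's mime.types mapping (assumption:
# a real server would load this from its own configuration).
KNOWN_TYPES = {
    ".html": "text/html",
    ".txt": "text/plain",
    ".css": "text/css",
    ".js": "text/javascript",
    ".png": "image/png",
    ".jpg": "image/jpeg",
}

# The "safe" fallback suggested in (b).
SAFE_DEFAULT = "application/octet-stream"


def content_type_for(filename: str) -> str:
    """Pick a content type from the extension only; never inspect the body (a)."""
    extension = os.path.splitext(filename)[1].lower()
    # Unknown or missing extensions fall back to the safe default (b).
    return KNOWN_TYPES.get(extension, SAFE_DEFAULT)


def response_headers(filename: str) -> dict:
    """Always send an explicit Content-Type so the browser has nothing to guess."""
    return {"Content-Type": content_type_for(filename)}


if __name__ == "__main__":
    for name in ("index.html", "example.aaa", "README"):
        print(name, response_headers(name))
```

With this approach the uploaded example.aaa is served as application/octet-stream, so a browser that honours the header would offer it for download rather than render it as HTML.
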
Now the Authoritative Metadata standard [1] says this SHOULD NOT happen:

"Good Practice: Server software designers (implementers) SHOULD NOT specify default representation metadata, such as media type, character encoding, or content language, within the standard configuration shipped with the server. Instead of specifying a default for metadata, it is better for representations to be sent without that metadata. That allows the recipient to guess the metadata instead of being forced to either accept incorrect metadata or be tempted to violate Web architecture by ignoring it."

In Apache this even led to the complete removal of that option, which goes a step further (not just leaving it off by default, but actively removing any way for users to enable it). In contrast, Nginx sends a default content type.

I don't see any good justification for that recommendation. The standard says "That allows the recipient to guess the metadata instead of being forced to either accept incorrect metadata or be tempted to violate Web architecture by ignoring it." But that's not a good thing: it's a security risk. The whole document doesn't mention XSS or Cross Site Scripting, so I wonder if this has been considered in any way.

I'm writing to you to hopefully better understand why that decision was made and what the reasons were. As far as I can see, there's a security flaw, and one of the most obvious and likely most robust fixes is forbidden by a standard without any good justification. I think the standard should be changed.

[1] https://www.w3.org/2001/tag/doc/mime-respect#reducing-inconsistency

-- 
Hanno Böck
https://hboeck.de/

mail/jabber: hanno@hboeck.de
GPG: FE73757FA60E4E21B937579FA5880072BBB51E42
Received on Friday, 11 January 2019 07:46:45 UTC