- From: Michal Zalewski <lcamtuf@coredump.cx>
- Date: Fri, 11 Dec 2009 20:18:46 -0800
Hi folks, So, we were having some internal discussions about the IFRAME sandbox attribute; Adam Barth suggested it would be more productive to bring some of the points I was making on the mailing list instead. I think the attribute is an excellent idea, and close to the dream design we talked about internally for a while. I do have some peripheral concerns, though, and seems like now is the time to bring them up! Starting with two high-level comments: although I understand the simplicity, and hence the appeal, of sandboxed IFRAMEs, I do fear that they will be very hard on web developers - and hence of limited utility. In particular: 1) IFRAME semantics make it exceedingly cumbersome to sandbox short snippets of text, and this task is perhaps the most common and pressing XSS-related challenge. Unless the document is constructed on client side by JavaScript, sites would need to use opaque data: URLs, or put up with a lot of additional HTTP roundtrips, to utilize sandboxed IFRAMEs for this purpose. [ There is also the problem of formatting and positioning IFRAME content, although the seamless attribute would fix this. ] The ability to sandbox SPANs or DIVs using a token-guarded approach (<span sandbox="random_token"></span sandbox="same_token">) is, on the other hand, considerably easier on the developer, and probably has a very similar implementation complexity. 2) Renderers suck dealing with IFRAMEs, and will probably continue to do so for time being. This means that a typical, moderately complex application (say, as a discussion forum or a social site), where hundreds of user-controlled strings may need to be present to display user content - the mechanism would have an unacceptable load time and memory footprint. In fact, people are already coming up with lightweight alternatives with a significant functionality overlap (and different security controls). Microsoft has toStaticHTML(), while a standardized implementation is being discussed here right now in a separate thread. Isn't the benefit of keeping the design slightly simpler (and realistically, limited to relatively few usage scenarios) negated by the fact that alternative solutions to other narrow problems would need to emerge elsewhere? The browser coming with several different script sanitizers with completely different APIs and security controls does not strike me as a desirable outcome (all the flavors of SOP are a testament to this). If the anser is not a strong "no", maybe the token-guarded DIV / SPAN approach is a better alternative? Now, that aside - on a more pragmatic level, I have two extra comments: 1) The utility of the SOP sandboxing behavior outlined in the spec is diminished if we have no way to actually *enforce* that the IFRAMEd resource would only be rendered in such a context. If I am serving user-supplied, unsanitized HTML, it is obviously safe to do <iframe sandbox src="show.cgi?id=1234"></iframe> - but where do we prevent the attacker from calling http://my_site/show.cgi?id=1234 directly, and bypassing the filter? There are two cases where the mechanism still offers some protection: 1.1) If I make IFRAMEd URLs unpredictable with the use of security tokens - but if people were likely to get this right, we wouldn't have XSRF and related issues on the web, 1.2) f I point the IFRAME to a non-same-origin domain - but if I can do this, and work out the non-trivial authentication challenges in such a case, I largely don't need a SOP sandbox to begin with: I can just use <unique_id>.sandboxdomain.com. In fact, many sites I know of do this right now. It strikes me that this mechanism would make a whole lot more sense if supported on HTTP header level, instead: "X-SOP-Sandbox: 1"; in its current shape, it is defensible perhaps if aided by Mozilla's CSP. Otherwise, it's an error-prone detail, and we should at the very least outline why it's very difficult to get it right in the spec. 2) The utility of the "no form submission" mode is limited to certain very specific anti-phishing uses. While this does not invalidate it, it makes it tempting to mention two other modes we discussed internally, and that probably fall into the same bucket: 2.1) The ability to disable loading of external resources (images, scripts, etc) in the sandboxed document. The common usage scenario is when you do not want the displayed document to "phone home" for privacy reasons, for example in a web mail system. 2.2) The ability to disable HTML parsing. On IFRAMEs, this can actually be approximated with the excommunicated <plaintext> tag, or with Content-Type: text/plain / data:text/plain,. On token-guarded SPANs or DIVs, however, it would be pretty damn useful for displaying text content without the need to escape &, <, >, etc. "Pure" security benefit is limited, but as a phishing prevention and display correctness measure, it makes sense. Well, that's it. Hope this does not come off as a complete rant :P Cheers, /mz
Received on Friday, 11 December 2009 20:18:46 UTC