- From: Aryeh Gregor <Simetrical+w3c@gmail.com>
- Date: Sun, 13 Dec 2009 19:14:21 -0500
On Fri, Dec 11, 2009 at 11:18 PM, Michal Zalewski <lcamtuf at coredump.cx> wrote:
> The ability to sandbox SPANs or DIVs using a token-guarded approach
> (<span sandbox="random_token"></span sandbox="same_token">) is, on the
> other hand, considerably easier on the developer, and probably has a
> very similar implementation complexity.

Well, the problem this random token thing is trying to address is that the untrusted content could just close the tag. (I fondly remember my days on Geocities, when we would add <noscript><noscript> to the end of our pages to try to get rid of the auto-injected ads.) But it's kind of hacky and might be prone to failure, and the syntax is really unpleasant (especially for XML compatibility).

So instead, why not just use the standard escaping mechanisms we already have? Allow a sandbox attribute on all elements that can contain phrasing or flow content. Any such element with a sandbox attribute will be required to contain no literal <>'" before the closing tag. If any of those four characters is encountered, the element is treated as having no contents. Otherwise, the browser unescapes all characters with special meanings ("&lt;" -> "<", "&gt;" -> ">", "&amp;" -> "&", etc.) and then treats the resulting string as the inner HTML of the element, parsing it like regular HTML, but the contents are sandboxed.

Examples:

<span sandbox>This span will work normally, except for being sandboxed.</span>

<span sandbox>This span will be <em>empty</em> in the DOM, even though it contains no evil content, because otherwise authors will forget to escape the contents of the sandbox.</span>

<span sandbox>&lt;span&gt;But this span will have another span as its child, sandboxed. The regular parser sees no entities here, only a nested span!&lt;/span&gt;</span>

<span sandbox>It would be safe to allow this to work, since it only contains an apostrophe, but let's not, so that lack of escaping is easier to catch. This span is therefore also empty.</span>

I think this is easier to use than having to generate a random token, and also more secure. If your code isn't escaping things right, you'll quickly notice when your blog comments all vanish.

This is even backward-compatible, in a certain sense. <jail> would be unsafe to serve with untrusted contents until all UAs reliably support it. This would be perfectly safe in all browsers, it would just display poorly in old browsers if there's any HTML markup in the content.

What do people think of this syntax?
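
To make the rule concrete, here is a minimal Python sketch of it. It is an illustration only, not part of any specification or browser. The function names sandbox_contents and escape_for_sandbox are hypothetical, the standard library's html.unescape stands in for whatever unescaping a UA would really do, and the sketch assumes the raw text between the opening and closing tags is already available as a string.

```python
import html

# The four literal characters that the proposal forbids inside a sandboxed element.
FORBIDDEN = set("<>'\"")


def sandbox_contents(raw: str) -> str:
    """UA side (sketch): return the string that would be parsed as the
    element's inner HTML, or "" if the element must be treated as empty."""
    # Any literal <, >, ' or " means the author forgot to escape something,
    # so the element is treated as having no contents at all.
    if any(ch in FORBIDDEN for ch in raw):
        return ""
    # Otherwise unescape the character references; the result would then be
    # parsed as regular HTML inside the sandbox (parsing is not modelled here).
    return html.unescape(raw)


def escape_for_sandbox(untrusted: str) -> str:
    """Author side (sketch): escape untrusted markup before placing it
    inside a sandboxed element. quote=True also escapes ' and "."""
    return html.escape(untrusted, quote=True)


if __name__ == "__main__":
    comment = "<b>it's</b> a comment"
    escaped = escape_for_sandbox(comment)
    print(escaped)                    # contains no literal <, >, ' or "
    print(sandbox_contents(escaped))  # round-trips to the original markup
    print(repr(sandbox_contents(comment)))  # unescaped input is dropped
```

Under these assumptions, an author who runs untrusted input through the escaping helper always gets it back intact inside the sandbox, while one who forgets sees the content vanish, which is the failure mode the proposal relies on to make missing escaping easy to notice.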
Received on Sunday, 13 December 2009 16:14:21 UTC