CSP/innerHTML/JS Sandbox from Carson, Cory on 2013-05-08 (public-webappsec@w3.org from May 2013)

From: Carson, Cory <Cory.Carson@boeing.com>
Date: Tue, 7 May 2013 18:59:46 -0700
To: "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <D4054D6F1BC77B409DCE63781532EE8550A21CCE09@XCH-NW-21V.nw.nos.boeing.com>
To continue the discussion around CSP/innerHTML/JS Sandbox, there are several distinct features in that topic. As a starting point, I attempt to describe two of them below. For simplicity, I informally mix together proposal, rationale for living in the webappsec charter, use case, and loose comments.



------Proposal "script-src 'unsafe-inject'"------
This essentially extends CSP to cover all raw HTML injection points, including innerHTML, document.write, etc.

I argue that this can live in the webappsec charter under "Attack Surface Reduction", and as a natural extension of 'unsafe-eval'. It would live within the CSP deliverable.
The use case here follows the use case for script-src 'unsafe-eval': categorically blocking a common source of DOM-based XSS flaws.

Would trip on script akin to:
	<script>
	'use strict';
	document.createElement('div').innerHTML = '...HTML as javascript string';
	</script>
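
As a rough sketch only, the decision a user agent would make before allowing such an assignment might look like the following. The keyword 'unsafe-inject' is the one proposed here and is not part of any shipped CSP. The function names parsePolicy and allowsRawHtmlInjection are hypothetical. The fallback from script-src to default-src, and the rule that a policy with neither directive restricts nothing, are assumptions copied from how 'unsafe-eval' behaves.

```javascript
'use strict';

// Parse a CSP header value into a Map of directive name -> source expressions.
function parsePolicy(headerValue) {
  const directives = new Map();
  for (const part of headerValue.split(';')) {
    const tokens = part.trim().split(/\s+/).filter(Boolean);
    if (tokens.length === 0) continue;
    const name = tokens[0].toLowerCase();
    // Keep only the first occurrence of a repeated directive.
    if (!directives.has(name)) directives.set(name, tokens.slice(1));
  }
  return directives;
}

// Hypothetical check: would innerHTML / document.write be permitted?
// ASSUMPTION: 'unsafe-inject' is handled the same way as 'unsafe-eval'.
function allowsRawHtmlInjection(headerValue) {
  const directives = parsePolicy(headerValue);
  const sources = directives.get('script-src') || directives.get('default-src');
  if (sources === undefined) return true; // no governing directive, nothing restricted
  return sources.some((s) => s.toLowerCase() === "'unsafe-inject'");
}

console.log(allowsRawHtmlInjection("script-src 'self'"));
console.log(allowsRawHtmlInjection("script-src 'self' 'unsafe-inject'"));
```

Under these assumptions, a site that already sends script-src without the new keyword would block innerHTML by default, which is the same opt-out model as eval.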

The "I know what I'm doing" escape perhaps notionally looks like below:
	<script>
	'use strict';
	//This proposal only
	document.createElement('div').appendChild(
	    (new DOMParser())
	    .parseFromString(
	        '<!DOCTYPE html><html><head></head><body>' +
	
	        '...HTML as javascript string... ' +
	
	        '</body></html>',
	        'text/html'
	    )
	    .body
	);
	</script>
The above does not preserve the original semantics exactly. It mitigates some basic (but not all) script injections by not executing script while parsing. It also mitigates some parser state abuses, that is, abusing the HTML5 parsing rules for invalid HTML (such as orphan closing tags and implicitly closeable li tags) to navigate the DOM tree and inject static content elsewhere.

If the second proposal also makes the cut, there must also be a clean "I know what I'm doing" escape using features of the second proposal. Example:
	<script>
	'use strict';
	document.createElement('div').innerHTML = '...HTML as javascript string';  //Trips CSP script-src 'unsafe-inject'
	document.createElement('div').innerSafeHTML = '...HTML as javascript string'; //Does not trip CSP script-src 'unsafe-inject'
	</script>

Microsoft has a proprietary implementation of this kind of thing, described at http://msdn.microsoft.com/en-us/library/ie/hh465388.aspx
Eric Chen started a vendor-neutral implementation of this kind of thing, described at https://code.google.com/p/dominatrixss-csp/


------Proposal "toStaticHTML / innerSafeHTML / lightweight sandbox / 'text/html-sandboxed' / 'Safe' HTML Whitelisting"------
There is a need for someone - anyone - to standardize this. It has previously been discussed in at least the places below:
http://mail.tools.ietf.org/html/draft-hodges-websec-framework-reqs-02
http://lists.w3.org/Archives/Public/public-web-security/2011Jan/0055.html
http://lists.w3.org/Archives/Public/public-whatwg-archive/2010Jan/0172.html
http://lists.w3.org/Archives/Public/public-whatwg-archive/2009Dec/0258.html

I argue that this can live in the webappsec charter under "Secure Mashups", "Sub-Resource Integrity", and the intent of UISafety. I don't know which existing deliverable this would live in.
Use cases are many; one that I wish to call out is the 'styled user content' problem. Some of the use cases for seamless sandboxed iframes apply here.

The core feature is a generic HTML whitelist - elements, attributes, and other features - where the renderer handles the result as if it were part of the original document, for performance purposes. Simply stripping JavaScript is not enough (see Mario Heiderich's talk at http://channel9.msdn.com/Events/Blue-Hat-Security-Briefings/BlueHat-Security-Briefings-Fall-2012-Sessions/BH1203). Stripping all style is too limiting. Perhaps this feature can dovetail into an inverse of UISafety's input-protection-clip feature.
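
To make the whitelist idea concrete, here is a minimal sketch over a plain object tree rather than a real DOM. The node shape ({tag, attrs, children}, with strings as text), the name filterNode, and the contents of EXAMPLE_WHITELIST are all hypothetical and chosen only for illustration. It drops non-whitelisted elements together with their children, and it does not attempt the style or URL filtering that a real implementation would need.

```javascript
'use strict';

// Hypothetical whitelist: element name -> allowed attribute names.
const EXAMPLE_WHITELIST = {
  p: [],
  b: [],
  a: ['title'],
  span: ['class'],
};

// Returns a filtered copy of the node, or null if the element is not whitelisted.
function filterNode(node, whitelist) {
  if (typeof node === 'string') return node; // text passes through unchanged
  const tag = node.tag.toLowerCase();
  // Own-property check so names like "constructor" are never treated as allowed.
  if (!Object.prototype.hasOwnProperty.call(whitelist, tag)) return null;
  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const lower = name.toLowerCase();
    if (whitelist[tag].includes(lower)) attrs[lower] = value;
  }
  const children = (node.children || [])
    .map((child) => filterNode(child, whitelist))
    .filter((child) => child !== null);
  return { tag, attrs, children };
}

const untrusted = {
  tag: 'p',
  attrs: { onclick: 'x()' },
  children: [
    'hi',
    { tag: 'script', children: ['alert(1)'] },
    { tag: 'span', attrs: { class: 'c', style: 'x' }, children: ['ok'] },
  ],
};
console.log(JSON.stringify(filterNode(untrusted, EXAMPLE_WHITELIST)));
```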

toStaticHTML/innerSafeHTML/(new DOMParser()).parseFromString(string, 'text/html-sandboxed') exposes the feature to JavaScript. Lightweight sandbox exposes the feature to HTML, perhaps piggybacking the seamless attribute: <iframe sandbox seamless="lightweight" srcdoc="...html attribute encoded untrusted html..." />
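
For the HTML-facing form, the "html attribute encoded" step can be sketched as below. The helper names encodeForSrcdoc and sandboxedIframe are hypothetical, and seamless="lightweight" is only the notional syntax from this message, not an existing attribute value. The sketch assumes the srcdoc value is always wrapped in double quotes, in which case escaping ampersands and double quotes is sufficient.

```javascript
'use strict';

// Encode untrusted HTML for use inside a double-quoted srcdoc attribute.
// The ampersand must be escaped first so later escapes are not double-encoded.
function encodeForSrcdoc(untrustedHtml) {
  return String(untrustedHtml)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;');
}

// Build the notional lightweight-sandbox iframe markup from this proposal.
// ASSUMPTION: seamless="lightweight" is hypothetical syntax.
function sandboxedIframe(untrustedHtml) {
  return '<iframe sandbox seamless="lightweight" srcdoc="' +
    encodeForSrcdoc(untrustedHtml) +
    '"></iframe>';
}

console.log(sandboxedIframe('<b title="a&b">user content</b>'));
```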

Eric Chen started a vendor-neutral implementation of this kind of thing, described at https://code.google.com/p/dominatrixss-csp/
CKEditor 4.1 introduced the 'Advanced Content Filter' feature, which offers almost-but-not-quite the same thing as toStaticHTML. Its docs say it is not a security feature, because it is client-side only, not server-side. Described at http://docs.ckeditor.com/#!/guide/dev_advanced_content_filter
JSReg also implements this kind of thing, and credits several of you listening. Described at https://code.google.com/p/jsreg/
ES6 quasi-literals may allow an easy way to work with the resulting feature, or perhaps allow a better way to expose the feature to JavaScript. Described at http://wiki.ecmascript.org/doku.php?id=harmony:quasis#secure_content_generation
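
As an illustration of the quasi-literal idea, using the template literal syntax that ES6 eventually shipped, a tag function can escape every interpolated value before it reaches the markup. The names escapeHtml and safeHtml are hypothetical, and this sketch escapes uniformly rather than by context, so it only suits values placed in text content or quoted attribute values.

```javascript
'use strict';

// Escape a value for HTML text content or a quoted attribute value.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Tag function: literal parts are trusted, interpolated values are escaped.
function safeHtml(strings, ...values) {
  let out = strings[0];
  for (let i = 0; i < values.length; i++) {
    out += escapeHtml(values[i]) + strings[i + 1];
  }
  return out;
}

const userName = '<img src=x onerror=alert(1)>';
const markup = safeHtml`<p>Hello, ${userName}!</p>`;
console.log(markup);
```

The literal portions written by the page author pass through untouched, while anything interpolated is treated as untrusted.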
Received on Wednesday, 8 May 2013 02:00:16 UTC