
Re: Sanatising HTML content through sandboxing

From: Adam Barth <w3c@adambarth.com>
Date: Tue, 8 Nov 2011 23:54:00 -0800
Message-ID: <CAJE5ia8d+UJjBsc5Y=e7Y17SjYty=YdigSr7z-7LDuzVzCd17g@mail.gmail.com>
To: Jonas Sicking <jonas@sicking.cc>
Cc: Ryan Seddon <seddon.ryan@gmail.com>, public-webapps <public-webapps@w3.org>
Also, a div doesn't represent a security boundary.  It's difficult to
sandbox something unless you have a security boundary around it.
IMHO, an easy way to solve this problem is to just expose an
HTMLParser object, analogous to DOMParser, which folks can use to
safely parse HTML, e.g., from XMLHttpRequest.
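
A minimal sketch of the idea, not a definitive implementation. The
HTMLParser object proposed above doesn't exist, so this assumes
DOMParser's "text/html" mode as a stand-in: the document it returns is
detached, and scripts in it don't run while it stays that way. The names
parseInert, stripActiveContent and BLOCKED_TAGS are made up for
illustration, and the stripping step is deliberately incomplete (it
ignores javascript: URLs, for instance).

```javascript
// Tags dropped outright in this sketch; illustrative, not exhaustive.
const BLOCKED_TAGS = new Set(["SCRIPT", "IFRAME", "OBJECT", "EMBED"]);

// Parse markup into a detached document. DOMParser stands in for the
// hypothetical HTMLParser; the Parser argument exists only so the
// function can be exercised outside a browser.
function parseInert(html, Parser = globalThis.DOMParser) {
  return new Parser().parseFromString(html, "text/html");
}

// Recursively remove blocked elements and on* event-handler attributes
// from the descendants of `root`.
function stripActiveContent(root) {
  // Copy the collections first, because removal mutates them.
  for (const el of Array.from(root.children)) {
    if (BLOCKED_TAGS.has(el.tagName.toUpperCase())) {
      el.remove();
      continue;
    }
    for (const attr of Array.from(el.attributes)) {
      if (attr.name.toLowerCase().startsWith("on")) {
        el.removeAttribute(attr.name);
      }
    }
    stripActiveContent(el);
  }
  return root;
}

// Browser usage (assumes an `xhr` whose response has arrived):
//   const doc = parseInert(xhr.responseText);
//   stripActiveContent(doc.body);
//   for (const node of Array.from(doc.body.childNodes)) {
//     document.body.appendChild(document.importNode(node, true));
//   }
```

Stripping happens on the detached document, before anything is copied
into the page, so the live document never sees the original markup.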


On Tue, Nov 8, 2011 at 11:28 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> Given that this type of sandbox would work very differently from the
> iframe sandbox, I think reusing the same attribute name would be
> confusing.
> Additionally, what's the behavior if you remove the attribute? What if
> you do elem.innerHTML += "foo" on the element after having removed the
> sandbox? Or on an element's parent?
> Or what happens if you do foo.innerHTML = bar.innerHTML where a parent
> of bar has sandbox set?
> When sanitizing, I strongly feel that we should simply remove all
> content that could execute script so as to ensure that it doesn't leak
> somewhere else when markup is copied. Trying to ensure that it never
> executes, while still allowing it to exist, is too high risk IMO.
> / Jonas
> On Tue, Nov 8, 2011 at 5:21 PM, Ryan Seddon <seddon.ryan@gmail.com> wrote:
>> Right now there is no simple way to sanitise HTML content by stripping it of
>> any potentially malicious HTML such as scripts etc.
>> In the "innerHTML in DocumentFragment" thread I suggested following the
>> sandbox attribute approach that can be applied to iframes. I've moved this
>> out into its own thread, as Jonas suggested, so as not to dilute the
>> innerHTML discussion.
>> There was mention of a suggested API called innerStaticHTML as a potential
>> solution to this; I personally would prefer to reuse the sandbox approach
>> that the iframes use.
>> e.g.
>> xhr.responseText = "<script src='malicious.js'></script><div><h1>content</h1></div>";
>> var div = document.createElement("div");
>> div.sandbox = ""; // Static content only
>> div.innerHTML = xhr.responseText;
>> document.body.appendChild(div);
>> This could also apply to a documentFragment and any other applicable DOM
>> APIs; being able to let the HTML parser do what it does best would make
>> sense.
>> The advantage of this over a new API is that it would also allow the use of
>> the space separated tokens to white list certain things within the HTML
>> being parsed into the document and open it to future extension.
>> -Ryan
Received on Wednesday, 9 November 2011 07:55:03 UTC
