Sanitising HTML content through sandboxing from Ryan Seddon on 2011-11-09 (public-webapps@w3.org from October to December 2011)

From: Ryan Seddon <seddon.ryan@gmail.com>
Date: Wed, 9 Nov 2011 12:21:55 +1100
To: public-webapps <public-webapps@w3.org>
Cc: Jonas Sicking <jonas@sicking.cc>
Message-ID: <CADsa-VdCVRyu1Kjs95DheFOic+2pwE+mhGst3HCdfF-BC5N63w@mail.gmail.com>

Right now there is no simple way to sanitise HTML content by stripping out
potentially malicious markup such as scripts.

In the "innerHTML in DocumentFragment" thread I suggested following the
sandbox attribute approach that can be applied to iframes. I've moved this
out into its own thread, as Jonas suggested, so as not to dilute the
innerHTML discussion.

There was mention of a suggested API called *innerStaticHTML* as a
potential solution to this. I personally would prefer to reuse the sandbox
approach that iframes use.

e.g.

xhr.responseText = "<script src='malicious.js'></script><div><h1>content</h1></div>";

var div = document.createElement("div");

div.sandbox = ""; // Static content only
div.innerHTML = xhr.responseText;

document.body.appendChild(div);
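
For comparison, the existing iframe mechanism that this borrows from can be
sketched roughly as follows. This is only a sketch: createSandboxedFrame is
a hypothetical helper name, and it assumes a browser that supports both the
sandbox and srcdoc attributes on iframes. An empty sandbox value applies
every restriction, so scripts in the content do not run.

```javascript
// Hypothetical helper: wraps untrusted HTML in a fully sandboxed iframe.
// `doc` is the Document used to create the element (normally `document`).
function createSandboxedFrame(doc, html, tokens) {
  var frame = doc.createElement("iframe");
  // An empty sandbox value applies all restrictions (no scripts, no forms, ...).
  frame.setAttribute("sandbox", tokens || "");
  // srcdoc supplies the frame's content inline instead of via a URL.
  frame.setAttribute("srcdoc", html);
  return frame;
}

// In a browser:
// document.body.appendChild(createSandboxedFrame(document, xhr.responseText));
```

With this route the content ends up in a separate frame rather than inline
in the page, which is the gap a sandbox property on ordinary elements is
meant to fill.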

The sandbox property could also apply to a DocumentFragment and any other
applicable DOM APIs. Letting the HTML parser do what it does best would
make sense.

The advantage of this over a new API is that it would also allow the use of
space-separated tokens to whitelist certain things within the HTML being
parsed into the document, and it leaves the door open to future extension.
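
As a rough sketch of how such a token list might be interpreted: the allow-*
names below are among those the iframe sandbox attribute already defines,
while parseSandboxTokens and the idea of applying the result to a div are
assumptions made purely for illustration, not an existing API.

```javascript
// Some of the tokens defined for the iframe sandbox attribute.
var KNOWN_TOKENS = [
  "allow-same-origin",
  "allow-forms",
  "allow-scripts",
  "allow-top-navigation"
];

// Hypothetical helper: turn a sandbox value into a map of permitted features.
// Unknown tokens are ignored, which leaves room for future extension.
function parseSandboxTokens(value) {
  var allowed = {};
  var tokens = value.split(/\s+/);
  for (var i = 0; i < tokens.length; i++) {
    var token = tokens[i].toLowerCase();
    if (KNOWN_TOKENS.indexOf(token) !== -1) {
      allowed[token] = true;
    }
  }
  return allowed;
}

parseSandboxTokens("");                          // nothing allowed: static content only
parseSandboxTokens("allow-forms allow-scripts"); // forms and scripts permitted
```

An empty value whitelists nothing, matching the "static content only" case
in the example above; each token added relaxes one restriction.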

-Ryan

Received on Wednesday, 9 November 2011 01:22:50 UTC