Re: In-browser sanitization first, "Safe Node" later?

From: Jim Manico <jim.manico@owasp.org>
Date: Mon, 8 Feb 2016 09:34:53 -1000
To: Chris Palmer <palmer@google.com>, Craig Francis <craig.francis@gmail.com>
Cc: Frederik Braun <fbraun@mozilla.com>, public-webappsec@w3.org
Message-ID: <56B8EDDD.3080308@owasp.org>
Chris,

Traditionally, folks who tried to pull off a "simple" purify function 
that "cleans" untrusted HTML without any configuration have not been 
very successful. Take the .NET getSafeHTML function. The first 
generations were easily bypassable. The "fix" was to make it so 
restrictive that it became super safe but unusable. It's now deprecated.

I think a first pass at this that lets developers somehow define the 
policy of what markup is accepted will be necessary to make it usable.
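
To make "developer-defined policy" concrete, here is a minimal sketch, 
not a proposal for the actual API. It assumes the browser has already 
parsed the markup; plain objects stand in for DOM nodes so that the 
example runs without a browser. The names examplePolicy and applyPolicy 
and the policy shape are all made up for illustration.

```javascript
// Hypothetical policy: allowed element names mapped to their allowed
// attribute names, plus elements that are removed together with their content.
const examplePolicy = {
  allow: new Map([
    ["p", []],
    ["b", []],
    ["i", []],
    ["a", ["href", "rel"]],
  ]),
  dropWithContent: new Set(["script", "style"]),
};

// A node is either a string (text) or { tag, attrs, children }.
// Returns an array of nodes that conform to the policy.
function applyPolicy(node, policy) {
  if (typeof node === "string") {
    return [node]; // text is kept as-is; escaping is a serialization concern
  }
  const tag = node.tag.toLowerCase();
  if (policy.dropWithContent.has(tag)) {
    return []; // remove the element and everything inside it
  }
  const children = (node.children || []).flatMap((child) =>
    applyPolicy(child, policy)
  );
  if (!policy.allow.has(tag)) {
    return children; // unknown element: unwrap it, keep its filtered children
  }
  const allowedAttrs = policy.allow.get(tag);
  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const lower = name.toLowerCase();
    if (allowedAttrs.includes(lower)) {
      attrs[lower] = value; // attribute names are filtered; values are not checked here
    }
  }
  return [{ tag, attrs, children }];
}
```

This sketch only filters element and attribute names. Attribute values 
(URL schemes in href, for example) would need checks of their own.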

But if someone can really build the magic purify function that turns any 
garbage HTML into something safe, delinted and renders as much of the 
safe content as possible, then I would be all for it.

UsableButUnsafePurify -> ConfigurablePurify -> SafeButRestrictivePurify

There is a lot of work to date on allowing developers to define a 
specific policy for acceptable markup for server-side HTML sanitizers. 
If you took the best from that area of work, it could be built in a 
rather usable way.
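
One thing server-side policies usually cover beyond element names is 
which URL schemes a link may use, which is the href="javascript:..." 
case Craig raises below. A rough sketch of such a check, assuming the 
standard URL parser is available; the function name isSafeHref, the 
scheme list and the placeholder base URL are my assumptions, not part 
of any existing sanitizer:

```javascript
// Hypothetical allowlist of URL schemes a policy might accept for <a href>.
// URL.protocol includes the trailing colon and is always lowercase.
const ALLOWED_SCHEMES = new Set(["http:", "https:", "mailto:"]);

// Returns true only if href resolves to a URL with an allowed scheme.
// The base URL is a placeholder so that relative links can be resolved;
// a real caller would pass the document's own base URL.
function isSafeHref(href, base = "https://example.invalid/") {
  let url;
  try {
    // Parsing with the URL parser, rather than matching on the raw string,
    // means mixed case and embedded tabs or newlines are normalized first.
    url = new URL(href, base);
  } catch (err) {
    return false; // unparseable values are rejected
  }
  return ALLOWED_SCHEMES.has(url.protocol);
}
```

The point of the allowlist is that anything not explicitly accepted 
(javascript:, data:, and whatever scheme comes next) is rejected by 
default.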

I think SafeNode is a very respectable idea and should be implemented, 
but I feel defining an even more restrictive developer-defined custom 
policy per-app in the browser would be more secure long term.

Aloha,
Jim



On 2/8/16 8:27 AM, Chris Palmer wrote:
>
> I also prefer The Simplest Thing That Could Possibly Work. To me that 
> would seem to be the string in/string out interface, or a string 
> in/tree of DOM Nodes interface (then the caller could do something 
> like: e.appendChild(purify(bad_string)) ). Or both.
>
> On Feb 8, 2016 1:37 AM, "Craig Francis" <craig.francis@gmail.com> wrote:
>
>     As a web developer who frequently has to sanitise HTML (more so
>     server side), I would still like to see this.
>
>     But creating a safe node list will be difficult.
>
>     Take <a> as an example: imagine a forum with a WYSIWYG editor (/me
>     shudders)... some forums won't like links at all (SEO spamming),
>     some may consider them safe if they have rel=nofollow... but many
>     will forget href="javascript:...", which is a valid attribute
>     on a valid node, yet a click on it can cause inline
>     JavaScript to run (assuming no CSP that blocks inline script).
>
>     If you know how to solve this (both as a Sanitiser or under a Safe
>     Node), then I'll be very happy.
>
>     Craig
>
>
>
>     > On 8 Feb 2016, at 08:48, Frederik Braun <fbraun@mozilla.com> wrote:
>     >
>     > Hi,
>     >
>     > I think there is a need for a client-side HTML/XSS sanitization
>     > mechanism that lives in the browser (i.e., where the parser is).
>     > AFAIU, previous discussion has shown that there are no strong
>     > objections to this, but feel free to look at the previous thread [1]
>     > or Mario Heiderich's presentation from Usenix Enigma [2] for further
>     > reading.
>     >
>     > I think that a first version of this spec should be a JavaScript API
>     > that consumes a string of potentially dangerous markup and returns a
>     > string that is clean.
>     >
>     > A Safe Node is certainly more interesting, but I'm afraid that we
>     > (the working group) are sometimes too detached from the needs of a
>     > modern web application and that we should start with providing
>     > something useful *soon*.
>     > As we have seen with CSP, it's always harder to retrofit a new
>     > security system to an existing architecture. But the "String In -
>     > String Out" approach will certainly fit into every app. We can still
>     > do the Safe Node in a follow-up, if the initial feedback is good.
>     >
>     > Another outcome of the reduced first version would be a public,
>     > vetted and testable whitelist of safe DOM Nodes. This is useful for
>     > all existing custom sanitizers and is a positive outcome of its own
>     > [3].
>     >
>     > I expect that this first version will be easy to implement, given
>     > that existing browsers already use this internally, albeit not
>     > exposed to web content.
>     >
>     > In the long run, attackers might race towards finding and abusing
>     > parser bugs and more quirks like those which Mario has called mXSS
>     > (mutation XSS) [4]. This is good, as it will guide us to what a Safe
>     > Node will need and prove that we have indeed raised the bar beyond
>     > trivial XSS exploits.
>     >
>     > Thoughts?
>     >
>     >
>     > Cheers,
>     > Frederik
>     >
>     >
>     > [1] For the initial thread "In-browser sanitization vs. a “Safe Node”
>     > in the DOM" see
>     > https://lists.w3.org/Archives/Public/public-webappsec/2016Jan/thread.html
>     >
>     > [2] Link to his slides and the abstract at
>     > https://www.usenix.org/conference/enigma2016/conference-program/presentation/heiderich
>     >
>     > [3] Obsolete whitelist at WHATWG wiki:
>     > https://wiki.whatwg.org/wiki/Sanitization_rules
>     >
>     > [4] "mXSS Attacks: Attacking well-secured Web-Applications
>     > by using innerHTML Mutations", see https://cure53.de/fp170.pdf
>     >
>
Received on Monday, 8 February 2016 19:35:29 UTC
