W3C home > Mailing lists > Public > public-webappsec@w3.org > January 2016

Re: In-browser sanitization vs. a “Safe Node” in the DOM

From: Jim Manico <jim.manico@owasp.org>
Date: Fri, 22 Jan 2016 17:11:15 -0500
To: David Ross <drx@google.com>, Chris Palmer <palmer@google.com>, Crispin Cowan <crispin@microsoft.com>
Cc: Craig Francis <craig.francis@gmail.com>, Conrad Irwin <conrad.irwin@gmail.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <56A2A903.8070503@owasp.org>
The need to inject untrusted markup into the DOM comes up all the time 
and is critical (WYSIWYG editors, etc.). But any "safe node" that limits 
what can render and execute will limit innovation. Each developer needs 
to support a different markup subset for their app, which is why 
policy-based sanitization is so critical to this use case.

Take a look at Caja's HTML sanitizer, Angular's $sanitize, and other 
JS-centric HTML sanitizers. They all allow the developer to set a policy 
of what tags and attributes should be supported, and all other markup 
gets stripped out.
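The policy-driven shape those sanitizers share can be sketched as follows. This is a toy illustration only: it uses a naive regex tokenizer, which is NOT a safe way to process HTML (real sanitizers like Caja's parse the markup properly). The point is the API shape: the app supplies the policy, the sanitizer strips everything else.

```javascript
// Toy policy-based sanitizer: the caller supplies an allowlist of tags
// and, per tag, the attributes to keep; all other markup is stripped.
// Illustrative only -- a regex is not a safe way to parse real HTML.
function sanitize(html, policy) {
  return html.replace(/<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g, (tag, name) => {
    const allowedAttrs = policy.tags[name.toLowerCase()];
    if (allowedAttrs === undefined) return '';          // tag not in policy
    if (tag.startsWith('</')) return '</' + name.toLowerCase() + '>';
    // Keep only the attributes the policy lists for this tag.
    const kept = [];
    const attrRe = /([a-zA-Z-]+)\s*=\s*"([^"]*)"/g;
    let m;
    while ((m = attrRe.exec(tag)) !== null) {
      if (allowedAttrs.includes(m[1].toLowerCase())) {
        kept.push(m[1].toLowerCase() + '="' + m[2] + '"');
      }
    }
    return '<' + name.toLowerCase() + (kept.length ? ' ' + kept.join(' ') : '') + '>';
  });
}

// Each app declares its own markup subset:
const policy = { tags: { b: [], i: [], a: ['href'] } };
const dirty = '<b onclick="evil()">hi</b><img src="x" onerror="evil()"><a href="/x">link</a>';
console.log(sanitize(dirty, policy)); // <b>hi</b><a href="/x">link</a>
```

Note how the event-handler attributes and the disallowed <img> vanish without the sanitizer having to enumerate every dangerous construct up front — the developer's policy is the single source of truth.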

This is the kind of native defensive pattern we need in JavaScript, IMO!

Aloha,
Jim

On 1/22/16 5:03 PM, David Ross wrote:
>> How is CSP not sufficient?
> CSP operates on a per-page basis.  Here's the canonical use case for
> sanitization (and also Safe Node): Fetch bits of markup via XHR and
> just plop them into the existing DOM in various places, safely.
>
> I started with the assumption that client-side sanitization is coming
> to the browser.  This is obviously not a given, but discussion about
> the possibility is what initiated my train of thought.  The Safe Node
> proposal attempts to achieve the same result but in a way that I argue
> has certain advantages over client-side sanitization baked into the
> browser.
>
> Dave
>
> On Fri, Jan 22, 2016 at 1:53 PM, David Ross <drx@google.com> wrote:
>>> How, exactly, can we compute whether a given string is an anti-CSRF
>>> defense token, and how, exactly, can we compute if CSS just leaked
>>> it to the attacker? I don't immediately see how that security guarantee
>>> is computable.
>> We should probably talk about a specific version of the attack to make
>> sure we're on the same page.  (Maybe this one:
>> http://html5sec.org/webkit/test)
>>
>> I think that if you're only detecting what you're describing above,
>> it's too late.  My expectation is that CSS defined within the Safe
>> Node would have no effect on the DOM outside of the Safe Node.  Is
>> there some reason why this is not possible to implement or that it
>> would be ineffective at addressing the issue?
>>
>>> Basically, my argument is that the con — that a general purpose
>>> filtering HTML parser would be useful — is huge, and also
>>> sufficiently covers your intended goal (duct-taping an anti-pattern).
>>> Thus, if we do anything, we should do that.
>> Mmm, I lost you here...  How is that a con?  It sounds like just an
>> assertion, and one that I wouldn't argue with.  And my intended goal
>> is not to get rid of innerHTML, but I'm happy to help by removing
>> innerHTML from the design pattern I originally suggested.
>>
>>> I would rather deprecate innerHTML, yes. But at least I can easily grep for "assigns to innerHTML but there is no call to purify(...)".
>> In any event the Safe Node idea is not dependent on innerHTML.  I'm
>> happy to cut it out!
>>
>> Dave
>>
>> On Fri, Jan 22, 2016 at 1:32 PM, Chris Palmer <palmer@google.com> wrote:
>>> On Fri, Jan 22, 2016 at 1:00 PM, David Ross <drx@google.com> wrote:
>>>
>>>>> For example, what is the actual mechanism/algorithm/heuristic this API
>>>>> would use to enforce the Safe CSS set of policies?
>>>> My argument is that it's smarter to implement policy enforcement
>>>> within Blink / WebKit than it is to implement the same thing reliably
>>>> within custom sanitizer code stapled onto Blink / WebKit.  For
>>>> example, consider the case of the policy that disables script.  The
>>>> browser can quite definitively disable script execution initiated from
>>>> within a particular DOM node.  However a sanitizer has to whitelist
>>>> all the elements and attributes it suspects are capable of initiating
>>>> script execution.  Pushing the policy enforcement to be implemented
>>>> close to the code that is being regulated makes it less likely that
>>>> new browser features will subvert the policy enforcement.  Some things
>>>> that are simply difficult or impossible for sanitizers to regulate in
>>>> a granular way (eg: CSS) are easier to handle with a Safe Node.
>>>
>>> That doesn't answer the question. How, exactly, can we compute whether a
>>> given string is an anti-CSRF defense token, and how, exactly, can we compute
>>> if CSS just leaked it to the attacker? I don't immediately see how that
>>> security guarantee is computable.
>>>
>>>>> element.innerHTML = purify(untrustworthyString, options...)
>>>>> That seems convenient enough for callers?
>>>> See the pros / cons in my writeup.
>>>
>>> Basically, my argument is that the con — that a general purpose filtering
>>> HTML parser would be useful — is huge, and also sufficiently covers your
>>> intended goal (duct-taping an anti-pattern). Thus, if we do anything, we
>>> should do that.
>>>
>>> But I remain skeptical of the goal.
>>>
>>>> And wait, didn't you just argue
>>>> that we shouldn't make use of .innerHTML given it's an anti-pattern?
>>>> =)
>>>
>>> I would rather deprecate innerHTML, yes. But at least I can easily grep for
>>> "assigns to innerHTML but there is no call to purify(...)".
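Chris's greppability point above can be sketched as a toy check. This is illustrative only, nothing like a real static analyzer, and purify() is just the name used in this thread, not a shipping API:

```javascript
// Toy version of the audit Chris describes: flag source text that assigns
// to innerHTML but never calls purify(...). Real tooling would work on an
// AST rather than regexes over raw source.
function flagsUnsanitizedInnerHTML(source) {
  const assignsInnerHTML = /\.innerHTML\s*=/.test(source);
  const callsPurify = /\bpurify\s*\(/.test(source);
  return assignsInnerHTML && !callsPurify;
}

console.log(flagsUnsanitizedInnerHTML('el.innerHTML = userInput;'));         // true
console.log(flagsUnsanitizedInnerHTML('el.innerHTML = purify(userInput);')); // false
```

The check is crude, but it captures why a mandatory sanitization call is auditable in a way that raw innerHTML assignment is not.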
Received on Friday, 22 January 2016 22:11:47 UTC
