
Re: In-browser sanitization vs. a “Safe Node” in the DOM

From: David Ross <drx@google.com>
Date: Fri, 22 Jan 2016 14:30:06 -0800
Message-ID: <CAMM+ux7Qd=7Jv0Rkyw2XJonGk59wr4N07zcZ=bpT+DqeaJAVrg@mail.gmail.com>
To: Jim Manico <jim.manico@owasp.org>
Cc: Chris Palmer <palmer@google.com>, Crispin Cowan <crispin@microsoft.com>, Craig Francis <craig.francis@gmail.com>, Conrad Irwin <conrad.irwin@gmail.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
> So you propose to adopt that behavior for CSS applied to children of the Safe Node?
Mario had a much harder problem to solve because the malicious CSS was
not separated from the rest of the page via a security boundary (Safe
Node).  The solution in the Safe Node world is for CSS within the Safe
Node to not affect the markup outside the Safe Node.  That's simple
and sensible.  I suspect this will be possible, easier to implement
than a C++ HTML sanitizer, and will effectively solve the identified
problem.
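
For concreteness, here is roughly what I have in mind. Everything here is hypothetical: the `safenode` attribute is an assumed syntax, not anything spec'd, and the scoping behavior is the proposal, not existing browser behavior:

```html
<!-- Hypothetical: "safenode" is an assumed attribute name, not a spec. -->
<div safenode>
  <!-- Untrusted markup gets parsed in here.  Under the proposal, a
       selector like "*" or an absolutely-positioned overlay declared
       inside the Safe Node would match/affect only descendants of this
       div; the attacker's CSS cannot reach the surrounding page. -->
  <style>* { background: url(//attacker.example/leak) }</style>
  <p>Untrusted content renders normally.</p>
</div>
<p>Markup out here is unaffected by the style rules above.</p>
```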

> I don't know. It doesn't sound easy to implement the guarantee, but maybe it is.
I am also not an expert, but I would bet the opposite.  I would be
surprised to find out that it's not possible for the rendering engine
to isolate CSS under a given node.

> You listed it as the sole con:
What I mean is that I have not seen a (non-niche) use case for
browser-based sanitization that Safe Node doesn't address.  It sounds
like maybe you're saying that this is a huge problem, but if so what
is the missing use case that Safe Node doesn't cover?
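
For reference, the policy-based approach under discussion looks roughly like this. This is a toy sketch with made-up names, not Caja's or Angular's actual API, and a real sanitizer must operate on a parsed tree rather than regexes (regex stripping is unsound against malformed markup and says nothing about attributes like onerror on allowed tags):

```javascript
// Toy policy-based sanitizer: the developer supplies a whitelist of
// tag names, and every tag not on the list is stripped.
// Illustration only -- do not use regex sanitization in production.
function sanitize(html, allowedTags) {
  const allowed = new Set(allowedTags.map(t => t.toLowerCase()));
  // Drop any open or close tag whose name is outside the policy.
  return html.replace(/<\/?([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g, (tag, name) =>
    allowed.has(name.toLowerCase()) ? tag : '');
}

const policy = ['b', 'i', 'p'];
console.log(sanitize('<p>Hi <script>alert(1)</script><b>there</b></p>', policy));
// The <script> tags are removed; note their text content survives,
// which is one more reason real sanitizers work on a parsed tree.
```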



On Fri, Jan 22, 2016 at 2:11 PM, Jim Manico <jim.manico@owasp.org> wrote:
> The need to inject untrusted markup into the DOM comes up all the time and
> is critical (WYSIWYG editors, etc.). But any "safe node" that limits what can
> render and execute will limit innovation. Each developer needs to support a
> different markup subset for their app, which is why policy based
> sanitization is so critical to this use case.
>
> Take a look at Caja's sanitizer, Angular's $sanitize, and other
> JS-centric HTML sanitizers. They all allow the developer to set a policy of
> which tags and attributes should be supported, and all other markup gets
> stripped out.
>
> This is the kind of native defensive pattern we need in JavaScript, IMO!
>
> Aloha,
> Jim
>
> On 1/22/16 5:03 PM, David Ross wrote:
>>>
>>> How is CSP not sufficient?
>>
>> CSP operates on a per-page basis.  Here's the canonical use case for
>> sanitization (and also Safe Node): Fetch bits of markup via XHR and
>> just plop them into the existing DOM in various places, safely.
>>
>>
>> I started with the assumption that client-side sanitization is coming
>> to the browser.  This is obviously not a given, but discussion about
>> the possibility is what initiated my train of thought.  The Safe Node
>> proposal attempts to achieve the same result but in a way that I argue
>> has certain advantages over client-side sanitization baked into the
>> browser.
>>
>> Dave
>>
>> On Fri, Jan 22, 2016 at 1:53 PM, David Ross <drx@google.com> wrote:
>>>>
>>>> How, exactly, can we compute whether a given string is an anti-CSRF
>>>> defense token, and how, exactly, can we compute if CSS just leaked
>>>> it to the attacker? I don't immediately see how that security guarantee
>>>> is computable.
>>>
>>> We should probably talk about a specific version of the attack to make
>>> sure we're on the same page.  (Maybe this one:
>>> http://html5sec.org/webkit/test)
>>>
>>> I think that if you're only detecting what you're describing above,
>>> it's too late.  My expectation is that CSS defined within the Safe
>>> Node would have no effect on the DOM outside of the Safe Node.  Is
>>> there some reason why this is not possible to implement or that it
>>> would be ineffective at addressing the issue?
>>>
>>>> Basically, my argument is that the con — that a general purpose
>>>> filtering HTML parser would be useful — is huge, and also
>>>> sufficiently covers your intended goal (duct-taping an anti-pattern).
>>>> Thus, if we do anything, we should do that.
>>>
>>> Mmm, I lost you here...  How is that a con?  It sounds like just an
>>> assertion, and one that I wouldn't argue with.  And my intended goal
>>> is not to get rid of innerHTML, but I'm happy to help by removing
>>> innerHTML from the design pattern I originally suggested.
>>>
>>>> I would rather deprecate innerHTML, yes. But at least I can easily grep
>>>> for "assigns to innerHTML but there is no call to purify(...)".
>>>
>>> In any event the Safe Node idea is not dependent on innerHTML.  I'm
>>> happy to cut it out!
>>>
>>> Dave
>>>
>>> On Fri, Jan 22, 2016 at 1:32 PM, Chris Palmer <palmer@google.com> wrote:
>>>>
>>>> On Fri, Jan 22, 2016 at 1:00 PM, David Ross <drx@google.com> wrote:
>>>>
>>>>>> For example, what is the actual mechanism/algorithm/heuristic this API
>>>>>> would use to enforce the Safe CSS set of policies?
>>>>>
>>>>> My argument is that it's smarter to implement policy enforcement
>>>>> within Blink / WebKit than it is to implement the same thing reliably
>>>>> within custom sanitizer code stapled onto Blink / WebKit.  For
>>>>> example, consider the case of the policy that disables script.  The
>>>>> browser can quite definitively disable script execution initiated from
>>>>> within a particular DOM node.  However a sanitizer has to whitelist
>>>>> all the elements and attributes it suspects are capable of initiating
>>>>> script execution.  Pushing the policy enforcement to be implemented
>>>>> close to the code that is being regulated makes it less likely that
>>>>> new browser features will subvert the policy enforcement.  Some things
>>>>> that are simply difficult or impossible for sanitizers to regulate in
>>>>> a granular way (e.g. CSS) are easier to handle with a Safe Node.
>>>>
>>>>
>>>> That doesn't answer the question. How, exactly, can we compute whether a
>>>> given string is an anti-CSRF defense token, and how, exactly, can we
>>>> compute
>>>> if CSS just leaked it to the attacker? I don't immediately see how that
>>>> security guarantee is computable.
>>>>
>>>>>> element.innerHTML = purify(untrustworthyString, options...)
>>>>>> That seems convenient enough for callers?
>>>>>
>>>>> See the pros / cons in my writeup.
>>>>
>>>>
>>>> Basically, my argument is that the con — that a general purpose
>>>> filtering
>>>> HTML parser would be useful — is huge, and also sufficiently covers your
>>>> intended goal (duct-taping an anti-pattern). Thus, if we do anything, we
>>>> should do that.
>>>>
>>>> But I remain skeptical of the goal.
>>>>
>>>>> And wait, didn't you just argue
>>>>> that we shouldn't make use of .innerHTML given it's an anti-pattern?
>>>>> =)
>>>>
>>>>
>>>> I would rather deprecate innerHTML, yes. But at least I can easily grep
>>>> for
>>>> "assigns to innerHTML but there is no call to purify(...)".
>
>
Received on Friday, 22 January 2016 22:30:56 UTC

This archive was generated by hypermail 2.3.1 : Monday, 23 October 2017 14:54:17 UTC