Re: In-browser sanitization vs. a “Safe Node” in the DOM from David Ross on 2016-01-23 (public-webappsec@w3.org from January 2016)

From: David Ross <drx@google.com>
Date: Fri, 22 Jan 2016 16:35:26 -0800
To: Jim Manico <jim.manico@owasp.org>
Cc: Michal Zalewski <lcamtuf@coredump.cx>, Chris Palmer <palmer@google.com>, Crispin Cowan <crispin@microsoft.com>, Craig Francis <craig.francis@gmail.com>, Conrad Irwin <conrad.irwin@gmail.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CAMM+ux62bNVAkw+wS0RkL8ba6ipQfJn=9DYkPsn0YzK2YgUzZw@mail.gmail.com>
I should add...  In case it's not obvious, I am deliberately abusing the
blacklist / whitelist concept.

Dave

On Fri, Jan 22, 2016 at 4:10 PM, David Ross <drx@google.com> wrote:

> Yeah, I'm overdue in spending more time on the initial batch of jSanity bug
> reports.  I haven't been able to reproduce this one so far, but I need to
> work through it.  If it does reproduce, it would be an implementation bug in
> jSanity.  But I don't see how that points to a deficiency in jSanity
> relative to some other type of sanitizer.
>
> Let's look at how Safe Node would perform in this case relative to
> sanitization.  In the case of Safe Node, it's the browser's job to enforce
> this policy:
> * Disablement of script / active content
>
> To implement this, you might imagine a check in the rendering engine at
> the point new script feeds into the script engine to be executed.  This
> check would validate the originating node's ancestors.  Script would only
> execute if the originating node does not have a Safe Node as an ancestor.
> That's a whitelist.
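
A minimal sketch of the ancestor check described above, not an actual rendering-engine implementation. The `SketchNode` shape, the `isSafeNode` flag, and the `mayExecuteScript` name are all hypothetical and stand in for whatever internal DOM types an engine would really use.

```typescript
// Hypothetical, simplified node type (an assumption for illustration only).
interface SketchNode {
  parent: SketchNode | null;
  isSafeNode: boolean; // assumed marker for a Safe Node boundary
}

// Script may run only if no ancestor of the originating node is a Safe Node.
function mayExecuteScript(origin: SketchNode): boolean {
  for (let n: SketchNode | null = origin.parent; n !== null; n = n.parent) {
    if (n.isSafeNode) {
      return false;
    }
  }
  return true;
}

// Example tree: root -> safe -> inside, and root -> outside.
const root: SketchNode = { parent: null, isSafeNode: false };
const safe: SketchNode = { parent: root, isSafeNode: true };
const inside: SketchNode = { parent: safe, isSafeNode: false };
const outside: SketchNode = { parent: root, isSafeNode: false };

const insideAllowed: boolean = mayExecuteScript(inside);
const outsideAllowed: boolean = mayExecuteScript(outside);
```

Here `insideAllowed` is false and `outsideAllowed` is true. The point of the sketch is that the decision is made at execution time from the tree position alone, rather than by predicting how markup will parse.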
>
> The sanitizer, on the other hand, takes input X and produces
> output Y, hopefully with the <scriptlet> removed.  The enforcement is
> separate from the code in the browser itself, so the sanitizer is just making
> a good guess as to how the browser will behave when it handles the markup.
>
> Dave
>
>
> On Fri, Jan 22, 2016 at 3:33 PM, Jim Manico <jim.manico@owasp.org> wrote:
>
>> David,
>>
>> I think your work on JSanity is excellent, but again it's a cat-and-mouse
>> game that is hard to win. Even recently we have seen a significant evasion of
>> JSanity (https://github.com/Microsoft/JSanity/issues/6) that would not be
>> a problem for a rule-based HTML sanitizer.
>>
>> I am not saying drop your proposal; I would just like to see
>> programmatic, rule-based sanitization in addition to generalized blacklist
>> policies/sandboxes like you are suggesting.
>>
>> Aloha,
>> Jim
>>
>>
>>
>>
>> On 1/22/16 6:17 PM, David Ross wrote:
>>
>>>> There is a handful of examples where the rigidity basically
>>>> ruled out adoption (e.g., MSIE's old <iframe> sandbox).
>>>>
>>> This: https://msdn.microsoft.com/en-us/library/ms534622(v=vs.85).aspx
>>> It came in for Hotmail, but it was never put to use AFAIK, exactly for
>>> the reason you describe.
>>>
>>> There is a finite list of "unsafe" things that markup / CSS can do
>>> when rendered on a page.  (Essential reference, of course:
>>> http://lcamtuf.coredump.cx/postxss/)  It is possible there are a
>>> couple things missing from the initial list of Safe Node policies
>>> requiring enforcement.  (E.g.: Link targeting is covered but we
>>> probably also need a way to regulate navigation more generally.)  But
>>> the problem is tractable.  And I don't think that sanitization baked
>>> into the browser provides a better approach in this regard.
>>>
>>> Another key thing here is that with either a sanitizer or Safe Node,
>>> it's important to pick a good set of secure defaults.  That way the
>>> policy problems Michal described are less likely to occur as custom
>>> configuration tends to be minimal.  With the sandbox attribute for
>>> frames, I think the use cases vary to such an extent that it would
>>> have been hard to set secure defaults.  E.g.: allow-scripts and
>>> allow-same-origin are OK independently, but not when combined.
>>> There's no safe default there because there are many use cases for
>>> either approach.  I don't see that Safe Node policies interfere with
>>> each other in this way and so we probably dodged this bullet.
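
A small sketch of the sandbox interaction mentioned above, under stated assumptions. The function name `combinesScriptsAndSameOrigin` is hypothetical. It only detects the token combination in a sandbox attribute value. The combination is a problem because a framed document that is same-origin with its embedder and allowed to run script can reach into the parent and remove the sandbox attribute.

```typescript
// Hypothetical helper: reports whether a sandbox attribute value contains both
// allow-scripts and allow-same-origin. Tokens are split on ASCII whitespace
// and compared case-insensitively.
function combinesScriptsAndSameOrigin(sandboxValue: string): boolean {
  const tokens = new Set(
    sandboxValue
      .toLowerCase()
      .split(/[ \t\n\f\r]+/)
      .filter((t) => t.length > 0)
  );
  return tokens.has("allow-scripts") && tokens.has("allow-same-origin");
}

// Each flag alone is fine; the pair is what gets flagged.
const scriptsOnly: boolean = combinesScriptsAndSameOrigin("allow-scripts");
const sameOriginOnly: boolean = combinesScriptsAndSameOrigin("allow-same-origin allow-forms");
const both: boolean = combinesScriptsAndSameOrigin("allow-scripts allow-same-origin");
```

A check like this can flag the risky pair, but it cannot choose a safe default, since both flags have legitimate uses on their own.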
>>>
>>> Jim said:
>>>
>>>> I have an aversion to different policy packages not being
>>>> flexible enough to be useful.
>>>>
>>> FWIW, as per earlier in the thread, the Safe Node approach addresses
>>> scenarios around CSS where _sanitization_ is inflexible.  (Caveat: If
>>> a sanitizer is baked into the browser, all of a sudden it can pursue
>>> the same approach.)
>>>
>>>> Perhaps support both of these approaches? HTML
>>>> Programmatic sanitization and several pre-built policies?
>>>> That would provide both ease of use for some, and deep
>>>> flexibility for others. Win win win, and win?
>>>>
>>> My argument is that Safe Node has advantages relative to sanitization
>>> baked into the browser.  If you can identify a legit use case that
>>> Safe Node can't support cleanly, but browser-based sanitization does,
>>> I'd probably jump right back on the sanitization bandwagon.  I wrote a
>>> client-side sanitizer not that long ago and I enjoy working on them.
>>> =)
>>>
>>> Dave
>>>
>>> On Fri, Jan 22, 2016 at 2:40 PM, Jim Manico <jim.manico@owasp.org>
>>> wrote:
>>>
>>>> Thank you Michal. I'll give David's proposal a closer read and comment
>>>> shortly.
>>>>
>>>> I remember Microsoft and their AntiXSS library providing an HTML Sanitizer
>>>> API for untrusted HTML input. It was one of the first in any major language
>>>> or framework. The first version was very permissive and useful but
>>>> unfortunately was vulnerable to HTML hacking and of course XSS. The latest
>>>> incarnation was fixed to be very secure, but unfortunately was not at all
>>>> useful because it was so restrictive. And MS is now deprecating it with no
>>>> commitment to maintain it.
>>>>
>>>> I have an aversion to different policy packages not being flexible enough
>>>> to be useful. But I will give David's proposal a deeper read and provide
>>>> comments more specific to his proposal.
>>>>
>>>> Perhaps support both of these approaches? HTML Programmatic sanitization
>>>> and several pre-built policies? That would provide both ease of use for
>>>> some, and deep flexibility for others. Win win win, and win?
>>>>
>>>> Aloha,
>>>> Jim
>>>>
>>>>
>>>>
>>>> On 1/22/16 5:29 PM, Michal Zalewski wrote:
>>>>
>>>>>> The need to inject untrusted markup into the DOM comes up all the time
>>>>>> and is critical (WYSIWYG editors, etc.). But any "safe node" that limits
>>>>>> what can render and execute will limit innovation. Each developer needs
>>>>>> to support a different markup subset for their app, which is why
>>>>>> policy-based sanitization is so critical to this use case.
>>>>>>
>>>>>> Take a look at CAJA JS's sanitizer, Angular's $sanitize, and other
>>>>>> JS-centric HTML sanitizers. They all allow the developer to set a policy
>>>>>> of what tags and attributes should be supported, and all other markup
>>>>>> gets stripped out.
>>>>>>
>>>>>> This is the kind of native defensive pattern we need in JavaScript, IMO!
>>>>>>
>>>>> I think there are interesting trade-offs, and I wouldn't be too quick
>>>>> to praise one approach over the other. If you design use-centric
>>>>> "policy packages" (akin to what's captured in David's proposal), you
>>>>> offer safe and consistent choices to developers. The big unknown is
>>>>> whether the policies will be sufficiently flexible and future-proof -
>>>>> for example, will there be some next-gen communication app that
>>>>> requires a paradigm completely different from discussion forums or
>>>>> e-mail?
>>>>>
>>>>> There is a handful of examples where the rigidity basically ruled out
>>>>> adoption (e.g., MSIE's old <iframe> sandbox).
>>>>>
>>>>> The other alternative is the Lego-style policy building approach taken
>>>>> with CSP. Out of the countless number of CSP policies you can create,
>>>>> most will have inconsistent or self-defeating security properties, and
>>>>> building watertight ones requires a fair amount of expertise. Indeed,
>>>>> most CSP deployments we see today probably don't provide much in terms
>>>>> of security. But CSP is certainly a lot more flexible and future-proof
>>>>> than the prepackaged approach.
>>>>>
>>>>> At the same time treating flexibility as a goal in itself can lead to
>>>>> absurd outcomes, too: a logical conclusion is to just provide
>>>>> programmatic hooks for flexible, dynamic filtering of markup, instead
>>>>> of any static, declarative policies. One frequently-cited approach
>>>>> here was Microsoft's Mutation-Event Transforms [1], and I don't think
>>>>> it was a step in the right direction (perhaps except as a finicky
>>>>> building block for more developer-friendly sanitizers).
>>>>>
>>>>> [1]
>>>>>
>>>>> http://research.microsoft.com/en-us/um/people/livshits/papers/pdf/hotos07.pdf
>>>>>
>>>>
>>>>
>>
>
Received on Saturday, 23 January 2016 00:36:21 UTC