Re: In-browser sanitization vs. a “Safe Node” in the DOM

From: David Ross <drx@google.com>
Date: Fri, 22 Jan 2016 15:22:28 -0800
Message-ID: <CAMM+ux6MMaJn7M_aauWn7_KjWmzOsi33TdNsdL+JrHVV5_sP1Q@mail.gmail.com>
To: Jim Manico <jim.manico@owasp.org>
Cc: Michal Zalewski <lcamtuf@coredump.cx>, Chris Palmer <palmer@google.com>, Crispin Cowan <crispin@microsoft.com>, Craig Francis <craig.francis@gmail.com>, Conrad Irwin <conrad.irwin@gmail.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
> What if I just want to filter arbitrary user input before storing it in
> LocalStorage for retrieval and use later? Or filter it to minify it, or the
> like.

As per the writeup:

--- snip ---
outputMarkup = safeDiv.outerHTML;

...at this point outputMarkup might look something like this:

<div safety="Enabled: true; DownloadExternalContent: false;
...">[untrusted markup]</div>

It is possible to integrate markup from various sources that will
ultimately be rendered later.  Or in applications that aren’t as
complex, it’s easy to simply output untrusted markup into Safe Nodes
that are immediately added into the document.

...

FAQ

Q: If you have some markup with a Safe Node in it, is that safe?
A: Best practice: Always output unsafe markup into a Safe Node that
you (the host) have created.  If you do need to manipulate markup
containing a Safe Node and then output that markup directly onto the
page, remember to treat the Safe Node string as an atomic unit.
Untrusted markup injected into the Safe Node markup could prematurely
close the Safe Node.

...snip...
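
A minimal sketch of the FAQ point above, using plain string handling in TypeScript. The helper names `naiveSplice` and `assembleAtomic` are hypothetical and not part of the Safe Node proposal, and the `safety` attribute value is abbreviated from the excerpt. It only illustrates why splicing into serialized Safe Node markup is risky, while concatenating around the whole string is not.

```typescript
// Hypothetical helpers for illustration; neither exists in any browser API.

// Anti-pattern: insert untrusted markup just before the Safe Node's closing tag.
function naiveSplice(safeNodeMarkup: string, untrusted: string): string {
  const closeAt = safeNodeMarkup.lastIndexOf("</div>");
  if (closeAt < 0) {
    throw new Error("expected serialized Safe Node markup ending in </div>");
  }
  return safeNodeMarkup.slice(0, closeAt) + untrusted + safeNodeMarkup.slice(closeAt);
}

// Atomic handling: host markup goes before or after the Safe Node string,
// and the string itself is never opened up or edited.
function assembleAtomic(before: string, safeNodeMarkup: string, after: string): string {
  return before + safeNodeMarkup + after;
}

// Assumed shape of safeDiv.outerHTML, abbreviated from the writeup excerpt.
const safeNode =
  '<div safety="Enabled: true; DownloadExternalContent: false;">hello</div>';

// Untrusted input that starts by closing the enclosing div.
const attack = "</div><img src=x onerror=alert(1)>";

const spliced = naiveSplice(safeNode, attack);

// True when a closing tag appears before the injected <img>, meaning the
// <img> would be parsed outside the Safe Node.
const escaped = spliced.indexOf("</div>") < spliced.indexOf("<img");

const page = assembleAtomic("<h2>Comments</h2>", safeNode, "<hr>");

console.log(escaped, page.includes(safeNode));
```

In the spliced string the injected `</div>` closes the Safe Node early, so whatever follows it would render with host privileges. The atomic version never lets untrusted text reach the serialized string at all, which matches the best practice of writing untrusted markup only into a Safe Node the host created.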

Dave

On Fri, Jan 22, 2016 at 3:17 PM, David Ross <drx@google.com> wrote:

> > There is a handful of examples where the rigidity basically
> > ruled out adoption (e.g., MSIE's old <iframe> sandbox).
> This: https://msdn.microsoft.com/en-us/library/ms534622(v=vs.85).aspx
> It came in for Hotmail, but it was never put to use AFAIK, exactly for
> the reason you describe.
>
> There is a finite list of "unsafe" things that markup / CSS can do
> when rendered on a page.  (Essential reference, of course:
> http://lcamtuf.coredump.cx/postxss/)  It is possible there are a
> couple things missing from the initial list of Safe Node policies
> requiring enforcement.  (E.g.: Link targeting is covered but we
> probably also need a way to regulate navigation more generally.)  But
> the problem is tractable.  And I don't think that sanitization baked
> into the browser provides a better approach in this regard.
>
> Another key thing here is that with either a sanitizer or Safe Node,
> it's important to pick a good set of secure defaults.  That way the
> policy problems Michal described are less likely to occur as custom
> configuration tends to be minimal.  With the sandbox attribute for
> frames, I think the use cases vary to such an extent that it would
> have been hard to set secure defaults.  E.g.: allow-scripts and
> allow-same-origin are OK independently, but not when combined.
> There's no safe default there because there are many use cases for
> either approach.  I don't see that Safe Node policies interfere with
> each other in this way and so we probably dodged this bullet.
>
> Jim said:
> > I have an aversion to different policy packages not being
> > flexible enough to be useful.
> FWIW, as per earlier in the thread, the Safe Node approach addresses
> scenarios around CSS where _sanitization_ is inflexible.  (Caveat: If
> a sanitizer is baked into the browser, all of a sudden it can pursue
> the same approach.)
>
> > Perhaps support both of these approaches? HTML
> > Programmatic sanitization and several pre-built policies?
> > That would provide both ease of use for some, and deep
> > flexibility for others. Win win win, and win?
> My argument is that Safe Node has advantages relative to sanitization
> baked into the browser.  If you can identify a legit use case that
> Safe Node can't support cleanly, but browser-based sanitization does,
> I'd probably jump right back on the sanitization bandwagon.  I wrote a
> client-side sanitizer not that long ago and I enjoy working on them.
> =)
>
> Dave
>
> On Fri, Jan 22, 2016 at 2:40 PM, Jim Manico <jim.manico@owasp.org> wrote:
> > Thank you Michal. I'll give David's proposal a closer read and comment
> > shortly.
> >
> > I remember Microsoft and their AntiXSS library providing an HTML Sanitizer
> > API for untrusted HTML input. It was one of the first in any major language
> > or framework. The first version was very permissive and useful but
> > unfortunately was vulnerable to HTML hacking and of course XSS. The latest
> > incarnation was fixed to be very secure, but unfortunately was not at all
> > useful because it was so restrictive. And MS is now deprecating it with no
> > commitment to maintain it.
> >
> > I have an aversion to different policy packages not being flexible enough to
> > be useful. But I will give David's proposal a deeper read and provide
> > comments more specific to his proposal.
> >
> > Perhaps support both of these approaches? HTML Programmatic sanitization and
> > several pre-built policies? That would provide both ease of use for some,
> > and deep flexibility for others. Win win win, and win?
> >
> > Aloha,
> > Jim
> >
> >
> >
> > On 1/22/16 5:29 PM, Michal Zalewski wrote:
> >>>
> >>> The need to inject untrusted markup into the DOM comes up all the time and
> >>> is critical (WYSIWYG editors, etc.). But any "safe node" that limits what can
> >>> render and execute will limit innovation. Each developer needs to support a
> >>> different markup subset for their app, which is why policy based
> >>> sanitization is so critical to this use case.
> >>>
> >>> Take a look at CAJA JS's sanitizer, Angular's $sanitize, and other JS
> >>> centric HTML sanitizers. They all allow the developer to set a policy of
> >>> what tags and attributes should be supported, and all other markup gets
> >>> stripped out.
> >>>
> >>> This is the kind of native defensive pattern we need in JavaScript, IMO!
> >>
> >> I think there are interesting trade-offs, and I wouldn't be too quick
> >> to praise one approach over the other. If you design use-centric
> >> "policy packages" (akin to what's captured in David's proposal), you
> >> offer safe and consistent choices to developers. The big unknown is
> >> whether the policies will be sufficiently flexible and future-proof -
> >> for example, will there be some next-gen communication app that
> >> requires a paradigm completely different from discussion forums or
> >> e-mail?
> >>
> >> There is a handful of examples where the rigidity basically ruled out
> >> adoption (e.g., MSIE's old <iframe> sandbox).
> >>
> >> The other alternative is the Lego-style policy building approach taken
> >> with CSP. Out of the countless number of CSP policies you can create,
> >> most will have inconsistent or self-defeating security properties, and
> >> building watertight ones requires a fair amount of expertise. Indeed,
> >> most CSP deployments we see today probably don't provide much in terms
> >> of security. But CSP is certainly a lot more flexible and future-proof
> >> than the prepackaged approach.
> >>
> >> At the same time treating flexibility as a goal in itself can lead to
> >> absurd outcomes, too: a logical conclusion is to just provide
> >> programmatic hooks for flexible, dynamic filtering of markup, instead
> >> of any static, declarative policies. One frequently-cited approach
> >> here was Microsoft's Mutation-Event Transforms [1], and I don't think
> >> it was a step in the right direction (perhaps except as a finicky
> >> building block for more developer-friendly sanitizers).
> >>
> >> [1] http://research.microsoft.com/en-us/um/people/livshits/papers/pdf/hotos07.pdf
> >
> >
>
Received on Friday, 22 January 2016 23:23:17 UTC
