Re: In-browser sanitization vs. a “Safe Node” in the DOM from David Ross on 2016-01-26 (public-webappsec@w3.org from January 2016)

From: David Ross <drx@google.com>
Date: Tue, 26 Jan 2016 13:45:01 -0800
To: Andrew Sutherland <asutherland@asutherland.org>
Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CAMM+ux7oGftZ4M3ThTNYrXw6i59TUVweirZVtw6OsKtqCzeiAA@mail.gmail.com>

> If I querySelectorAll()/getElementById() in my main page, I know that
> I'm not accidentally going to pull out an attacker-provided element.
I wonder what portion of accesses like these are going to be safe vs.
unsafe.  Maybe it makes sense to enable the use of shadow DOM on the
Safe Node by default (with the option to opt-out)?
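
To make the concern concrete, here is a minimal sketch of why a
shadow-style boundary matters for page-level lookups.  It uses a toy
tree rather than the real DOM, and the names ToyElement,
isolatesChildren, owner, and findById are all hypothetical, invented
only for illustration.  It is not how Safe Node or Shadow DOM is
actually implemented.

```typescript
// Toy tree model (NOT the real DOM). All names here are hypothetical.
interface ToyElement {
  id: string;
  owner: "page" | "attacker";
  children: ToyElement[];
  // Assumption: true when this element hides its subtree behind a
  // shadow-style boundary, as a Safe Node with shadow DOM enabled might.
  isolatesChildren?: boolean;
}

// Depth-first lookup by id. When respectBoundaries is true, the search
// does not descend into subtrees hidden behind an isolating element.
function findById(
  root: ToyElement,
  id: string,
  respectBoundaries: boolean
): ToyElement | undefined {
  if (root.id === id) return root;
  if (respectBoundaries && root.isolatesChildren) return undefined;
  for (const child of root.children) {
    const hit = findById(child, id, respectBoundaries);
    if (hit) return hit;
  }
  return undefined;
}

const toyPage: ToyElement = {
  id: "page",
  owner: "page",
  children: [
    {
      id: "comment-host",
      owner: "page",
      isolatesChildren: true,
      // Untrusted markup reuses an id that the page uses for its own button.
      children: [{ id: "pay-button", owner: "attacker", children: [] }],
    },
    { id: "pay-button", owner: "page", children: [] },
  ],
};

// Without a boundary, the first match in tree order is the attacker's element.
const unscopedHit = findById(toyPage, "pay-button", false);
// With the boundary respected, the lookup skips the untrusted subtree.
const scopedHit = findById(toyPage, "pay-button", true);
console.log(unscopedHit?.owner, scopedHit?.owner);
```

In this toy model the host element itself stays findable; only its
untrusted subtree is hidden from the page-level search.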

The Safe Node approach would work for both the "bite-size" and
"meal-size" cases.

> I think the bite-size case is arguably best-addressed by an HTML sanitizer using a white-list.
There are at least three advantages that Safe Node has relative to a
sanitizer, detailed in the original writeup:
1) Elimination of sanitization complexity.
2) Integration of policy enforcement with the actual implementation of
what's being regulated.
3) It’s easier for a Safe Node to safely handle CSS than it would be
for a client-side sanitizer.
As discussed, Safe Node policy enforcement is quite different from
policing a list of acceptable elements, attributes, etc., as a
sanitizer does.  Adding the sanitizer into the browser does centralize
that policy enforcement, but the policy is still burdensome to update.
It's also harder to build a new browser feature that subverts Safe Node
than it would be to build a new browser feature that subverts
sanitization.
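
For readers who have not written one, the sketch below shows roughly
what "policing a list of acceptable elements, attributes, etc." looks
like.  It operates on a toy tree rather than real HTML, and the
allow-lists, the MarkupNode type, and the function names are
assumptions made up for this example, not any real sanitizer's API.
Every new element, attribute, or URL scheme a browser ships has to be
reflected in lists like these.

```typescript
// Toy markup tree (NOT a real HTML parser). All names are hypothetical.
interface MarkupNode {
  tag: string;
  attrs: Record<string, string>;
  children: (MarkupNode | string)[]; // strings stand in for text nodes
}

// Assumed example policy; a production list would be far longer.
const ALLOWED_TAGS = new Set<string>(["b", "i", "p", "a", "blockquote"]);
const ALLOWED_ATTRS = new Map<string, Set<string>>([
  ["a", new Set<string>(["href"])],
]);

// Only http(s) links survive; this rejects javascript: and data: URLs.
function isSafeUrl(url: string): boolean {
  return /^https?:\/\//i.test(url.trim());
}

// Returns a cleaned copy, or null if the element itself is not allowed.
// Disallowed elements are dropped together with their whole subtree.
function sanitizeNode(node: MarkupNode): MarkupNode | null {
  if (!ALLOWED_TAGS.has(node.tag)) return null;

  const allowedForTag = ALLOWED_ATTRS.get(node.tag) ?? new Set<string>();
  const attrs: Record<string, string> = {};
  for (const key of Object.keys(node.attrs)) {
    const value = node.attrs[key];
    if (!allowedForTag.has(key)) continue; // e.g. onclick
    if (key === "href" && !isSafeUrl(value)) continue;
    attrs[key] = value;
  }

  const children: (MarkupNode | string)[] = [];
  for (const child of node.children) {
    if (typeof child === "string") {
      children.push(child);
      continue;
    }
    const cleaned = sanitizeNode(child);
    if (cleaned) children.push(cleaned);
  }
  return { tag: node.tag, attrs, children };
}

const untrustedMarkup: MarkupNode = {
  tag: "p",
  attrs: { onclick: "steal()" },
  children: [
    "hi ",
    { tag: "script", attrs: {}, children: ["steal()"] },
    { tag: "a", attrs: { href: "javascript:steal()" }, children: ["bad"] },
    { tag: "a", attrs: { href: "https://example.com/" }, children: ["ok"] },
  ],
};

console.log(JSON.stringify(sanitizeNode(untrustedMarkup)));
```

Note what the sketch leaves out: CSS is not handled at all, which is
the third point in the list above.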

Yes, I think we could build Safe Node on top of sanitization, and that
may provide a nice API for developers, but it doesn't provide the
three benefits described above.

The discussion of cognitive load is well founded.  But while IFRAMEs
are the model developers use today, using them properly in an
application for the purpose of isolation is a product feature unto
itself.  Developers are pretty much forced to use them, given that
they are one of the few ways to achieve the stated goal.  Contrast
that with the Safe Node approach, where a developer just creates new
nodes with a "safety" attribute on them and then places them into the
DOM.  It's hard to mess that up, and enabling new capabilities (e.g.,
allowing external content) is as simple as adjusting the safety
attribute to include a flag.  Given the choice between Safe Node and
frame-based isolation, I would expect Safe Node to win even among
developers who already grok frame-based isolation -- they know
firsthand how much of a pain it is to implement good frame-based
isolation!
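
As a rough illustration of the "safety attribute with flags" idea,
here is a sketch of how such an attribute value might be parsed.  The
attribute syntax (space-separated tokens, loosely modeled on how the
iframe sandbox attribute works), the token name
"allow-external-content", and the SafetyPolicy shape are all
assumptions for this example; nothing here is specified by the Safe
Node writeup.

```typescript
// Hypothetical result of parsing a Safe Node "safety" attribute value.
interface SafetyPolicy {
  allowExternalContent: boolean; // off unless explicitly enabled
  ignoredTokens: string[];       // unrecognized tokens grant nothing
}

// Assumed token name, invented for this sketch.
const EXTERNAL_CONTENT_FLAG = "allow-external-content";

// Parse a space-separated token list. Everything is denied by default,
// and unknown tokens are recorded but never widen the policy.
function parseSafetyAttribute(value: string): SafetyPolicy {
  const policy: SafetyPolicy = {
    allowExternalContent: false,
    ignoredTokens: [],
  };
  for (const token of value.split(/\s+/)) {
    if (token === "") continue; // leading/trailing whitespace
    if (token.toLowerCase() === EXTERNAL_CONTENT_FLAG) {
      policy.allowExternalContent = true;
    } else {
      policy.ignoredTokens.push(token);
    }
  }
  return policy;
}

// An empty attribute yields the most restrictive policy.
console.log(parseSafetyAttribute(""));
// Opting in to one capability is a one-token change.
console.log(parseSafetyAttribute("allow-external-content"));
```

The design choice worth noting is that the parser fails closed: a
typo or an unknown token leaves the node at its most restrictive
setting rather than silently enabling something.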

Dave

On Mon, Jan 25, 2016 at 2:41 PM, Andrew Sutherland
<asutherland@asutherland.org> wrote:
> On Mon, Jan 25, 2016, at 03:05 AM, David Ross wrote:
>> IFRAMEs require significant engineering work to implement in many
>> cases, and aren't particularly flexible in some ways.
>
> IFRAMEs can absolutely be a hassle.  And I like the feature set you
> propose and the goal of making it easier to safely include untrusted
> content into the page.  If the right way is hard/impossible, many will
> just find something that looks like magic pixie dust, sprinkle it
> around, and call it a day.
>
> My primary concern is about cognitive burden.  IFRAMEs are a known
> concept.  They separate conceptually separate content into separate
> documents with separate globals.  If I
> querySelectorAll()/getElementById() in my main page, I know that I'm not
> accidentally going to pull out an attacker-provided element.  As
> currently proposed, it sounds like everyone hacking on a page using
> SafeNode would need to be aware of this.  Now, obviously,
> querySelector*/etc. can be fixed by spec, but then what has been created
> sounds a lot like the Shadow DOM.  And then the question is why aren't
> we just using the Shadow DOM instead of inventing new things?
>
>
> The reason I brought up use-cases is because, to my mind/experience,
> there are basically two main scenarios with user/third-party-authored
> content:
>
> 1) Bite-size: Rich-text mark-up for text messages, tweets, forum posts,
> etc.  Markup exists for some combination of user self-expression and
> semantics (quoting, MathML, etc.).
>
> 2) Meal-size: The user-content is a document in its own right.  For
> example HTML mails with ads and custom styling.
>
> Reading between the lines of what you are saying about IFRAMEs, it
> sounds like the use-case you have in mind is closer to "bite-size", but
> you are engineering functionality for "meal-size".  I think the
> bite-size case is arguably best-addressed by an HTML sanitizer using a
> white-list.  Safety can be provided/guaranteed and you get all of your
> non-rectangular styling goals, etc.  And I think the meal-size case
> should be addressed with the existing platform solution, IFRAMEs.
>
> And meal-size wise, it's worth calling out that IFRAMEs can be important
> for usability given the rise of mobile devices and the need to
> pinch-and-zoom.  If the document has its own idea of what a reasonable
> font-size is, users may absolutely need to zoom the document.  While a
> "transform: scale" hack can be applied at any level of the DOM, arguably
> it's cleaner and there can be more benefits to be doing it in an iframe
> in terms of allowing the browser engine to understand and optimize
> what's going on[1].
>
>> In contrast, it should be possible to create Safe Nodes on-the-fly
>> whenever you want to safely render a little snippet of untrusted
>> markup.  You mostly don't need to think through all these
>> considerations, just create a Safe Node and add it into the DOM.
>> Configuration options are available, but they're tuned to the use
>> cases you might care about and they don't create much of an
>> opportunity for shooting yourself in the foot.
>
> This seems like a great proposal for a custom-element implementation
> using existing libraries or new libraries that could be built out.
> Easy-to-use safety with the existing platform implementation, and
> webdevs are saved from the additional cognitive burden of new possible
> platform magic behavior.
>
> For example <SafeNode><template><muahaha>I better have been injected via
> createElement or realllly safely quoted into the
> document.</muahaha></template></SafeNode>.
>
> SafeNode could even try and be all things to all people by having the
> sanitizer library do a speculative sanitizing pass.  If it encounters
> simple markup meeting all whitelist requirements, it can perhaps simply
> inject the elements directly into the document.  If more extensive
> features are used it can place the content in a shadow DOM, perhaps
> slurping an explicit scoped stylesheet into place or something.  And if
> it encounters something potentially dangerous but allowed, it spins up
> an iframe.  If the SafeNode creator knows they want iframes, they can
> specify that as an attribute and skip directly to that.
>
>> Though I would agree that in a world with seamless IFRAMEs, it should
>> be possible to build a decent Safe Node polyfill.
>
> Definitely agreed if s/polyfill/library/.  As in, I think it makes sense
> to build the use-case solution first and only propose platform changes
> if/when it proves reasonable/necessary.  And it's my assertion that the
> platform changes can likely be made to existing web platform features.
> But again, you may have use-cases in mind that I'm not conceiving of.
>
> Andrew
>
> 1: This is wildly speculative on my part other than the fact that meta
> viewport is a thing
> (https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meta#Attributes).
>  The most recent time we rev'ed the Firefox OS mail app's HTML display,
> Gecko only supported Asynchronous-Pan-and-Zoom (APZ) at the top level
> (e.g., the email app itself), so we were faking things by using "transform:
> scale(blah)" on our body iframes to zoom the HTML mail as part of the
> enclosing page context.  (So no scrolling inside the iframe, the
> iframe's effective size was always the same as its scroll size.)  The
> APZ/Compositor got very good at building and optimizing the layer tree
> so that only the visible portion of the iframe was rendered and
> painted... most of the time.  If we could have just tagged the iframe as
> <iframe zoomable-doc seamless /> then I think everyone would have been
> much happier.
>
Received on Tuesday, 26 January 2016 21:45:50 UTC