Re: In-browser sanitization vs. a “Safe Node” in the DOM from David Ross on 2016-01-26 (public-webappsec@w3.org from January 2016)

From: David Ross <drx@google.com>
Date: Tue, 26 Jan 2016 13:53:38 -0800
To: Andrew Sutherland <asutherland@asutherland.org>
Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CAMM+ux40un8_WpiideFLqQ_q=N+pRCd=KYfhvsEpwpLhPmTvEA@mail.gmail.com>
(For anyone who hasn't been following this thread too closely...)

Taking a step back from the discussion of pros / cons of the various
ideas, I think it's fair to say there are now three distinct ideas
that have been presented:

1)  Sanitization baked into the browser
2)  Safe Node
3)  Make IFRAME-based isolation work (seamless++)

Dave




On Tue, Jan 26, 2016 at 1:45 PM, David Ross <drx@google.com> wrote:
>> If I querySelectorAll()/getElementById() in my main page, I know that
>> I'm not accidentally going to pull out an attacker-provided element.
> I wonder what portion of accesses like these are going to be safe vs.
> unsafe.  Maybe it makes sense to enable the use of shadow DOM on the
> Safe Node by default (with the option to opt-out)?
>
> The Safe Node approach would work for both the "bite-size" and
> "meal-size" cases.
>
>> I think the bite-size case is arguably best-addressed by an HTML sanitizer using a white-list.
> There are at least three advantages that Safe Node has relative to a
> sanitizer, detailed in the original writeup:
> 1) Elimination of sanitization complexity.
> 2) Integration of policy enforcement with the actual implementation of
> what's being regulated.
> 3) It’s easier for a Safe Node to safely handle CSS than it would be
> for a client-side sanitizer.
> As discussed, the Safe Node policy enforcement is quite different from
> policing a list of acceptable elements, attributes, etc. as with a
> sanitizer.  Adding the sanitizer into the browser does centralize that
> policy enforcement, but it's still burdensome to update.  It's also
> harder to build a new browser feature that subverts Safe Node than it
> would be to build a new browser feature that subverts sanitization.
>
> Yes, I think we could build Safe Node on top of sanitization, and that
> may provide a nice API for developers, but it doesn't provide the
> three benefits described above.
>
> The discussion of cognitive load is well founded.  But while IFRAMEs
> are the model developers use today, using them properly in an
> application for the purpose of isolation is a product feature unto
> itself.  Developers are pretty much forced to use them given it's one
> of the only ways to achieve the stated goal.  Contrast that with the
> Safe Node approach where a developer just creates new nodes with a
> "safety" attribute on them and then places them into the DOM.  It's
> hard to mess that up, and enabling new capabilities (e.g., allowing
> external content) is as simple as adjusting the safety attribute to
> include a flag.  Given the choice between Safe Node and frame-based
> isolation, I would expect Safe Node would win even among developers
> who already grok frame-based isolation -- they know firsthand how much
> of a pain it is to implement good frame-based isolation!
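To make the "safety attribute with flags" idea concrete, here is a minimal sketch of how such an attribute value might be parsed into a policy. The flag names ("allow-external-content", "allow-forms"), the space-separated syntax, and the function name are all assumptions made for illustration; nothing in this thread fixes them. Unknown flags are ignored, so the default stays locked down.

```typescript
// Hypothetical policy shape; the proposal does not define these fields.
interface SafetyPolicy {
  allowExternalContent: boolean;
  allowForms: boolean;
}

// Hypothetical flag names, assumed for this sketch only.
const KNOWN_FLAGS: string[] = ["allow-external-content", "allow-forms"];

// Parse a space-separated "safety" attribute value into a policy.
// Anything unrecognized is dropped, so a typo can only reduce capabilities.
function parseSafetyAttribute(value: string): SafetyPolicy {
  const flags = value
    .split(/\s+/)
    .filter((f) => f.length > 0 && KNOWN_FLAGS.indexOf(f) !== -1);
  return {
    allowExternalContent: flags.indexOf("allow-external-content") !== -1,
    allowForms: flags.indexOf("allow-forms") !== -1,
  };
}
```

Under these assumptions, an empty attribute yields the most restrictive policy, and enabling a capability is a matter of adding one flag to the string.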
>
> Dave
>
> On Mon, Jan 25, 2016 at 2:41 PM, Andrew Sutherland
> <asutherland@asutherland.org> wrote:
>> On Mon, Jan 25, 2016, at 03:05 AM, David Ross wrote:
>>> IFRAMEs require significant engineering work to implement in many
>>> cases, and aren't particularly flexible in some ways.
>>
>> IFRAMEs can absolutely be a hassle.  And I like the feature set you
>> propose and the goal of making it easier to safely include untrusted
>> content into the page.  If the right way is hard/impossible, many will
>> just find something that looks like magic pixie dust, sprinkle it
>> around, and call it a day.
>>
>> My primary concern is about cognitive burden.  IFRAMEs are a known
>> concept.  They separate conceptually separate content into separate
>> documents with separate globals.  If I
>> querySelectorAll()/getElementById() in my main page, I know that I'm not
>> accidentally going to pull out an attacker-provided element.  As
>> currently proposed, it sounds like everyone hacking on a page using
>> SafeNode would need to be aware of this.  Now, obviously,
>> querySelector*/etc. can be fixed by spec, but then what has been created
>> sounds a lot like the Shadow DOM.  And then the question is why aren't
>> we just using the Shadow DOM instead of inventing new things?
>>
>>
>> The reason I brought up use-cases is because, to my mind/experience,
>> there are basically two main scenarios with user/third-party-authored
>> content:
>>
>> 1) Bite-size: Rich-text mark-up for text messages, tweets, forum posts,
>> etc.  Markup exists for some combination of user self-expression and
>> semantics (quoting, MathML, etc.).
>>
>> 2) Meal-size: The user-content is a document in its own right.  For
>> example, HTML mails with ads and custom styling.
>>
>> Reading between the lines of what you are saying about IFRAMEs, it
>> sounds like the use-case you have in mind is closer to "bite-size", but
>> you are engineering functionality for "meal-size".  I think the
>> bite-size case is arguably best-addressed by an HTML sanitizer using a
>> white-list.  Safety can be provided/guaranteed and you get all of your
>> non-rectangular styling goals, etc.  And I think the meal-size case
>> should be addressed with the existing platform solution, IFRAMEs.
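As a rough illustration of the whitelist approach for the bite-size case, the sketch below sanitizes a tiny, already-parsed markup tree. The MarkupNode type, the allowed tag and attribute lists, and the URL check are all assumptions chosen for this example; a real sanitizer would operate on the browser's own parse of the markup and would need a far more carefully reviewed list.

```typescript
// A deliberately small tree model standing in for parsed markup.
// Hypothetical type, not a DOM interface.
interface MarkupNode {
  tag: string; // "#text" for text nodes
  attrs: { [name: string]: string };
  text: string; // only meaningful for "#text" nodes
  children: MarkupNode[];
}

// Example whitelist (an assumption for this sketch).
const ALLOWED_TAGS: string[] = ["#text", "p", "b", "i", "em", "strong", "blockquote", "a"];
const ALLOWED_ATTRS: { [tag: string]: string[] } = { a: ["href"] };

// Only plain http(s) links survive; javascript:, data:, etc. are rejected.
function isSafeUrl(url: string): boolean {
  return /^https?:\/\//i.test(url.trim());
}

// Returns a cleaned copy, or null if the node itself is not whitelisted
// (in which case its whole subtree is dropped).
function sanitize(node: MarkupNode): MarkupNode | null {
  if (ALLOWED_TAGS.indexOf(node.tag) === -1) return null;
  const allowedAttrs = ALLOWED_ATTRS[node.tag] || [];
  const attrs: { [name: string]: string } = {};
  for (const name of Object.keys(node.attrs)) {
    if (allowedAttrs.indexOf(name) === -1) continue;
    if (name === "href" && !isSafeUrl(node.attrs[name])) continue;
    attrs[name] = node.attrs[name];
  }
  const children: MarkupNode[] = [];
  for (const child of node.children) {
    const clean = sanitize(child);
    if (clean !== null) children.push(clean);
  }
  return { tag: node.tag, attrs: attrs, text: node.text, children: children };
}

// Small helpers for building example trees.
function el(tag: string, attrs: { [name: string]: string }, children: MarkupNode[]): MarkupNode {
  return { tag: tag, attrs: attrs, text: "", children: children };
}
function txt(s: string): MarkupNode {
  return { tag: "#text", attrs: {}, text: s, children: [] };
}
```

Note that everything not explicitly listed is removed, which is the property that makes a whitelist burdensome to keep current as the platform grows.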
>>
>> And meal-size wise, it's worth calling out that IFRAMEs can be important
>> for usability given the rise of mobile devices and the need to
>> pinch-and-zoom.  If the document has its own idea of what a reasonable
>> font-size is, users may absolutely need to zoom the document.  While a
>> "transform: scale" hack can be applied at any level of the DOM, arguably
>> it's cleaner and there can be more benefits to doing it in an iframe
>> in terms of allowing the browser engine to understand and optimize
>> what's going on[1].
>>
>>> In contrast, it should be possible to create Safe Nodes on-the-fly
>>> whenever you want to safely render a little snippet of untrusted
>>> markup.  You mostly don't need to think through all these
>>> considerations, just create a Safe Node and add it into the DOM.
>>> Configuration options are available, but they're tuned to the use
>>> cases you might care about and they don't create much of an
>>> opportunity for shooting yourself in the foot.
>>
>> This seems like a great proposal for a custom-element implementation
>> using existing libraries or new libraries that could be built out.
>> That gives easy-to-use safety with the existing platform implementation,
>> and webdevs are saved from the additional cognitive burden of possible
>> new platform magic behavior.
>>
>> For example <SafeNode><template><muahaha>I better have been injected via
>> createElement or realllly safely quoted into the
>> document.</muahaha></template></SafeNode>.
>>
>> SafeNode could even try and be all things to all people by having the
>> sanitizer library do a speculative sanitizing pass.  If it encounters
>> simple markup meeting all whitelist requirements, it can perhaps simply
>> inject the elements directly into the document.  If more extensive
>> features are used it can place the content in a shadow DOM, perhaps
>> slurping an explicit scoped stylesheet into place or something.  And if
>> it encounters something potentially dangerous but allowed, it spins up
>> an iframe.  If the SafeNode creator knows they want iframes, they can
>> specify that as an attribute and skip directly to that.
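The tiered strategy described in the paragraph above could be sketched roughly as follows. The ScanResult fields, the mode names, and the forceIframe parameter are hypothetical names introduced for this example; the sketch only captures the decision order, not the speculative sanitizing pass itself.

```typescript
// The three containment levels mentioned above.
type RenderMode = "inline" | "shadow-dom" | "iframe";

// Hypothetical summary produced by a speculative sanitizing pass.
interface ScanResult {
  // True when every element and attribute met the whitelist.
  usesOnlyWhitelistedMarkup: boolean;
  // True when something potentially dangerous but permitted was seen.
  usesDangerousButAllowedFeatures: boolean;
}

// Pick the weakest containment that is still appropriate.
// The dangerous-features check runs first so that conflicting scan
// results fall back to the strongest isolation.
function chooseRenderMode(scan: ScanResult, forceIframe: boolean): RenderMode {
  if (forceIframe) return "iframe"; // creator asked for an iframe up front
  if (scan.usesDangerousButAllowedFeatures) return "iframe";
  if (scan.usesOnlyWhitelistedMarkup) return "inline";
  return "shadow-dom"; // more extensive features, e.g. scoped styles
}
```

Checking for dangerous features before the whitelist result is a conservative choice made for this sketch, so an inconsistent scan never results in direct injection.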
>>
>>> Though I would agree that in a world with seamless IFRAMEs, it should
>>> be possible to build a decent Safe Node polyfill.
>>
>> Definitely agreed if s/polyfill/library/.  As in, I think it makes sense
>> to build the use-case solution first and only propose platform changes
>> if/when it proves reasonable/necessary.  And it's my assertion that the
>> platform changes can likely be made to existing web platform features.
>> But again, you may have use-cases in mind that I'm not conceiving of.
>>
>> Andrew
>>
>> 1: This is wildly speculative on my part other than the fact that meta
>> viewport is a thing
>> (https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meta#Attributes).
>>  The most recent time we rev'ed the Firefox OS mail app's HTML display,
>> Gecko only supported Asynchronous-Pan-and-Zoom (APZ) at the top level
>> (e.g., the email app itself), so we were faking things by using "transform:
>> scale(blah)" on our body iframes to zoom the HTML mail as part of the
>> enclosing page context.  (So no scrolling inside the iframe, the
>> iframe's effective size was always the same as its scroll size.)  The
>> APZ/Compositor got very good at building and optimizing the layer tree
>> so that only the visible portion of the iframe was rendered and
>> painted... most of the time.  If we could have just tagged the iframe as
>> <iframe zoomable-doc seamless /> then I think everyone would have been
>> much happier.
>>
Received on Tuesday, 26 January 2016 21:54:29 UTC