Re: Concerns regarding cross-origin copy/paste security from Adam Barth on 2012-02-07 (public-webapps@w3.org from January to March 2012)

From: Adam Barth <w3c@adambarth.com>
Date: Tue, 7 Feb 2012 15:05:54 -0800
To: "Hallvord R. M. Steen" <hallvord@opera.com>
Cc: public-webapps <public-webapps@w3.org>, Daniel Cheng <dcheng@chromium.org>, Ryosuke Niwa <rniwa@webkit.org>
Message-ID: <CAJE5ia8Y_Yvu_4MiaYdUGVz3Z7FVUZv8=RerNCRu+zpqqCwzbw@mail.gmail.com>

On Mon, May 16, 2011 at 8:41 PM, Hallvord R. M. Steen
<hallvord@opera.com> wrote:
> On Thu, 05 May 2011 06:46:55 +0900, Daniel Cheng <dcheng@chromium.org>
> wrote:
>
>> There was a recent discussion involving directly exposing the HTML fragment
>> in a paste to a page, since we're doing the parsing anyway for security
>> reasons. I have some concerns regarding
>>
>> http://www.w3.org/TR/clipboard-apis/#cross-origin-copy-paste-of-source-code
>> though.
>>
>> From my understanding, we are trying to protect against [1] hidden data
>> being copied without a user's knowledge and [2] XSS via pasting hostile
>> HTML. In my opinion, the algorithm as written is either going to remove too
>> much information or not enough. If it removes too much, the HTML paste is
>> effectively useless to a client app. If it doesn't remove enough, then the
>> client app is going to have to sanitize the HTML itself anyway.
>
> FWIW, my main concern was the hidden data aspect because it can be abused
> for cross-site request forgery if a malicious site, by getting the user to
> copy and paste, gets access to anti-CSRF form tokens and such.

That's certainly possible, but I don't think it's possible for us to
protect against the long tail of risks here.  In these sorts of cases,
it can be better for security to not implement a half-correct solution
and instead decide not to try to mitigate a particular risk.

> I *intend* to
> leave some processing of the HTML to the client application, for example the
> removal of third-party application-specific or browser-specific CSS
> properties.
>
> I see that Chrome applies different security policies depending on whether
> the content is read by JavaScript (getData('text/html')-style) or
> inserted directly. You do some extra work to avoid XSS, such as removing on*
> event listener attributes and href=javascript: when content is inserted
> directly (you also remove some browser-specific elements and class names).
> This sort of clean up and processing on direct data insertion by the
> user-agent is not really in scope for the events spec IMO.

That makes sense.  The risk here is somewhat different from what
you've articulated above.  Rather than trying to prevent information
leaks from the "source" of the copy to the "target" of the paste,
these checks aim to prevent the source from injecting script into the
target.
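
As a rough illustration of the kind of check being discussed (dropping
on* attributes and href=javascript: before content reaches the target),
here is a minimal sketch. It is not Chrome's implementation. The node
shape ({ tag, attrs, children }) and the names isScriptUrl and
stripScriptVectors are hypothetical, chosen only so the example runs
without a DOM.

```javascript
// Hypothetical node shape (an assumption for this sketch):
//   { tag: string, attrs: { name: value }, children: [node | string] }

// True when a URL value uses the javascript: scheme. Whitespace and
// control characters are removed first, which is deliberately stricter
// than what a browser's URL parser ignores.
function isScriptUrl(value) {
  const normalized = String(value).replace(/[\u0000-\u0020]/g, "").toLowerCase();
  return normalized.startsWith("javascript:");
}

// Returns a copy of the tree without on* attributes or javascript: hrefs.
function stripScriptVectors(node) {
  if (typeof node === "string") return node; // text passes through unchanged
  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const lower = name.toLowerCase();
    if (lower.startsWith("on")) continue; // drop event handler attributes
    if (lower === "href" && isScriptUrl(value)) continue; // drop script URLs
    attrs[name] = value;
  }
  return {
    tag: node.tag,
    attrs,
    children: (node.children || []).map(stripScriptVectors),
  };
}
```

Note that this is a blacklist: anything it does not know about, such as
a <style> element, passes through untouched.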

> However, for getData('text/html') it seems you do no clean-up at all, not
> for cross-origin paste either.

Correct.  The idea here is to have a secure default but still let a
sophisticated web application handle the complicated cases if it
wants to.  I just spoke with Ryosuke and Daniel, and we're considering
tightening up the default behavior somewhat to prevent injections of
<style> and other dangerous elements (probably by switching to a
whitelist).
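
To make the whitelist idea concrete, here is a minimal sketch under
stated assumptions. The element list, attribute list, and URL schemes
are placeholders rather than a proposal, and the node shape
({ tag, attrs, children }) and the name whitelistFilter are
hypothetical. Attribute names are assumed to be lowercase already.

```javascript
// Placeholder whitelist: element name -> attributes allowed on it.
const ALLOWED = {
  p: [], b: [], i: [], em: [], strong: [], span: [], br: [],
  ul: [], ol: [], li: [],
  a: ["href"],
};
// Placeholder scheme list; relative URLs are dropped by this sketch.
const SAFE_SCHEMES = ["http:", "https:", "mailto:"];

function hasSafeScheme(value) {
  const v = String(value).trim().toLowerCase();
  return SAFE_SCHEMES.some((scheme) => v.startsWith(scheme));
}

// Returns an array of nodes: [] when the element is dropped, the
// filtered children when it is unwrapped, or one cleaned element.
function whitelistFilter(node) {
  if (typeof node === "string") return [node];
  const tag = String(node.tag).toLowerCase();
  // <script> and <style> are removed together with their contents.
  if (tag === "script" || tag === "style") return [];
  const children = (node.children || []).flatMap(whitelistFilter);
  // Unknown elements are unwrapped so their text content survives.
  if (!Object.prototype.hasOwnProperty.call(ALLOWED, tag)) return children;
  const attrs = {};
  for (const name of ALLOWED[tag]) {
    const value = (node.attrs || {})[name];
    if (value === undefined) continue;
    if (name === "href" && !hasSafeScheme(value)) continue;
    attrs[name] = value;
  }
  return [{ tag, attrs, children }];
}
```

Because only listed names are copied over, on* attributes and
javascript: URLs disappear without being named explicitly, which is the
main attraction over a blacklist.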

> Implementing the current spec would thus
> require that you tighten your existing security policy. Will you consider
> doing so, or would you rather argue for removal of any spec-mandated
> clean-up of cross-origin source code?

IMHO, we shouldn't try to protect the "source" of the data, but we
should aim to protect the "target".  My understanding of your message
is that this would cause us to remove the text in this spec.  If we find a
good whitelist for protecting the target, that's probably worth
writing in a spec so that browsers can interoperate, but it doesn't
have to be this spec if you feel that this behavior is out of scope.

>> [2] is no different than using data from any
>> other untrusted source, like dragging HTML or data from an XHR. It doesn't
>> make sense to special-case HTML pastes.
>
> "Using data" is not the only threat model - limiting the damage potential
> when the page you paste into is malicious is harder. However, there is some
> overlap in the strategies we might use - for example, event attributes are
> certainly hidden data, might contain secrets, and might cause XSS attacks, so
> you might argue for their removal based on both abuse scenarios, though I
> think [2] is a more relevant threat.

The problem is that the tail of where sensitive information might
reside is long and thick, making these security measures only
partially effective, at best.

Adam

Received on Tuesday, 7 February 2012 23:10:09 UTC