
Re: Concerns regarding cross-origin copy/paste security

From: Ryosuke Niwa <rniwa@webkit.org>
Date: Wed, 11 May 2011 19:01:42 -0700
Message-ID: <BANLkTikx_qBFBuaW9t74Of3yBfo__tLDSA@mail.gmail.com>
To: Daniel Cheng <dcheng@chromium.org>
Cc: public-webapps <public-webapps@w3.org>, "Hallvord R. M. Steen" <hallvord@opera.com>
On Wed, May 4, 2011 at 2:46 PM, Daniel Cheng <dcheng@chromium.org> wrote:
> From my understanding, we are trying to protect against [1] hidden data
> being copied without a user's knowledge and [2] XSS via pasting hostile
> HTML. In my opinion, the algorithm as written is either going to remove too
> much information or not enough. If it removes too much, the HTML paste is
> effectively useless to a client app. If it doesn't remove enough, then the
> client app is going to have to sanitize the HTML itself anyway.
> I would argue that we should primarily be trying to prevent [1] and leave
> it up to web pages to prevent [2]. [2] is no different than using data from
> any other untrusted source, like dragging HTML or data from an XHR. It
> doesn't make sense to special-case HTML pastes.

However, the fragment parsing algorithm as spec'ed in HTML5 already prevents
[2].  It removes event handlers, script elements, etc.

> To me, it doesn't make sense to remove the other elements:
> - OBJECT: Could be used for SVG as I understand.
> - FORM: Essentially harmless once the action attribute is cleared.
> - INPUT (non-hidden, non-password): Content is already available via
> text/plain.
> - TEXTAREA: See above.
> - BUTTON, INPUT buttons: Most of the content is already available via
> text/plain. We can scrub the value attribute if there is concern about that.

I'm also curious as to why these elements are being removed.  Hallvord?

> Should this sanitization be done during a copy as well, to prevent a
> paste in a non-conforming browser from pasting unexpected things?

We already do some of this stuff in WebKit.  For example, we avoid
serializing non-rendered contents.

- Ryosuke
Received on Thursday, 12 May 2011 02:10:41 UTC
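
For readers following the thread, the kind of scrubbing under discussion can be sketched roughly as below. This is only an illustration under stated assumptions, not the algorithm from the draft spec and not what WebKit does: the names `PasteScrubber` and `scrub` are hypothetical, and the policy (drop script elements, drop `on*` event-handler attributes, clear a form's `action`, drop hidden and password inputs) is one reading of the points raised above. It uses Python's standard `html.parser`, which is not an HTML5 fragment parser.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical policy for illustration only; not taken from the draft spec.
DROPPED_ELEMENTS = {"script"}
SECRET_INPUT_TYPES = {"hidden", "password"}


class PasteScrubber(HTMLParser):
    """Re-serializes HTML while dropping scripts and risky attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    @staticmethod
    def _keep_attr(tag, name):
        if name.startswith("on"):  # event handlers such as onclick
            return False
        if tag == "form" and name == "action":  # clear the form target
            return False
        return True

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED_ELEMENTS:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        input_type = (dict(attrs).get("type") or "").lower()
        if tag == "input" and input_type in SECRET_INPUT_TYPES:
            return  # content the user never saw; do not paste it
        rendered = "".join(
            f" {name}" if value is None
            else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if self._keep_attr(tag, name)
        )
        self.out.append(f"<{tag}{rendered}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax: emit the start tag only, never a stray end tag.
        if tag not in DROPPED_ELEMENTS:
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in DROPPED_ELEMENTS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def scrub(markup):
    """Return markup with scripts, handlers and hidden inputs removed."""
    parser = PasteScrubber()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


print(scrub('<p onclick="steal()">Hi</p><script>alert(1)</script>'))
```

Visible form controls are kept here, in line with the argument that their content is already exposed through text/plain; a stricter policy could also scrub `value` attributes.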
