Re: Concerns regarding cross-origin copy/paste security from Ryosuke Niwa on 2011-05-12 (public-webapps@w3.org from April to June 2011)

From: Ryosuke Niwa <rniwa@webkit.org>
Date: Wed, 11 May 2011 19:01:42 -0700
To: Daniel Cheng <dcheng@chromium.org>
Cc: public-webapps <public-webapps@w3.org>, "Hallvord R. M. Steen" <hallvord@opera.com>
Message-ID: <BANLkTikx_qBFBuaW9t74Of3yBfo__tLDSA@mail.gmail.com>

On Wed, May 4, 2011 at 2:46 PM, Daniel Cheng <dcheng@chromium.org> wrote:
>
> From my understanding, we are trying to protect against [1] hidden data
> being copied without a user's knowledge and [2] XSS via pasting hostile
> HTML. In my opinion, the algorithm as written is either going to remove too
> much information or not enough. If it removes too much, the HTML paste is
> effectively useless to a client app. If it doesn't remove enough, then the
> client app is going to have to sanitize the HTML itself anyway.
>
> I would argue that we should primarily be trying to prevent [1] and leave
> it up to web pages to prevent [2]. [2] is no different than using data from
> any other untrusted source, like dragging HTML or data from an XHR. It
> doesn't make sense to special-case HTML pastes.
>

However, the fragment parsing algorithm as spec'ed in HTML5 already prevents
[2].  It removes event handlers, script elements, etc.
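
To make the kind of filtering described here concrete, below is a minimal
sketch in Python. It is an illustration only, not the HTML5 fragment parsing
algorithm and not any browser's implementation. The names PasteSanitizer and
sanitize are hypothetical, and the sketch assumes that dropping script
elements and on* attributes is the whole job (it ignores javascript: URLs,
comments, and everything else a real sanitizer has to consider).

```python
from html import escape
from html.parser import HTMLParser


class PasteSanitizer(HTMLParser):
    """Hypothetical filter: re-serializes markup without <script> or on* attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []           # serialized pieces of the sanitized markup
        self.in_script = False  # True while inside a <script> element

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            return
        # Drop event handler attributes such as onclick or onload.
        kept = [(k, v) for k, v in attrs if not k.startswith("on")]
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )
        self.out.append(f"<{tag}{rendered}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax: emit only a start tag, and skip <script/> entirely.
        if tag != "script":
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Script text is discarded; other text is re-escaped.
        if not self.in_script:
            self.out.append(escape(data, quote=False))


def sanitize(markup: str) -> str:
    parser = PasteSanitizer()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

For example, sanitize('<p onclick="steal()">Hi</p><script>alert(1)</script>')
keeps the paragraph and its text but drops the handler and the script.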

> To me, it doesn't make sense to remove the other elements:
> - OBJECT: Could be used for SVG as I understand.
> - FORM: Essentially harmless once the action attribute is cleared.
> - INPUT (non-hidden, non-password): Content is already available via
> text/plain.
> - TEXTAREA: See above.
> - BUTTON, INPUT buttons: Most of the content is already available via
> text/plain. We can scrub the value attribute if there is concern about that.
> - SELECT/OPTION/OPTGROUP: See above.
>

I'm also curious as to why these elements are being removed.  Hallvord?
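
For reference, the treatment Daniel proposes in the quoted list could be
sketched as a small per-element policy. This is a hedged Python illustration
under stated assumptions, not text from the spec: scrub_form_control and the
two type sets are hypothetical names, "button-like" inputs are assumed to
mean type button, submit, or reset, and scrubbing the value attribute is the
optional step he mentions, applied here unconditionally.

```python
from typing import Optional

Attrs = list[tuple[str, Optional[str]]]

# Assumption: which input types count as "buttons" and which as sensitive.
BUTTON_INPUT_TYPES = {"button", "submit", "reset"}
SENSITIVE_INPUT_TYPES = {"hidden", "password"}


def scrub_form_control(tag: str, attrs: Attrs) -> Optional[Attrs]:
    """Return the attributes to keep, or None if the element should be dropped."""
    tag = tag.lower()
    attr_map = {k.lower(): v for k, v in attrs}
    # An input without a type attribute behaves as a text field.
    input_type = (attr_map.get("type") or "text").lower()

    if tag == "form":
        # FORM: keep the element but clear the action attribute.
        return [(k, v) for k, v in attrs if k.lower() != "action"]
    if tag == "input" and input_type in SENSITIVE_INPUT_TYPES:
        # Hidden and password inputs are not kept.
        return None
    if tag == "button" or (tag == "input" and input_type in BUTTON_INPUT_TYPES):
        # Buttons: keep the element but scrub the value attribute.
        return [(k, v) for k, v in attrs if k.lower() != "value"]
    # TEXTAREA, SELECT, OPTION, OPTGROUP, OBJECT, other inputs: kept unchanged.
    return list(attrs)
```

Under this sketch a form keeps its controls but loses its action, which
matches the "essentially harmless" argument in the quoted text.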

> Should this sanitization be done during a copy as well, to prevent a
> paste in a non-conforming browser from pasting unexpected things?
>

We already do some of this in WebKit.  For example, we avoid
serializing non-rendered content.

- Ryosuke

Received on Thursday, 12 May 2011 02:10:41 UTC