Concerns regarding cross-origin copy/paste security

There was a recent discussion about directly exposing the HTML fragment
in a paste to a page, since we're parsing it anyway for security
reasons. I have some concerns regarding the sanitization algorithm as
currently drafted.

From my understanding, we are trying to protect against [1] hidden data
being copied without a user's knowledge and [2] XSS via pasting hostile
HTML. In my opinion, the algorithm as written is either going to remove too
much information or not enough. If it removes too much, the HTML paste is
effectively useless to a client app. If it doesn't remove enough, then the
client app is going to have to sanitize the HTML itself anyway.

I would argue that we should primarily be trying to prevent [1] and leave it
up to web pages to prevent [2]. [2] is no different than using data from any
other untrusted source, like dragging HTML or data from an XHR. It doesn't
make sense to special-case HTML pastes.
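
To illustrate, a page that treats pasted markup like any other untrusted
string could run it through the same routine it already uses elsewhere. Below
is a minimal sketch in Python; render_untrusted is a hypothetical helper, and
escaping everything to inert text is only the simplest possible policy, not
something the draft specifies.

```python
from html import escape


def render_untrusted(fragment: str) -> str:
    """Turn untrusted markup into inert text, whatever its source."""
    # Hypothetical policy: escape everything so nothing is parsed as HTML.
    return escape(fragment, quote=True)


# The same call would cover a paste, a drop, or an XHR response body.
pasted = '<img src=x onerror="alert(1)">'
print(render_untrusted(pasted))
```

A page that wants to keep some markup would swap in its own allowlist-based
filter, but the point stands: the defense lives in the page and does not
depend on where the string came from.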

To achieve [1], the algorithm merely needs to:
- Remove HTML comments, script, input type=hidden, and all other elements
that have no effect on layout (display: none). Possibly remove applet as
well.
- Remove event handlers, data- attributes, and form action attributes.
- Blank input type=password elements.
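
As a rough illustration of the steps above, here is a minimal sketch in Python
using the standard html.parser module. The names PasteSanitizer and
sanitize_for_paste are hypothetical. Two things in it are assumptions beyond
the list: the display: none check only looks at an inline style attribute (a
browser would consult computed style instead), and formaction is dropped
alongside action.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never have an end tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class PasteSanitizer(HTMLParser):
    """Hypothetical filter for hidden data only; it is not an XSS filter."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []          # pieces of the sanitized fragment
        self.skip_tag = None   # name of the dropped element we are inside
        self.skip_depth = 0    # nesting depth of same-named elements
        self.in_style = False  # <style> text is passed through unescaped

    @staticmethod
    def _is_hidden(tag, attr_map):
        if tag == "script":
            return True
        if tag == "input" and (attr_map.get("type") or "").lower() == "hidden":
            return True
        # Assumption: only an inline "display: none" is detected.
        style = "".join((attr_map.get("style") or "").lower().split())
        return "display:none" in style

    def handle_starttag(self, tag, attrs):
        if self.skip_tag:
            if tag == self.skip_tag:
                self.skip_depth += 1
            return
        attr_map = dict(attrs)
        if self._is_hidden(tag, attr_map):
            if tag not in VOID:
                self.skip_tag, self.skip_depth = tag, 1
            return
        is_password = (tag == "input"
                       and (attr_map.get("type") or "").lower() == "password")
        parts = [tag]
        for name, value in attrs:
            if name.startswith("on") or name.startswith("data-"):
                continue  # event handlers and data- attributes
            if name in ("action", "formaction"):
                continue  # form action; formaction is an added assumption
            if is_password and name == "value":
                value = ""  # blank the password, keep the field
            parts.append(name if value is None
                         else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + ">")
        self.in_style = tag == "style"

    def handle_endtag(self, tag):
        if self.skip_tag:
            if tag == self.skip_tag:
                self.skip_depth -= 1
                if self.skip_depth == 0:
                    self.skip_tag = None
            return
        if tag == "style":
            self.in_style = False
        if tag not in VOID:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_tag:
            self.out.append(data if self.in_style
                            else escape(data, quote=False))

    # handle_comment is not overridden: the base class ignores comments,
    # so they never reach the output.


def sanitize_for_paste(fragment: str) -> str:
    parser = PasteSanitizer()
    parser.feed(fragment)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

For example, given a paragraph with an onclick handler and a comment, followed
by a script, a display: none span, and a form containing a hidden input and a
password input, this sketch keeps the paragraph text, the form, and the
password field (with an empty value), and drops everything else. OBJECT,
TEXTAREA, and buttons pass through untouched.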

To me, it doesn't make sense to remove the other elements:
- OBJECT: Could be used for SVG as I understand.
- FORM: Essentially harmless once the action attribute is cleared.
- INPUT (non-hidden, non-password): Content is already available via
text/plain.
- TEXTAREA: See above.
- BUTTON, INPUT buttons: Most of the content is already available via
text/plain. We can scrub the value attribute if there is concern about that.

The draft also does not mention how EMBED elements should be handled.

The draft currently says:

> If a script calls getData('text/html'), the implementation supports pasting
> HTML, and the data available on the clipboard is from a different origin,
> the implementation must sanitize the content by following these steps:

Should this sanitization also be done during a copy, so that a paste in a
non-conforming browser cannot insert unexpected data?


