- From: Daniel Cheng <dcheng@chromium.org>
- Date: Wed, 4 May 2011 14:46:55 -0700
- To: public-webapps <public-webapps@w3.org>
- Cc: "Hallvord R. M. Steen" <hallvord@opera.com>
- Message-ID: <BANLkTi=e3wogjNTQWfx+kRVqjmFRFewODQ@mail.gmail.com>
There was a recent discussion involving directly exposing the HTML fragment in a paste to a page, since we're doing the parsing anyway for security reasons. I have some concerns regarding http://www.w3.org/TR/clipboard-apis/#cross-origin-copy-paste-of-source-code though.

From my understanding, we are trying to protect against [1] hidden data being copied without a user's knowledge and [2] XSS via pasting hostile HTML. In my opinion, the algorithm as written is either going to remove too much information or not enough. If it removes too much, the HTML paste is effectively useless to a client app. If it doesn't remove enough, then the client app is going to have to sanitize the HTML itself anyway.

I would argue that we should primarily be trying to prevent [1] and leave it up to web pages to prevent [2]. [2] is no different than using data from any other untrusted source, like dragging HTML or data from an XHR. It doesn't make sense to special-case HTML pastes.

In order to achieve [1], the algorithm merely needs to be:
- Remove HTML comments, script, input type=hidden, and all other elements that have no effect on layout (display: none). Possibly remove applet as well.
- Remove event handlers, data- and form action attributes.
- Blank input type=password elements.

To me, it doesn't make sense to remove the other elements:
- OBJECT: Could be used for SVG as I understand.
- FORM: Essentially harmless once the action attribute is cleared.
- INPUT (non-hidden, non-password): Content is already available via text/plain.
- TEXTAREA: See above.
- BUTTON, INPUT buttons: Most of the content is already available via text/plain. We can scrub the value attribute if there is concern about that.
- SELECT/OPTION/OPTGROUP: See above.

The draft also does not mention how EMBED elements should be handled.
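
To make the reduced algorithm above concrete, here is a rough sketch of the three steps over a simplified tree. Everything named here is an assumption for illustration and not taken from the draft: the FragmentNode type stands in for a parsed HTML fragment, the hidden flag stands in for a computed "display: none", and sanitizeForPaste is a hypothetical name. It also treats any attribute whose name starts with "on" as an event handler, reads "form action attributes" as the action attribute on FORM, and removes applet even though that is only a "possibly" above.

```typescript
// Simplified stand-in for a parsed HTML fragment (hypothetical; a real
// implementation would walk the browser's own DOM instead).
type FragmentNode =
  | { kind: "text"; text: string }
  | { kind: "comment"; text: string }
  | {
      kind: "element";
      tag: string;
      attrs: Record<string, string>;
      hidden: boolean; // assumption: true when the element computes to display: none
      children: FragmentNode[];
    };

// Elements dropped outright. "applet" is only a "possibly" in the proposal.
const REMOVED_TAGS = new Set(["script", "applet"]);

// Returns a sanitized copy of the node, or null if the node should be removed.
function sanitizeForPaste(node: FragmentNode): FragmentNode | null {
  // Step 1: remove comments, script, input type=hidden, and non-rendered elements.
  if (node.kind === "comment") return null;
  if (node.kind === "text") return node;

  const tag = node.tag.toLowerCase();
  const type = (node.attrs["type"] || "").toLowerCase();
  if (REMOVED_TAGS.has(tag) || node.hidden) return null;
  if (tag === "input" && type === "hidden") return null;

  const attrs: Record<string, string> = {};
  for (const [rawName, value] of Object.entries(node.attrs)) {
    const name = rawName.toLowerCase();
    // Step 2: remove event handlers, data- attributes, and the form action.
    if (name.startsWith("on")) continue; // simplification: any on* attribute
    if (name.startsWith("data-")) continue;
    if (tag === "form" && name === "action") continue;
    // Step 3: blank password fields by dropping their value.
    if (tag === "input" && type === "password" && name === "value") continue;
    attrs[name] = value;
  }

  // Recurse into children and discard the ones that were removed.
  const children = node.children
    .map((child) => sanitizeForPaste(child))
    .filter((child): child is FragmentNode => child !== null);

  return { kind: "element", tag, attrs, hidden: false, children };
}

// Example: a form with a hidden token, a password field, and a visible field.
const pasted: FragmentNode = {
  kind: "element",
  tag: "form",
  attrs: { action: "https://example.com/submit", onsubmit: "track()" },
  hidden: false,
  children: [
    { kind: "comment", text: "internal note" },
    { kind: "element", tag: "input", attrs: { type: "hidden", value: "token" }, hidden: false, children: [] },
    { kind: "element", tag: "input", attrs: { type: "password", value: "secret" }, hidden: false, children: [] },
    { kind: "element", tag: "input", attrs: { type: "text", value: "visible", "data-id": "42" }, hidden: false, children: [] },
    { kind: "text", text: "Hello" },
  ],
};

console.log(JSON.stringify(sanitizeForPaste(pasted), null, 2));
```

Note that FORM, the visible INPUT, and its value all survive, which is the point of the proposal: only the data a user could not have seen is stripped.
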
Finally:

> If a script calls getData('text/html'), the implementation supports pasting HTML, and the data available on the clipboard is from a different origin, the implementation must sanitize the content by following these steps:

Should this sanitization be done during a copy as well, to prevent a paste in a non-conforming browser from inserting unexpected data?

Daniel

(resending from the right address, sorry for the spam Hallvord)
Received on Wednesday, 4 May 2011 21:47:19 UTC