- From: Hallvord R. M. Steen <hallvord@opera.com>
- Date: Tue, 17 May 2011 12:41:30 +0900
- To: public-webapps <public-webapps@w3.org>, "Daniel Cheng" <dcheng@chromium.org>
On Thu, 05 May 2011 06:46:55 +0900, Daniel Cheng <dcheng@chromium.org> wrote:

> There was a recent discussion involving directly exposing the HTML
> fragment in a paste to a page, since we're doing the parsing anyway for
> security reasons. I have some concerns regarding
> http://www.w3.org/TR/clipboard-apis/#cross-origin-copy-paste-of-source-code
> though.
>
> From my understanding, we are trying to protect against [1] hidden data
> being copied without a user's knowledge and [2] XSS via pasting hostile
> HTML. In my opinion, the algorithm as written is either going to remove
> too much information or not enough. If it removes too much, the HTML
> paste is effectively useless to a client app. If it doesn't remove
> enough, then the client app is going to have to sanitize the HTML itself
> anyway.

FWIW, my main concern was the hidden data aspect, because it can be
abused for cross-site request forgery if a malicious site, by getting the
user to copy and paste, gains access to anti-CSRF form tokens and such. I
*intend* to leave some processing of the HTML to the client application,
for example the removal of third-party application-specific or
browser-specific CSS properties.

I see that Chrome applies different security policies depending on
whether the content is read by JavaScript (getData('text/html') style) or
inserted directly. You do some extra work to avoid XSS, such as removing
on* event listener attributes and href=javascript:, when content is
inserted directly (you also remove some browser-specific elements and
class names). This sort of clean-up and processing on direct data
insertion by the user agent is not really in scope for the events spec
IMO. However, for getData('text/html') it seems you do no clean-up at
all, not even for cross-origin paste. Implementing the current spec would
thus require that you tighten your existing security policy.
Will you consider doing so, or would you rather argue for removal of any
spec-mandated clean-up of cross-origin source code?

> I would argue that we should primarily be trying to prevent [1] and
> leave it up to web pages to prevent [2].

Chrome currently does neither for the getData() case, as far as I can
tell.

> [2] is no different than using data from any other untrusted source,
> like dragging HTML or data from an XHR. It doesn't make sense to
> special-case HTML pastes.

"Using data" is not the only threat model: limiting the damage potential
when the page you paste into is malicious is harder. However, there is
some overlap in the strategies we might use. For example, event
attributes are certainly hidden data, might contain secrets, and might
cause XSS attacks, so you might argue for their removal based on both
abuse scenarios, though I think [2] is the more relevant threat.

> In order to achieve [1], the algorithm merely needs to be:
> - Remove HTML comments, script, input type=hidden, and all other
>   elements that have no effect on layout (display: none). Possibly
>   remove applet as well.
> - Remove event handlers, data- and form action attributes.
> - Blanking input type=password elements.

So you still suggest removing event handlers, even though this is
primarily about your case [2]?

> To me, it doesn't make sense to remove the other elements:
> - OBJECT: Could be used for SVG as I understand.

OBJECT is considered a form element, so it might have hidden data
associated with it. It can also contain plugin content that could inject
scripts and be used for XSS attacks. It may be too far-fetched or
draconian to remove it, though. (SVG is rich enough to be its own can of
worms, by the way.)

> - FORM: Essentially harmless once the action attribute is cleared.

Agree. I've changed the spec to allow FORM but remove @action.

> - INPUT (non-hidden, non-password): Content is already available via
>   text/plain.

An input's @name attribute is basically hidden data the user will not be
aware of pasting. I'm not sure how much of a threat this is, but we
should give it some thought.

> - TEXTAREA: See above.

Ditto :)

> - BUTTON, INPUT buttons: Most of the content is already available via
>   text/plain. We can scrub the value attribute if there is concern about
>   that.

The same point about @name and the principle of hidden data applies
here. However, I can easily be convinced that violating user expectations
as little as possible is more important than taking this principle to its
extreme consequences ;-) Perhaps other people would like to chime in
here?

> - SELECT/OPTION/OPTGROUP: See above.
>
> The draft also does not mention how EMBED elements should be handled.

Any thoughts on this?

>> Finally:
>> If a script calls getData('text/html'), the implementation supports
>> pasting HTML, and the data available on the clipboard is from a
>> different origin, the implementation must sanitize the content by
>> following these steps:
> Should this sanitization be done during a copy as well to prevent data a
> paste in a non-conforming browser from pasting unexpected things?

No, I don't think so. If the content will be pasted into an application
that doesn't support scripting and/or isn't from an untrusted origin, for
example a typical desktop word processing app, the threats we are trying
to handle don't really apply.

-- 
Hallvord R. M. Steen, Core Tester, Opera Software
http://www.opera.com http://my.opera.com/hallvors/
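
As a rough illustration of the reduced algorithm quoted above (remove comments, script and hidden inputs; strip event handlers, data- and form action attributes; blank password fields), here is a minimal Python sketch. It is only an approximation under stated assumptions, not spec text and not any browser's implementation: the names PasteSanitizer and sanitize are hypothetical, it uses the standard library's html.parser rather than a real DOM, and it does not attempt the "no effect on layout (display: none)" rule, which would need computed style.

```python
from html import escape
from html.parser import HTMLParser


class PasteSanitizer(HTMLParser):
    """Hypothetical sketch: re-serialise HTML while dropping hidden data."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []           # pieces of the sanitized markup
        self.in_script = False  # True while inside a <script> element

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            return
        if self.in_script:
            return
        input_type = (dict(attrs).get("type") or "").lower()
        if tag == "input" and input_type == "hidden":
            return  # hidden inputs are dropped entirely
        kept = [tag]
        for name, value in attrs:
            # Assumption: any attribute starting with "on" is an event handler.
            if name.startswith("on") or name.startswith("data-"):
                continue
            if tag == "form" and name == "action":
                continue
            if tag == "input" and input_type == "password" and name == "value":
                value = ""  # blank the password, keep the field
            if value is None:
                kept.append(name)
            else:
                kept.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(kept) + ">")

    def handle_startendtag(self, tag, attrs):
        # Treat <br/> like <br>; avoids emitting a bogus closing tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
            return
        if not self.in_script:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_script:
            self.out.append(escape(data, quote=False))

    # handle_comment is deliberately not overridden: the base class
    # ignores comments, so they never reach the output.


def sanitize(html_fragment: str) -> str:
    parser = PasteSanitizer()
    parser.feed(html_fragment)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize('<form action="/pay"><p onclick="x()">Hi</p></form>'))
```

Note that the sketch leaves @name in place, so it does not settle the open question above about whether form control names count as hidden data.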
Received on Tuesday, 17 May 2011 03:42:09 UTC