- From: Tyler Close <tyler.close@gmail.com>
- Date: Fri, 6 Apr 2012 15:01:32 -0700
On Fri, Apr 6, 2012 at 2:35 PM, Ian Hickson <ian at hixie.ch> wrote:
> On Fri, 6 Apr 2012, Tyler Close wrote:
>> On Mon, Apr 2, 2012 at 4:39 PM, Ian Hickson <ian at hixie.ch> wrote:
>> > On Mon, 26 Sep 2011, Tyler Close wrote:
>> >>
>> >> I was recently experimenting with the registerProtocolHandler (RPH)
>> >> API and came across a couple of security gotchas that make it hard to
>> >> safely use the API. One of these is already known, but AFAICT, hasn't
>> >> been fixed yet. I haven't seen the other discussed yet.
>> >>
>> >> The Mozilla blog post that introduces the registerProtocolHandler API
>> >> makes use of window.parent.postMessage to send a response from the
>> >> RPH handler back to the client page.
>> >
>> > I presume it uses this in conjunction with an <a href=""> link with a
>> > target="" attribute to load the handler in an iframe.
>>
>> The client page loads the handler page using an iframe or a
>> window.open(). Either can work.
>>
>> >> In the example code, the targetOrigin for this postMessage invocation
>> >> is '*', while also noting that this is not secure. AFAICT, there is
>> >> no API that the intent handler can reliably use to determine the
>> >> correct targetOrigin for this postMessage invocation.
>> >
>> > How can the origin be anything other than the origin of the page that
>> > triggered the link?
>>
>> Exactly, but we need a way for the handler page to find out what that
>> origin is.
>>
>> A client page on origin A causes a navigation to a RPH URL (iframe or
>> window.open). The browser loads the user chosen RPH handler, which is
>> another web page from origin B. After the handler page loads, it wants
>> to send a return value back to the client page. How does the handler
>> page know the client page's origin is A? It needs to know this origin
>> string so that it can securely use postMessage to send the return value
>> back.
>> AFAICT, there is no existing API in the browser that lets the
>> handler page determine the client page's origin.
>
> Well if it's an iframe, the parent can't be anything but the original
> origin, as far as I can tell.

What happens if the handler sends the postMessage to "*", then the
parent is navigated? Will the postMessage be delivered or not?

> But in general, there's not expected to be any talking back. If you want
> something where the handler talks back to the page that provided the data,
> then you should use Web Intents. registerProtocolHandler() and
> registerContentHandler() are intended for things like mail clients
> (mailto:) or PDF viewers, which do not talk back. Indeed in the common use
> case, you just click the link and the entire browsing context gets
> replaced, so there's nothing to talk back _to_.

I was prompted to write the original email by a Mozilla blog post that
suggested talking back. It also seems bad for web APIs to break under
simple composition like this; especially when there's an easy fix
available.

>> Currently, the handler page can only specify "*" in the postMessage
>> invocation that sends the return value. If the client page is navigated
>> by an attacker, before the postMessage is done, the attacker can
>> intercept the return value. It's the same rationale used every time we
>> advise programmers against using '*' as the targetOrigin for a
>> postMessage() invocation.
>
> That rationale only applies when you're going from window to window, not
> when you're going from iframe to parent.
>
>
>> >> The second problem with RPH is that the handler page doesn't have a
>> >> way of reliably getting the URL of the content to be handled from the
>> >> browser. In order to work in offline scenarios, the RPH handler must
>> >> put the %s placeholder in the fragment of its handler's URL.
>> >
>> > It's not clear to me that it makes sense to have an offline protocol
>> > handler. What kind of protocol do you have in mind?
>>
>> For example, consider an offline web mail program. I click on a mailto:
>> link and want to compose a message in my web mail editor, queuing it to
>> be sent next time I'm online.
>>
>> RPH is a way for a web page to send data to a user determined
>> application. There will surely be many scenarios where offline
>> functionality is desirable.
>
> For such an example, you can just use a fallback section in the appcache
> manifest. (Or a fragment identifier, indeed.)

Right, the obvious thing to do is use the fragment identifier, but
that's got some security problems. With a small tweak we can make this
safe and easy.

>> >> Unfortunately, this means that other content in the browser could
>> >> modify the content URL before the handler reads it.
>> >
>> > Well, any content can load any URL, so it doesn't matter whether the
>> > URL is in the fragment identifier or the path or anything else,
>> > surely.
>>
>> It matters if the handler page assumes that the URL came from its parent
>> or opener. The parent and opener then engage in a postMessage
>> conversation where the parent knows it said one thing, but the handler
>> heard it saying something different, something chosen by the attacker.
>
> Why would a mail client talk back to its opener?

It might not, but some RPH handlers will. They've got a postMessage
API; they're going to use it. Let's make sure it's possible to use it
safely.

>> >> The intent handler sees a request coming from the victim page, but
>> >> with a content URL specified by the attacker. A related problem is
>> >> that the intent handler has no way to distinguish whether its URL was
>> >> loaded via the browser's RPH handling, or whether the client page
>> >> directly navigated to the intent handler's URL. Both of these
>> >> problems could be fixed by adding another readonly DOMString to the
>> >> API that contains the %s data for the RPH invocation.
>> >
>> > I don't understand why it matters how the URL was invoked.
>>
>> If the URL was invoked via RPH, then the handler page knows that the
>> user selected it for this action. The handler page also knows that any
>> arguments in the handler's URL (not in the RPH URL), were set by the
>> handler's origin and were not tampered with by the client page.
>>
>> For example, a web mail program might have two registered RPH handlers
>> for mailto: "https://example.org/?from=me at company&q=%s" and
>> "https://example.org/?from=me at personal&q=%s". The user has configured
>> their browser to send mailto links to their personal email editor. A
>> malicious client page could directly open the URL for the company email
>> editor. The web mail editor needs a way to detect when a client page is
>> trying to subvert the user's chosen preferences. So, an RPH handler
>> needs a way to know that it was loaded via the RPH dispatch. Once it
>> knows this, it can also trust that the arguments in the URL, such as
>> "from" in this case, were not tampered with by the client page.
>
> I don't understand the attack scenario. Sure, a Web page can open another
> Web page with arbitrary arguments. Why does it matter here?

Two reasons:

1. An RPH dispatch is different from a direct load because it
communicates a user choice to the RPH handler. I explained above how a
handler might use this information.
2. An RPH dispatch comes from the browser, so URL parameters can be
trusted; whereas they cannot be trusted in a load from another web
page.

With a small change, we can prevent a client page from faking an RPH
dispatch to a handler page.

--Tyler
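
To make the targetOrigin concern in this thread concrete, here is a minimal
TypeScript sketch of what a handler page could do if it had some trustworthy
way to learn the client page's origin. The names replyToClient, MessageTarget
and clientOrigin are illustrative assumptions, not part of any existing or
proposed API; how clientOrigin would actually be obtained is exactly the open
question discussed above.

```typescript
// Minimal shape of something we can post to. It mirrors the real
// Window.postMessage(message, targetOrigin) call, so a Window fits it.
interface MessageTarget {
  postMessage(message: unknown, targetOrigin: string): void;
}

// Hypothetical helper: send the handler's return value to the client page
// only when a concrete client origin is known. clientOrigin is an
// assumption; today the handler has no reliable source for it.
function replyToClient(
  client: MessageTarget,
  result: unknown,
  clientOrigin: string | null,
): boolean {
  // Refuse to fall back to "*": a wildcard delivers the message to whatever
  // document occupies the target window at delivery time.
  if (clientOrigin === null || clientOrigin === "*") {
    return false;
  }
  // With a concrete origin, the browser drops the message if the target
  // window has been navigated to a different origin in the meantime.
  client.postMessage(result, clientOrigin);
  return true;
}
```

In a handler page, the call would look like
replyToClient(window.parent, result, clientOrigin) for the iframe case, or
replyToClient(window.opener, result, clientOrigin) for the window.open()
case. The sketch declines to send anything when no origin is known, rather
than broadcasting the return value.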
Received on Friday, 6 April 2012 15:01:32 UTC