[whatwg] registerProtocolHandler() whitelist from Ian Hickson on 2011-08-23 (public-whatwg-archive@w3.org from August 2011)

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 23 Aug 2011 22:31:14 +0000 (UTC)
Message-ID: <Pine.LNX.4.64.1108232139160.32136@ps20323.dreamhostps.com>
On Tue, 12 Apr 2011, Lachlan Hunt wrote:
>
> We are investigating registerProtocolHandler and have been discussing the
> need for a blacklist of protocols to forbid.

I've added an open-ended whitelist to the spec.


On Tue, 12 Apr 2011, Wilhelm Joys Andersen wrote:
> 
> > * view-source: (Mozilla, Chrome)
> 
> This might have a valid use case in web-based editors like ACE:
> 
>  http://ajaxorg.github.com/ace/build/editor.html
> 
> (At the cost of not being able to edit and reload the cached version of 
> the page.)

I haven't added this one currently.


On Tue, 19 Apr 2011, Ojan Vafai wrote:
> On Tue, Apr 19, 2011 at 10:33 AM, Ian Hickson <ian at hixie.ch> wrote:
> >
> > I haven't updated the spec yet, but it strikes me that maybe what we 
> > should do instead is have a whitelist of protocols we definitely want 
> > to allow (e.g. mailto:), and define a common prefix for protocols that 
> > are used with this feature, in a similar way to how with XHR we've 
> > added Sec-* as a list of headers _not_ to support.
> >
> > So e.g. we could whitelist any protocol starting with "web+" and then 
> > register that as a common extension point for people inventing 
> > protocols for use with this feature, so that people writing OS-native 
> > apps would know that if they used a protocol with that prefix it's 
> > something that any web site could try to take over.
> >
> > I'd be curious about people's opinions on that matter.
> >
> > (If we did this, the whitelist may have to be updated occasionally to 
> > add new protocols that people invented that we think are fine to be 
> > overridden, but that are not "web+"-prefixed.)
> 
> This seems like the right approach. Even if we blacklist correctly now, 
> needing to remember to blacklist each new protocol is too risky. A 
> whitelist somewhat limits the potential for people using 
> registerProtocolHandler in unexpected useful ways, but it still meets 
> the primary use cases.

I've done this.


On Tue, 19 Apr 2011, Lachlan Hunt wrote:
> 
> Other protocols we should probably also whitelist:
> 
> irc, sms, smsto, tel.

I've added these.


> I'm also curious how we could handle ISBN URNs, like:
> 
>   urn:isbn:0-395-36341-1
> 
> That would be useful to have a web service that could look up the ISBN 
> and direct users to information about the book, or to an online store.
> 
> As currently specified, services have to register a handler for "urn", 
> even if they only handle ISBN URNs.  The other alternative would be to 
> mint a new web+isbn scheme, which seems suboptimal.

I've added urn:. I don't think it makes much sense to have a scheme that 
has a second-level registration scheme; why not just have top-level 
schemes? So I haven't done anything special for urn:.


On Wed, 20 Apr 2011, Brett Zamir wrote:
> 
> Maybe registerProtocolHandler() could take a function as an extra 
> argument to let the application determine whether it wishes to handle 
> the protocol event, internally using e.preventDefault(), 
> e.stopPropagation(), or something similar to indicate that it has 
> successfully handled the case, and pass the buck to let other protocol 
> handlers be checked in order of user preference otherwise.

Having script contexts survive the browser session would be really weird.


> Now that it seems there is momentum on resolving the URN and custom 
> (pseudo-)namespacing issue (I think "x-" might be nice to continue the 
> tradition, though "web" seems fine also if real namespaces will not be 
> used), can we please put back on the table the ideas of:
> 
> 1) adding to <a/> an attribute "uris" (for trying alternatives first, with
> greater precedence than "href")
>
> 2) adding to <a/> an attribute "fallbackURIs" (for lesser precedence 
> than "href", e.g., so a browser might expose these URIs only when the 
> link was right-clicked)
>
> 3) adding an event to listen for the user refusing or the browser not 
> supporting a protocol (even if this can be done with try-catches).
> 
> ...so that people can actually begin experimenting with 
> registerProtocolHandler() rather than expecting content authors to make 
> links which may lead to nowhere for some of their users?

Can you elaborate on the concrete use cases you think would need this?


On Fri, 22 Apr 2011, Michael A. Puls II wrote:
> 
> Besides mailto, these should be white-listed:
> 
> mms
> nntp

Added.


> rtsp

I haven't added this one. It's not clear what it would mean for a Web app 
to hook into this one. Surely browsers should just support it natively?


On Fri, 22 Apr 2011, timeless wrote:
> 
> news

Added.


On Fri, 3 Jun 2011, James Kozianski wrote:
>
> webcal should also be whitelisted.

Added.


On Tue, 19 Apr 2011, Wilhelm Joys Andersen wrote:
> 
> When playing with registerProtocolHandler() last week, I noticed that 
> the following constructs are possible:
> 
>  navigator.registerProtocolHandler("mail.google.com",
>    "http://evilsite.tld/%s", "Google Mail");
> 
>  navigator.registerProtocolHandler("192.168.1.1",
>    "http://evilsite.tld/%s", "D-Link Wireless Router");
> 
> According to the URI spec[1], both "mail.google.com" and
> "192.168.1.1" are valid URL schemes:
> 
>   "Scheme names consist of a sequence of characters beginning
>   with a letter and followed by any combination of letters,
>   digits, plus ("+"), period ("."), or hyphen ("-")."
> 
> After running the lines of script above, typing any of the
> following URLs will lead the user to evilsite.tld:
> 
>   mail.google.com:80/mail/
>   192.168.1.1:80
> 
> The use of confusing URLs to trick the user into visiting a malicious 
> site is nothing new. The difference this time is that the URLs above 
> would trick even me, and I'm not particularly prone to phishing.
> 
> Using this technique to trick users would require an attacker to bypass 
> two obstacles:
> 
>  * To permanently add "mail.google.com" as a scheme pointing to
>    evilsite.tld, user approval in two separate dialogs is
>    required in both Firefox and my internal Opera build.
> 
>  * If the user's web browser keeps the address field visible
>    at all times, the user may notice that they are redirected
>    to evilsite.tld once the URL has been interpreted by the
>    browser.
> 
> Despite this, we would prefer to err on the side of caution here. Our 
> experience with other warning dialogs indicates that users don't 
> necessarily read or understand what they approve, and phishing schemes 
> with far cruder URLs (paypal.com.evilsite.com) succeed surprisingly 
> often.
> 
> To save ourselves (and our users) from possible future headaches, we 
> have decided to disallow the use of dots in the protocol argument of 
> registerProtocolHandler().
> 
> Of the IANA-registered URL schemes[2], only the following contain dots:
> 
>  iris.beep
>  iris.xpc
>  iris.xpcs
>  iris.lws
>  soap.beep
>  soap.beeps
>  xmlrpc.beep
>  xmlrpc.beeps
>  z39.50r
>  z39.50s
> 
> I don't see any clear use cases for overriding any of the above in a web 
> browser.
> 
> Opera will still interpret URLs in accordance with the URI spec, but 
> registerProtocolHandler() may only override the subset of URL schemes 
> containing alphanumeric characters, "+" and "-".
> 
> I suggest the same restriction is added to the HTML specification.

Done. Only specific whitelisted values plus those consisting of web+ 
followed by characters a-z only are supported now.
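
As a rough illustration of that rule, here is a minimal sketch of the 
check. It is not the spec's algorithm. The scheme list is simply the set 
of schemes named as added in this message, isAllowedScheme() is a 
hypothetical helper rather than a browser API, and both the 
case-insensitive comparison and the requirement of at least one letter 
after "web+" are assumptions.

```javascript
// Sketch only: schemes collected from this thread, not copied from the spec.
const WHITELISTED_SCHEMES = new Set([
  "mailto", "irc", "sms", "smsto", "tel", "urn",
  "mms", "nntp", "news", "webcal",
]);

// Assumption: "web+" must be followed by one or more letters a-z.
const WEB_PLUS_SCHEME = /^web\+[a-z]+$/;

// Hypothetical helper; not part of any browser API.
function isAllowedScheme(scheme) {
  // Assumption: schemes are compared case-insensitively.
  const normalized = String(scheme).toLowerCase();
  return WHITELISTED_SCHEMES.has(normalized) ||
         WEB_PLUS_SCHEME.test(normalized);
}

// Print the verdict for a few schemes discussed in the thread.
const samples = [
  "mailto", "web+isbn", "mail.google.com",
  "192.168.1.1", "localhost", "view-source", "rtsp",
];
for (const scheme of samples) {
  console.log(scheme, isAllowedScheme(scheme));
}
```

Under these assumptions, "mailto" and "web+isbn" pass, while the dotted 
names, "localhost", "view-source" and "rtsp" are all rejected, since none 
of them is in the list or matches the "web+" pattern.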


On Thu, 21 Apr 2011, Aryeh Gregor wrote:
>
> It was pointed out on IRC 
> <http://krijnhoetmer.nl/irc-logs/whatwg/20110415#l-734> that it would 
> make sense to also ban the string "localhost", as the only common domain 
> name that contains no dots.

I have made sure it is not whitelisted.


On Thu, 9 Jun 2011, rektide wrote:
>
> I just got wind of [...] Hixie's comments in reply to a thread on 
> blacklists for registerProtocolHandler[1].  In it, he proposes a 
> whitelist of /^web\+.[:somethingorother:]+/.
> 
> Ian mentions 'that people writing OS-native apps would know that if they 
> used a protocol with that prefix it's something that any web site could 
> try to take over', but this has some issues:
> 
> 1. The current use case for registerProtocolHandler is intra-page.  For 
> one example, here's the MDC docs: "Note: Web sites may only register 
> protocol handlers for themselves. For security reasons, it's not 
> possible for an extension or web site to register protocol handlers 
> targeting other sites."
> 
> 2. Someone who wishes to register a 'web' protocol for their own usage 
> ought be forced to consider that this protocol may not necessarily 
> remain in their own purview.
> 
> 3. It forces syntactical cruft upon people wishing to exercise this 
> capability, and that cruft makes website handled protocols less likely 
> to be used, to look cheap, and to be regarded as second class citizen of 
> the protocol world.  Tim Bray has already lamented enforcing the // upon 
> the world, and if web+ protocols take off this will exacerbate his two 
> character mistake by another four oh-so-valuable characters.  We ought 
> not double the obvious + preventable mistakes of the past.
> 
> 4. Whitelisting seems fundamentally 'anti-web' by enforcing only what is 
> out there already.
> 
> I strongly support the notion that web pages ought be able to provide 
> their own content & protocol handlers - especially in an OS native 
> fashion - and it strikes me as unwieldy to place this ^web\+[:soo:]+ 
> restriction on this extension point.

How do you propose to leave the mechanism open-ended yet secure if we do 
not have a whitelist with a well-known extension point like "web+"?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 23 August 2011 15:31:14 UTC