Re: HTML5 proposes introduction of new family of URI schemes from Robin Berjon on 2012-01-23 (www-tag@w3.org from January 2012)

From: Robin Berjon <robin@berjon.com>
Date: Mon, 23 Jan 2012 23:40:49 +0100
To: Noah Mendelsohn <nrm@arcanedomain.com>
Cc: "www-tag@w3.org List" <www-tag@w3.org>, Paul Cotton <Paul.Cotton@microsoft.com>, Maciej Stachowiak <mjs@apple.com>, Sam Ruby <rubys@intertwingly.net>
Message-Id: <5280799A-55C8-4818-B130-AB4488007B54@berjon.com>

On Jan 19, 2012, at 20:37 , Noah Mendelsohn wrote:
> On 1/19/2012 11:41 AM, Robin Berjon wrote:
>> Well except that there are quite a few developers, yours truly included,
>> who really want this functionality. In fact, as indicated earlier, I
>> find that some parts of it don't go anywhere near far enough.
> 
> Robin: I'd be grateful if you could explain why using URI templates, or something similar, to pattern match on existing URIs isn't preferable to matching on URIs matching a special "web+" pattern.

I'd like to give some context for the snippet of mine that you quote. The functionality I'm speaking of is that of registerProtocolHandler(). To be entirely clear, that method is *not* limited to URIs with a "web+"-prefixed scheme. It registers a given URI as a handler for a given scheme, and that scheme can (as currently specified) be either one from a whitelist (which includes the likes of mailto, irc, etc.) or any scheme that starts with "web+".
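The scheme rule described above can be sketched roughly as follows. This is only an illustration, not the spec's algorithm: the function name isRegistrableScheme is hypothetical, the whitelist holds just the two schemes named above (the real list is longer), and the exact shape of a "web+" scheme (here, "web+" followed by one or more lowercase ASCII letters) is an assumption.

```typescript
// Illustrative subset only: the two whitelisted schemes named in the message.
// The actual whitelist contains more entries than these.
const WHITELISTED_SCHEMES = new Set<string>(["mailto", "irc"]);

// Assumption: a "web+" scheme is "web+" followed by one or more lowercase
// ASCII letters.
const WEB_PLUS_PATTERN = /^web\+[a-z]+$/;

// Hypothetical helper: returns true if a handler may be registered for the
// given scheme, i.e. it is whitelisted or carries the "web+" prefix.
function isRegistrableScheme(scheme: string): boolean {
  // Scheme names are case-insensitive, so compare in lowercase.
  const normalized = scheme.toLowerCase();
  return WHITELISTED_SCHEMES.has(normalized) || WEB_PLUS_PATTERN.test(normalized);
}
```

Under these assumptions, "mailto" and "web+recipe" would be accepted, while "http" and "itms" would be refused because they are neither whitelisted nor "web+"-prefixed.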

The overall functionality is important if we are to match native functionality on the Web. But as with all such bridges to native, it's important that it have a security model. Obviously, if you could trick users into having your page handle all of their traffic for http (or https) links, you'd have a very nasty exploit handy. Even without picking such an extreme example, you could for instance register yourself for the "itms:" scheme (which points to the iTunes Music Store and, IIRC, was the infamous origin of the Don't Mint Schemes Willy-Nilly principle), make a page that looks like iTunes, and ask for credit card numbers.

I suspect that a whitelist was selected because it is more conservative security-wise than a blacklist, and because there are probably quite a few non-registered schemes out there (like itms:) that could provide good attack vectors if a blacklist were used. But the problem with whitelists is their lack of evolvability. If you have a super cool new protocol, which is fully warranted (as per AWWW) in using a new scheme, you're not going to be able to use a Web handler for it if it's not in the whitelist. And waiting for the whitelist to be updated means jumping through standards hoops, waiting for implementations to catch up, etc.; in other words, it would take years. If instead you can easily indicate that your new scheme is "Safe for the Web™", then you can hit the ground running: anyone who's visited your website (or that of someone else implementing the new protocol) can start using it immediately. That's what "web+" is for.
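To make the "hit the ground running" scenario concrete, here is a rough sketch of what happens once a site has registered itself for a new "web+" scheme. It assumes the handler is registered as a URL template containing a "%s" placeholder that gets replaced with the escaped link URI; the helper name expandHandlerUrl, the example.com URL, and the "web+recipe" scheme are all hypothetical, and encodeURIComponent only approximates whatever escaping a browser actually applies.

```typescript
// Hypothetical helper: given a handler template registered for a scheme and
// the URI of a link the user followed, compute the page URL to navigate to.
function expandHandlerUrl(template: string, linkUri: string): string {
  // Assumption: the template must contain a "%s" placeholder.
  if (template.indexOf("%s") === -1) {
    throw new Error('handler template must contain "%s"');
  }
  // Escape the link URI so it can travel safely inside the handler URL.
  const escaped = encodeURIComponent(linkUri);
  // Replace only the first placeholder; a replacer function avoids any
  // special treatment of "$" sequences in the replacement text.
  return template.replace("%s", () => escaped);
}

// Example: a site that registered itself for the made-up "web+recipe" scheme.
const target = expandHandlerUrl(
  "https://example.com/handle?uri=%s",
  "web+recipe:pancakes"
);
```

In a browser, the registration itself would be a call along the lines of navigator.registerProtocolHandler("web+recipe", "https://example.com/handle?uri=%s", "Recipe handler"), made from a page on the handling site.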

My concerns with that approach are that there is no documentation of how to decide whether a scheme is web-safe, and that I'm unconvinced that scheme-minting time is the right moment to make that call. See the bullet points in http://lists.w3.org/Archives/Public/www-tag/2012Jan/0088.html for the longer version.

>  The drawbacks to web+ seem to include:
> 
> * Encourages or even requires people to use new schemes, when other schemes might otherwise have been applicable (seems to be at odds with the admonition in AWWW that creation of new URI schemes is strongly discouraged [1]).

No, I don't think that anything changes here. Existing schemes work with registerProtocolHandler() (RPH) perfectly fine (if whitelisted, of course). There is no more reason to create new schemes than there was before, and the AWWW advice still stands. All that changes is that *when* it is legitimate to mint a new scheme, as happens once in a while, you have a way of indicating that it's safe to be handled by a Web page.

> * Seems to put the decision as to what client will be used in the wrong place, I.e. with the person or organization that coins the identifier. It should IMO generally be possible to have both Web and native apps handle a given identifier, to change one's mind after the fact, etc. If documents are full of links to "web+xxx:....." URIs, then lots of existing mechanism on the Web doesn't work with them (useless in agents that don't know of the new scheme), and you've committed to a naming convention just because, at this point in time, you think people will be using Web-based implementations.

No, that's also a misperception. Just because your scheme uses a web+ prefix does not mean that it can't be handled by native apps. It just means that it's labelled "safe" for handling by a Web page.

Does this make things clearer?

-- 
Robin Berjon - http://berjon.com/ - @robinberjon

Received on Monday, 23 January 2012 22:41:46 UTC