[whatwg] validate attribute in <A>

On Nov 3, 2007 10:55 AM, Adam Barth <hk9565 at gmail.com> wrote:
> On Nov 3, 2007 2:31 AM, Ian Hickson <ian at hixie.ch> wrote:
> > On Wed, 25 Jan 2006, Mike Hoye wrote:
> > > The validate attribute would describe an algorithm to employ and a
> > > result to compare it to; for example, somebody downloading the en-US
> > > version of FF 1.5 from the Mozilla.com homepage could click on a link
> > > like
> > >
> > > <a href="http://foo.com/mozilla-i686.tgz"
> > >    validate="{md5}b63fcdf4863e59c93d2a29df853b6046">
> > >
> > > and the client could verify as it comes in that it does at least have
> > > the md5sum that's advertised.  User notifications could include "no
> > > validation", "successfully validated" and "failed validation", and act
> > > according to the user's wishes in each case.
> >
> > It's not entirely clear to me what problem this is solving; but wouldn't
> > content-MD5 (RFC 1864) be a better solution?
>
> One scenario where something like this would be useful is for a site
> like eBay that serves iframes and img tags pointing to third-party
> content after reviewing that content for malware, scams, and adult
> content.  Without this mechanism, the content they review might change
> between the time they review it and the time their users load it.
>
> By specifying the hash of the content, they can ensure that the user
> agent loads exactly the content they reviewed.  (Of course, by
> ensuring that the content specifies the hashes of all content it
> loads, eBay can review all the content loaded by the iframe.)  Their
> alternative is to host all the content themselves, but this would
> require a large investment in server capacity as they reference a
> great deal of outside content in their item listings.

Another scenario where this would be very useful is for HTTPS sites.
Currently, every HTTPS site must host all of its content over HTTPS,
including scripts, style sheets, images, SWF movies, etc.  If the site
hosts any of this content over HTTP, an active network attacker can
replace that content with his own.  Loading scripts, style sheets, and
SWF movies over HTTP is disastrous because the attacker can inject his
own scripts and control the secure session.  Sadly, serving everything
over HTTPS greatly increases the cost of running the site because these
large objects must be encrypted for each client and cannot be cached by
user agents.

Fortunately, confidentiality is often not required for these embedded
objects.  The scripts and images are all publicly available.  What is
required, however, is integrity.  If the site can specify the hash of
these objects when embedding them over HTTP, integrity can be
guaranteed and the performance benefits of HTTP can be reaped.
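
As a rough sketch of the check a user agent might perform, here is
some Python.  It assumes the "{algorithm}hexdigest" syntax from Mike's
example; the function names parse_validate and validation_status are
hypothetical and not part of any proposal.  The three return strings
are the user notifications Mike listed.

```python
import hashlib


def parse_validate(value):
    """Split a '{algorithm}hexdigest' string into (algorithm, digest)."""
    if not value.startswith("{") or "}" not in value:
        raise ValueError("expected '{algorithm}hexdigest'")
    algorithm, digest = value[1:].split("}", 1)
    return algorithm.lower(), digest.lower()


def validation_status(body, validate=None):
    """Compare the fetched bytes against the advertised digest.

    body     -- the bytes actually received over the (untrusted) channel
    validate -- the attribute value from the (trusted) embedding page,
                or None if the link carries no such attribute
    """
    if validate is None:
        return "no validation"
    algorithm, expected = parse_validate(validate)
    # hashlib.new raises ValueError for an algorithm it does not know.
    actual = hashlib.new(algorithm, body).hexdigest()
    if actual == expected:
        return "successfully validated"
    return "failed validation"
```

A real client would presumably feed the hash incrementally as the
download arrives rather than buffering the whole body, and would
refuse to run or render anything that fails validation.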

Adam

Received on Saturday, 3 November 2007 11:23:48 UTC