Re: Couple comments on Subresource Integrity from Austin William Wright on 2014-03-26 (public-webappsec@w3.org from March 2014)

From: Austin William Wright <aaa@bzfx.net>
Date: Wed, 26 Mar 2014 00:21:59 -0700
To: Trevor Perrin <trevp@trevp.net>
Cc: Brad Hill <hillbrad@gmail.com>, Devdatta Akhawe <dev.akhawe@gmail.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CANkuk-V4zwNUou5UTEabfVd=XCzVACi5QvxQahqZ269-13=i+Q@mail.gmail.com>

On Tue, Mar 25, 2014 at 12:05 PM, Trevor Perrin <trevp@trevp.net> wrote:

>
> Devdatta wrote:
> >> The 6920 format adds verbosity, parsing, and having to read a 20-page
> >> (?!) doc.  What's the benefit?
> >
> > I am curious: are these the only concerns you have with using the RFC
> > 6920?
>
> I also think you're assuming 6920 solves things it doesn't.
>
> So you probably need to think more about things like registering algo
> names, hash truncation, hash agility, content negotiation,
> canonicalization, etc.
>

If a server performs Content-Type negotiation, it should also send a
Content-Location header specifying where that _particular_ variant may be
retrieved in the future. I wouldn't worry about Content-Type negotiation.
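
As a rough sketch of what I mean (the variant table and the negotiate()
helper are made up for illustration, and it ignores q-values and simply
takes the first acceptable type in the order listed):

```python
# Hypothetical variant table: media type -> URI of that particular variant.
VARIANTS = {
    "text/html": "/report.html",
    "application/json": "/report.json",
}


def negotiate(accept, variants):
    """Pick a variant for an Accept header and name it in Content-Location.

    Simplified on purpose: q-values are ignored and the first listed media
    type that the server has wins. Returns None if nothing is acceptable.
    """
    for item in accept.split(","):
        media_type = item.split(";", 1)[0].strip().lower()
        if media_type == "*/*" and variants:
            # Wildcard: fall back to the first variant the server knows.
            media_type = next(iter(variants))
        if media_type in variants:
            return {
                "Content-Type": media_type,
                # Tells the client where this exact variant lives.
                "Content-Location": variants[media_type],
                "Vary": "Accept",
            }
    return None
```

A client that later wants exactly the same bytes can request the
Content-Location URI directly and skip negotiation.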

>
>
> Devdatta wrote:
> > One benefit of having content type as separate meta-data is the
> > browser can send that and only that in the "accept" header.
>
> Thanks for explaining that.
>
> I'll have to think more about it.  Do you need the HTML to specify
> other content negotiation headers, like Accept-Language,
> Accept-Charset, Accept-Encoding, etc?
>

In the vocabulary of Mike Amundsen's H Factors, this would be "control
metadata for requests", which for some reason HTML doesn't really use,
except to the extent you can specify a "type" statement on link relations,
e.g.:

<link rel="stylesheet" href="theme.css" type="text/css" />

This just makes the following link relations:

<currentDocument> stylesheet <theme.css> .
<theme.css> mediaType "text/css" .

Since all the statements are supposed to be consistent, I would expect the
user agent to bail if the declared media type and the served media type
didn't match. This isn't the reality, unfortunately, but it's the same idea
we appear to be going after for "integrity":

<currentDocument> stylesheet <theme.css> .
<theme.css> integrity <ni:///sha-256-32;f4OxZQ> .

And if the link relation isn't consistent with the observed resource, bail.
So for this reason, I would suggest defining the integrity value as a
link-extension parameter for purposes of RFC 5988
<http://tools.ietf.org/html/rfc5988>.
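
To make that concrete, here is a minimal sketch of a Link header value
carrying such a parameter. The parameter name "integrity" is only my
suggestion, nothing registers it, and the parser below is deliberately
simplified (one link per value, no escaped quotes), so treat it as an
illustration rather than a full RFC 5988 implementation:

```python
import re


def format_link(target, rel, **params):
    """Build a link-value such as <theme.css>;rel=stylesheet;type="text/css"."""
    parts = [f"<{target}>", f"rel={rel}"]
    parts += [f'{key}="{value}"' for key, value in params.items()]
    return ";".join(parts)


def parse_link(value):
    """Split one link-value into (target, params).

    Simplified: handles a single link, quoted or token parameter values,
    and no backslash escapes inside quoted strings.
    """
    match = re.match(r"\s*<([^>]*)>(.*)", value)
    if match is None:
        raise ValueError("not a link-value")
    target, rest = match.groups()
    params = {}
    pattern = r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;\s]+))'
    for key, quoted, token in re.findall(pattern, rest):
        params[key.lower()] = quoted or token
    return target, params


header = format_link(
    "theme.css",
    "stylesheet",
    type="text/css",
    integrity="ni:///sha-256-32;f4OxZQ",  # hypothetical extension parameter
)
target, params = parse_link(header)
```

Note the ni: URI has to be quoted, since it contains a ";" of its own.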

I never saw a need for control data for requests. If the server ever wanted
a particular media type to be negotiated, you'd just link to that
particular variant (e.g. src="image.png" instead of src="image"). I imagine
you'd do the same for ni: URIs. The important thing to keep in mind is that
a URI is just a name for a resource. The integrity property is presumably
ensuring that two URIs identify the same resource, one of them just defined
in terms of a cryptographic hash (in expected usage at least).
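
A sketch of that check, assuming only the sha-256 family from RFC 6920
(optionally truncated, as in sha-256-32) and ignoring any authority or
query part of the ni: URI. The function names are mine:

```python
import base64
import hashlib


def ni_digest(data, bits=256):
    """base64url (unpadded) of the first `bits` bits of SHA-256(data)."""
    digest = hashlib.sha256(data).digest()[: bits // 8]
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def ni_uri(data, bits=256, authority=""):
    """Name `data` with an ni: URI, e.g. ni:///sha-256-32;f4OxZQ."""
    suite = "sha-256" if bits == 256 else f"sha-256-{bits}"
    return f"ni://{authority}/{suite};{ni_digest(data, bits)}"


def matches(ni, data):
    """True if `data` is the resource that the ni: URI names.

    Only the algorithm and hash value are compared; the authority and
    any query string are ignored in this sketch.
    """
    name = ni.split("?", 1)[0].rsplit("/", 1)[-1]
    suite, _, value = name.partition(";")
    if suite == "sha-256":
        bits = 256
    elif suite.startswith("sha-256-"):
        bits = int(suite.rsplit("-", 1)[-1])
    else:
        raise ValueError(f"unsupported hash suite: {suite}")
    return ni_digest(data, bits) == value
```

With this, ni_uri(b"Hello World!", bits=32) gives ni:///sha-256-32;f4OxZQ,
the value used in the examples here, and a user agent would bail whenever
matches() comes back False for the bytes it actually received.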

On a wild tangent, there's no reason you couldn't write a DHT that used
HTTP. Suppose you start with a random server:

[to 192.168.0.128]
GET ni://192.168.0.128/sha-256-32;f4OxZQ HTTP/1.1
Host:

302 Found
Location: ni://192.168.0.64/sha-256-32;f4OxZQ
Date: ...

[to 192.168.0.64]
GET ni://192.168.0.64/sha-256-32;f4OxZQ HTTP/1.1
Host:

200 OK
Date: ...
Content-Type: text/plain
Content-Length: xxx
Link: <ni:///sha-256;UyaQV-Ev4rdLoHyJJWCi11OHfrYv9E1aGQAlMO2X_-Q>;rel=alternate;type=application/json
[response entity-body]

>
>
> On Tue, Mar 25, 2014 at 9:38 AM, Brad Hill <hillbrad@gmail.com> wrote:
> > Regarding explicit specification of content-type, consider if we
> > eventually decided to use the hash identifiers for retrieving things
> > from some type of content-addressable-storage.  The browser might have
> > the correct bits laying around, but they may have been delivered over a
> > non-HTTP mechanism or the original headers may not have been preserved.
> >
> > Using a standard format also is how we start to build an ecosystem
> > around these ideas.  In the future you might be able to use a local CDN
> > that allows ni:// urls to be used like magnet links, etc.
>
> Or you could do all this w/base64 hashes.  I'm not aware of anyone
> else using 6920, so I don't think there's an "ecosystem" to speak of.
>

Using URIs, as above, allows you to request the resource from an HTTP
server. (An HTTP server can serve content for any URI, not just http/https
URLs.)
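
A small sketch with Python's standard library, to show that nothing in
HTTP itself gets in the way: the client sends an absolute ni: URI as the
request-target and a toy server looks it up. The CONTENT table, handler,
and loopback address are assumptions for the demo only.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical content store keyed by the ni path segment.
CONTENT = {"sha-256-32;f4OxZQ": b"Hello World!"}


class NiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path is the raw request-target, here an absolute ni: URI.
        name = self.path.rsplit("/", 1)[-1]
        body = CONTENT.get(name)
        if not self.path.startswith("ni://") or body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_ni(port, uri):
    """Send `GET <uri> HTTP/1.1` to a server on the loopback interface."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", uri)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), NiHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_ni(server.server_port, "ni://127.0.0.1/sha-256-32;f4OxZQ")
    finally:
        server.shutdown()
        server.server_close()
```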

There are other cool things here that I can potentially only dream of. The
goal in putting this into a separate specification is that we do _not_
limit our imaginations!

Austin Wright.

Received on Wednesday, 26 March 2014 07:22:32 UTC