[Integrity] Some comments on Subresource Integrity draft

From: Ángel González <angel@16bits.net>
Date: Tue, 16 Sep 2014 01:42:38 +0200
Message-ID: <1410824558.1133.4.camel@16bits.net>
To: public-webappsec@w3.org

I provide below some comments on the SRI spec as of 15 September.

On section 3.2:
> Conformant user agents must support the SHA-256 and SHA-512
> cryptographic hash functions for use as part of a resource’s integrity
> metadata

Add "(until they are determined to be insecure)" to make explicit that
the transient nature of hash support affects even SHA-256 and SHA-512.

On section 3.2.1:
> In this case, the user agent will choose the strongest hash function
> in the list, and use that metadata to validate the resource 

I would add something like:
> When multiple pieces of integrity metadata are available, the user
> agent MAY choose to verify the resource with all the hashes it
> supports, and fail if ANY of them doesn't match the returned content.
[I am tempted to state this as a SHOULD. At a minimum, developer UAs
should output console warnings for this, and the overhead seems
negligible for most UAs…]
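To make the proposed behaviour concrete, here is a rough sketch of the
check I have in mind (the function name is mine, not spec text; metadata
items follow the draft's "alg-base64digest" convention):

```python
import base64
import hashlib

def verify_all(body: bytes, metadata: list[str]) -> bool:
    """Fail if ANY supported hash in the metadata list doesn't match.

    Unsupported algorithms are skipped rather than treated as failures,
    matching the spec's handling of unknown hash functions.
    """
    supported = {"sha256": hashlib.sha256, "sha512": hashlib.sha512}
    for item in metadata:
        alg, _, expected = item.partition("-")
        if alg not in supported:
            continue  # unknown algorithm: ignore, don't fail
        digest = base64.b64encode(supported[alg](body).digest()).decode()
        if digest != expected:
            return False  # one mismatch is enough to reject the resource
    return True
```

The point is that checking every supported hash costs one extra pass per
algorithm, which seems cheap next to the network fetch itself.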

The second step of the 3.3.1 algorithm is a bit confusing. There's a
typo (consumes → consume), but it may also be clearer to end the
sentence after "applied" and explain the rest in a note.

It would be easier to deal with Transfer-Encoding (which is defined but
unused: you never take it into account) instead of Content-Encoding. I
can imagine a ni:// URI applying to a .tar.gz, .tar.bz2, etc. served
with different Content-Encodings. But then, how would the UA determine
whether the hash is meant to apply with the content-encodings applied or
not? It seems safer to state that if a Content-Encoding is provided, it
should be undone before hashing. If the intent is that the object be
used without undoing any Content-Encoding, then don't specify that
header (this is consistent with rfc7231, since in that case it is
probably an inherent encoding).

4.1 Caching Risks should document the privacy risk (history reading) of
loading by hash a resource that is used on one specific site.

There's an extra ")" in 6.1.

I feel 6.2 should be expanded, perhaps by explaining the bad things an
attacker could do after finding a random collision.

The 6.3 Cross-origin data leakage scenario deals with leaks through the
error event; however, it seems the same could be accomplished through
CSP reports (and I don't see an easy solution, since that reporting is
otherwise very much desired).

Open idea:
Add a Content-Integrity HTTP header akin to Content-MD5, but carrying a
named information (ni) URI. Checking that a GET of the URL returns that
header would let the scheme still work while avoiding origin confusion
problems (and obviously, any body served with such a header MUST match
the hash).
It might also be useful to add an equivalent of If-None-Match (or even
to extend those headers to deal with ni URIs in addition to ETags), but
I'm not convinced about that.
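As a rough illustration of the check the UA would perform (the header
name "Content-Integrity" is my proposal, not anything existing; the ni
URI digest format is base64url without padding, per RFC 6920, and only
sha-256 is sketched):

```python
import base64
import hashlib

def matches_content_integrity(body: bytes, header_value: str) -> bool:
    """Check a hypothetical Content-Integrity header carrying an ni URI,
    e.g. "ni:///sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk"
    (the RFC 6920 example digest of "Hello World!")."""
    if not header_value.startswith("ni:///sha-256;"):
        return False  # sketch: only sha-256 handled
    expected = header_value.split(";", 1)[1]
    digest = hashlib.sha256(body).digest()
    # RFC 6920 encodes digests as base64url without "=" padding
    actual = base64.urlsafe_b64encode(digest).decode().rstrip("=")
    return actual == expected
```

A UA fetching by hash would then only reuse the response for another
origin if this check passes, which is what avoids the origin confusion.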

Received on Monday, 15 September 2014 23:43:08 UTC
