[SRI] Proposal for extension: copyof: a reversed "fallbacksrc" attribute. from Willem on 2017-12-01 (public-webappsec@w3.org from December 2017)

From: Willem <w-p@dds.nl>
Date: Fri, 1 Dec 2017 18:38:49 +0100
To: public-webappsec@w3.org
Message-ID: <7e92d942-3e59-d4c4-edab-c5b62b12e2cf@dds.nl>
Dear group,

I’d like to propose an extension to the subresource integrity mechanism.
The extension would let web developers host popular resources on the
servers they are responsible for, while still giving browsers the
opportunity to read the resource from their cache, because the browser
is informed that the local resource is actually a copy of a popular
resource. My proposal is to add a “copyof” attribute to the existing
subresource integrity mechanism.

In the following text, I use the term "visited site" or "visited server" 
for the site a browser is pointed at by the user, while the term "CDN" 
is used for a popular, public server that is not under control of the 
owners of the visited site, but is referred to by a script or style HTML 
tag on the visited site.

Reason for this proposal.
-------------------------

There is a video explaining the subresource integrity mechanism that 
talks about a possibility of a "fallbacksrc" attribute in case the 
integrity check fails or in case the resource is not available on the
CDN. I want to turn this around: The resource is on the visited server, 
but the browser is allowed to use the resource from its cache if the 
integrity check succeeds. If not, it is downloaded from the visited 
site. The CDN location is then almost reduced to an identifier that 
merely marks the resource as being a public and popular one.

The current subresource integrity mechanism is meant to refer to a file
on third-party servers (CDNs) and give a web developer the opportunity
to use that file from the browser's cache and still be confident that
the remotely hosted (and cached) file is the one they intended.

There are a few drawbacks to this approach:

* The file is still hosted remotely. The remote server could be 
decommissioned or down for maintenance.
* If the integrity check fails, there is no fallback. A failure can 
therefore not be corrected by the browser.
* The CDN could track visitors to the visited site across the web,
which raises ethical issues.

Of course, the advantage is in bandwidth and download speed, as the
resource does not need to be downloaded at all if it is already in the 
browser's cache.

I want to extend the existing subresource integrity mechanism so that a
web developer can host the resource on the visited site and provide, in
the link to that resource, both a public location (in the new "copyof"
attribute) and the integrity information. A visitor’s browser may then
read the resource from its cache if a cached file matches the public
location and passes the integrity check. If the integrity check fails or
the resource is not in the visitor’s cache, it is simply loaded from the
visited site.

The downloaded file can even be put into the browser's cache for the
public location, with an annotation that it did not originate from the
public location. That annotation matters for links that point to the
public location (the CDN) without integrity information: the cached
resource should then be refreshed, as the public location is probably
more reliable than any visited site. This should prevent a "cache
poisoning" attack by a rogue site. In all other cases, integrity checks
should decide whether the resource is still considered valid or should
be updated.

So, even if the browser knows where the resource can originally be
found, the browser need not visit that location. The original location
merely acts as an identifier for the resource. For security reasons, I 
would advise against reading the resource from the browser cache if the 
link provides a public location but no integrity information.
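
As a rough illustration of the lookup described above, the following
Python sketch models the browser cache as a dictionary keyed by URL. The
names CacheEntry, integrity_ok, resolve_copyof and the fetch callback
are hypothetical, only sha384 is handled, and the sketch assumes that a
file fetched from the visited site is still subject to the normal
integrity check before it is used or cached.

```python
import base64
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CacheEntry:
    content: bytes
    non_originating: bool  # True if stored from a copy, not from the public URL


def integrity_ok(content: bytes, integrity: str) -> bool:
    """Check content against one 'sha384-<base64 digest>' value."""
    digest = hashlib.sha384(content).digest()
    return integrity == "sha384-" + base64.b64encode(digest).decode("ascii")


def resolve_copyof(
    cache: Dict[str, CacheEntry],
    src: str,
    copyof: Optional[str],
    integrity: Optional[str],
    fetch: Callable[[str], bytes],
) -> bytes:
    """Return the resource for an element with src, copyof and integrity."""
    # "copyof" is only honoured when integrity information is present.
    use_copyof = bool(copyof and integrity)
    if use_copyof:
        entry = cache.get(copyof)
        if entry is not None and integrity_ok(entry.content, integrity):
            return entry.content  # served from cache; no request is made

    # Cache miss or failed check: load the copy from the visited site.
    content = fetch(src)
    if integrity and not integrity_ok(content, integrity):
        raise ValueError("integrity check failed for " + src)

    if use_copyof:
        # Store under the public location, annotated as non-originating.
        cache[copyof] = CacheEntry(content, non_originating=True)
    else:
        cache[src] = CacheEntry(content, non_originating=False)
    return content
```

In this sketch the public location is never requested; it only serves as
the cache key.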

Usage example:
--------------

<script src="https://www.some.amazing.site.example/script/jslibrary.js"
copyof="https://cdn.popular.library.example/jslibrary.js"
integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC"></script>

Or (because the script is hosted on the visited server):
<script src="script/jslibrary.js"
copyof="https://cdn.popular.library.example/jslibrary.js"
integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC"></script>
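
The integrity value in these examples uses the existing SRI format: the
hash algorithm name, a hyphen, and the base64-encoded digest of the
file. A minimal Python sketch of how a developer might compute such a
value for the locally hosted copy follows; the function names are
hypothetical and only sha384 is shown.

```python
import base64
import hashlib


def sri_sha384(content: bytes) -> str:
    """Return an integrity value of the form 'sha384-<base64 digest>'."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def matches_integrity(content: bytes, integrity: str) -> bool:
    """Check content against a single sha384 integrity value."""
    return sri_sha384(content) == integrity
```

Because the local copy and the CDN file must be byte-for-byte identical
to produce the same value, the same integrity attribute works for both
locations.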


Specific actions to be taken:
-----------------------------

<script src="script/jslibrary.js"
copyof="https://cdn.popular.library.example/jslibrary.js"
integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC"></script>

would load the resource from the visited site if it was not already
cached or if the cached copy failed the integrity check, and put it into
the cache for "https://cdn.popular.library.example/jslibrary.js", marked
as "non-originating".

<script src="https://cdn.popular.library.example/jslibrary.js"
integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC"></script>

would override the cached resource if the integrity check succeeds and 
if the cached resource was marked "non-originating", to prevent cache 
poisoning.

<script src="https://cdn.popular.library.example/jslibrary.js"></script>

would also override the cached resource if it was marked 
"non-originating", to prevent cache poisoning.

<script src="script/jslibrary.js"
copyof="https://cdn.popular.library.example/jslibrary.js"></script>

would simply ignore the "copyof" attribute, as no integrity information
was given. The file can be cached for the visited site only, but has no 
effect whatsoever on the caching of any resource that matches 
"https://cdn.popular.library.example/jslibrary.js".

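The handling of a plain reference to the public location could be
sketched as follows in Python. This is only an illustration under
assumptions: CacheEntry, load_from_public_location, the fetch callback
and the optional integrity_ok predicate are hypothetical names, the
cache is a dictionary keyed by URL, and a failed integrity check on the
freshly fetched file is assumed to block the resource as it does today.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CacheEntry:
    content: bytes
    non_originating: bool  # True if the bytes came from a "copyof" copy


def load_from_public_location(
    cache: Dict[str, CacheEntry],
    url: str,
    fetch: Callable[[str], bytes],
    integrity_ok: Optional[Callable[[bytes], bool]] = None,
) -> bytes:
    """Load a resource referenced directly by its public (CDN) URL."""
    entry = cache.get(url)
    if entry is not None and not entry.non_originating:
        # Ordinary cache hit: the entry really came from the public location.
        if integrity_ok is None or integrity_ok(entry.content):
            return entry.content

    # No usable entry, or the entry is marked non-originating: refetch from
    # the public location so a rogue site cannot poison the cache.
    content = fetch(url)
    if integrity_ok is not None and not integrity_ok(content):
        raise ValueError("integrity check failed for " + url)

    # Override the cached resource with the originating version.
    cache[url] = CacheEntry(content, non_originating=False)
    return content
```

A non-originating entry is thus never served for a direct reference to
the public location; it is only eligible for elements that carry both
"copyof" and integrity information.
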
Best regards,

Willem Bogaerts.
Kratz Business Solutions B.V.
Received on Friday, 1 December 2017 17:46:13 UTC