- From: Nathanael D. Jones <nathanael.jones@gmail.com>
- Date: Sun, 11 Aug 2013 21:58:55 -0400
- To: François REMY <francois.remy.dev@outlook.com>
- Cc: "Patrick H. Lauke" <redux@splintered.co.uk>, HTML WG LIST <public-html@w3.org>, Glenn Adams <glenn@skynav.com>
- Message-ID: <CAG3DbfX18d5DxXUkqQ2Orubu+EBMbSbbMsRK9hhsobtVJyADuA@mail.gmail.com>
I guess I wasn't clear enough. The browser NEVER uses or trusts the provided hash for WRITING to the cache; only for READING from the cache. A resource is written to the cache using the hash *locally calculated by the BROWSER* after downloading the resource from the URI. This eliminates the possibility of cache poisoning and all your points predicated upon that.

On Sun, Aug 11, 2013 at 8:27 PM, François REMY <francois.remy.dev@outlook.com> wrote:

> > You've evidently misinterpreted my
> > proposal. I'm not suggesting any server-side or HTTP-level behavior
>
> I know. This is what I'm proposing. What you propose is basically to put
> an ETAG in the HTML file, outside of the control of the server. What I
> propose is to create a sort of "shared etag" so that the browser can send
> some "etag of a file he already know" to avoid downloading a file that may
> be the same. In the end, it’s up to the server to decide.
>
> If the reason you think the hash should be included inline is that it
> avoids an RTT for the hash discovery, I disagree because HTTP/2.0 will
> allow a website to reply to requests a browser didn't send as part of its
> initial response, so the server could reply when a client connect with the
> full HEAD of some script files he expect the client to have already, so
> that the browser do not have to issue that request if the caching
> conditions of if the hash matches [*].
>
> This system does not require you to modify your website in any way, it's a
> purely server implementation trick. It also don't force you to modify your
> static HTML files when you update some library to a new version.
>
>
>
> [*] You lose your RTT only if you store your script on another server than
> yours (because your server cannot provide the HEAD responses with the
> hashes in lieu of another server, that would be a security issue). However,
> including scripts from another domain is already a bad practice because
> you're giving away your security to some other website.
> The whole point of this proposal is that you may spare download of the
> resource even if you
> decide that for security or performance (dns and already open connection)
> reason you do not want to rely on someone else CDN to hope to avoid the
> download (now people do because some CDNs are popular so the time not spent
> downloading the resource because it's shared across websites is higher than
> the time lost for the few people not having the resource already and having
> to open a new connection to the said CDN while loading the page).
>
> An example of this issue is people using Google Web Font store to download
> their fonts because it means that everybody that use Google CDN and the
> same font only need to download once. My proposal allows you to store the
> file on your website but still get the benefits if anyone else use the same
> file but hosted on its own server (or even the Google CDN) because the hash
> will work alike the etag (your server will reply with a 304 under HTTP1,
> send the hash as an aside reply under HTTP2).
>
>
>
> > whatsoever - the hash only appears once, in the HTML.
> > It is never transmitted again.
>
> I can't help but repeat...
>
> I didn't say SHA-2 was insecure. My proposal use it as well. My issue with
> this proposal is the fact it's not the role of HTML to deal with
> transport-layer issues.
>
> My real point is: no browser should ever load a resource without asking
> the server that hosts it the authorization to do so, and the metadata
> (CORS,CSP,...) under which that server operates. I don't say SHA2 is
> insecure, I say loading a resource only based on a claim (an attribute on
> the <script> tag) which is potentially sent by someone else than the
> website which hosts the resource and subject to XSS attacks is a bad idea.
> Additionally, transmitting the hash of every file as part of the url (or an
> attribute found in the HTML, or whatever) is a bad idea, too.
> The identifier of a resource should never include content-based information
> because there's always a risk for this information to be out-of-sync [EDIT]
> and partial in the sense it doesn't cover all the http headers the server
> may want to send [/EDIT].
>
> My high-order belief is that this "super-cache" feature should be built in
> the HTTP layer and reuse the HTTP caching semantics and should not be
> defined at the HTML level because it deals with a transport-layer issue.
>
>
>
> > Until you can show mathematical evidence otherwise,
> > we can safely assume that collisions for SHA-2 (512-bit)
> > cannot presently be found through chance or malicious
> > intent. There doesn't seem to be much hope for this
> > happening anytime soon, either.
>
> The security issue of your proposal is not the SHA-2 hash (yet) but the
> fact that you load a resource without asking any server the permission to
> do so!
>
>
>
> > Provide specific attack scenarios for your hand-wavy
> > references to XSS, or stop generating FUD. Assuming
> > a secure hash function, I can't see how this could
> > possibly be useful for XSS.
>
> Someone could poison the cache for a file by mapping it to another very
> well-known one.
>
> <script src="http://mybank/secure.js" hash="jquery-hash" />
>
> Or someone may expect that some file does not load if some security
> restrictions are not met, and those restrictions checked by the server
> could be bypassed by attributing an hash to the file when it's loaded the
> first time (the conditions are met) and the reusing that hash the next
> times when the conditions are not met.
>
> On the secure website, with the hash changing every day:
> <script src="http://mybank/check-security.php" onload="doSomething()"
> hash="hash-for-people-having-already-passed-the-test-today" />
>
> The attacker use a computer that passes the tests, download the file,
> create a malicious image/script/iframe pointing to the file and associating
> the right hash to it
> <script src="http://attackersite/check-security.php"
> hash="hash-for-people-having-already-passed-the-test-today" />
>
> Now people visiting the site will have the security check disabled on
> the bank site because there exists some other site which already filled the
> super-cache with the right file, not doing any security check. This is
> clearly a case where the bank site owner did a mistake (he should have used
> a normal expire header and no hash) but it's not easy for him to understand
> that.
>
> Or if someone want to tricks a webpage, it could create an image whose src
> points to one image (scripts will recognize it as an image coming from the
> right server because the src will be right) but have an hash pointing to
> some other image the attacker downloaded before (ie the image will contain
> something that's different from that its src attribute says it does).
>
>
>
> Again, I continue to claim that if there's a reference to a script, an
> image, or whatever hosted on some server, the server MUST give his
> authorization before a resource is loaded, even if there's some
> identification process going on.
>
>
>
> > It should be noted that the hash of a resource and the resource
> > itself should be protected with equal vigor; the hash contains
> > 'part of the resource', and can be used to reconstruct the entire
> > resource (through a directed attack at shared visitors).
> >
> > This may not be obvious at first glance.
>
> I was about to say the exact same thing. It's not obvious, people will
> make mistakes. An hash is supposedly something you can leak.
> Examples: we don't store passwords in DB but (salted) hashes so that if
> someone takes over the DB, it can't recover the passwords. Here you're
> using hashes to recover the resource, people will definitely not like that.
>
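
The read/write asymmetry described at the top of this message can be illustrated with a small sketch. It is only an illustration under stated assumptions, not a browser implementation: the class name `HashCache`, the `load` method, and the `fetch` callback are all hypothetical, and SHA-512 is chosen only because the thread mentions SHA-2 (512-bit). The hash attribute supplied by the page is used purely as a lookup key; entries are stored only under a digest the client computes itself from the downloaded bytes.

```python
import hashlib
from typing import Callable, Dict, Optional


class HashCache:
    """Toy content-addressed cache (hypothetical names throughout)."""

    def __init__(self) -> None:
        # Maps a locally computed SHA-512 hex digest to the resource bytes.
        self._store: Dict[str, bytes] = {}

    def load(self, url: str, claimed_hash: Optional[str],
             fetch: Callable[[str], bytes]) -> bytes:
        # READ path: the hash claimed in the markup is only a lookup key.
        if claimed_hash is not None and claimed_hash in self._store:
            return self._store[claimed_hash]
        # Miss: download from the URI, then hash the received bytes locally.
        body = fetch(url)
        computed = hashlib.sha512(body).hexdigest()
        # WRITE path: the key is the computed digest, never the claimed one.
        self._store[computed] = body
        return body
```

In this sketch, a page that attaches a well-known library's hash to a different file cannot place that file under the library's key, because the entry is written under the digest of the bytes actually received. The sketch does not address the separate objection raised in the quoted text, namely that a cache hit is served without contacting the hosting server at all.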
Received on Monday, 12 August 2013 01:59:43 UTC