Re: [fetch] Cache state: partial content (#38)

Some IRC discussion of what we do in gecko's http cache:

12:13 PM <valentin> mayhemer: bkelly asked: do you know if our http cache does anything special for range requests?  are they automatically combined together or anything?
12:14 PM <mayhemer> we only support caching partial content in range <1-N> where N is anything between 1 and content-length
12:14 PM <mayhemer> when we have such a partial entry (for N < content-length)
12:14 PM <mayhemer> we do an If-Range request and merge, invisibly to the consumer of nsIChannel
12:15 PM <mayhemer> (we merge on 206)
12:15 PM <mayhemer> (we refetch on 200)
12:15 PM <mayhemer> (we report any server error otherwise)
12:15 PM <mayhemer> bkelly: ^^
12:16 PM <mayhemer> note that not all content is resumable, so in a few cases we throw away any partial cached data as soon as the interruption occurs
12:17 PM <bkelly> mayhemer: thanks!
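
The three outcomes mayhemer lists for a partial entry (merge on 206, replace on 200, report an error otherwise) can be sketched roughly as below. This is only an illustration of the described behaviour, not Gecko code: `PartialEntry` and `apply_if_range_response` are hypothetical names, and the body is modelled as a plain byte string.

```python
from dataclasses import dataclass


@dataclass
class PartialEntry:
    """Hypothetical stand-in for a cached prefix of a response body."""
    validator: str  # ETag or Last-Modified value stored with the entry
    body: bytes     # bytes cached so far, starting at the beginning


def apply_if_range_response(entry: PartialEntry, status: int, payload: bytes) -> bytes:
    """Return the body handed to the consumer after an If-Range request."""
    if status == 206:
        # The server confirmed the entity still matches: append the missing tail.
        return entry.body + payload
    if status == 200:
        # The entity changed: drop the cached prefix and use the new full body.
        return payload
    # Anything else is surfaced to the consumer as an error.
    raise RuntimeError(f"unexpected status {status} for an If-Range request")
```

The consumer only ever sees the returned bytes, which matches the "invisibly to the consumer of nsIChannel" remark above.
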
1:01 PM <annevk> mayhemer: I guess we don't merge if the Content-Length doesn't match or they're not equally fresh?
1:01 PM <mayhemer> annevk: :) we send out a conditional header, so it's either an entity match or not
1:01 PM <mayhemer> nothing so primitive as Content-Length matching!
1:02 PM <mayhemer> the server responds with either 206 and sends what we don't have (or what we have asked for with Range: header)
1:02 PM <mayhemer> or the server responds with 200 and a whole new content and ETag
1:04 PM <mayhemer> we also match via Last-Modified if available (ETag preferred)
1:04 PM <mayhemer> no ETag and no Last-Modified (+some more conditions) and we don't cache at all when interrupted (there is no way to do a conditional If-Range request anyway)
1:05 PM <mayhemer> annevk: ^ does it explain?
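
The validator rule just described (ETag preferred, Last-Modified as a fallback, nothing kept when neither exists) might look like the sketch below. The function names are hypothetical, and the "+some more conditions" mentioned above are not modelled because the log does not say what they are.

```python
from typing import Mapping, Optional


def pick_validator(headers: Mapping[str, str]) -> Optional[str]:
    """Choose the value to put in If-Range: ETag first, then Last-Modified."""
    etag = headers.get("ETag")
    if etag:
        return etag
    return headers.get("Last-Modified") or None


def keep_partial_on_interrupt(headers: Mapping[str, str]) -> bool:
    """Without a validator a later If-Range request is impossible, so keep nothing."""
    return pick_validator(headers) is not None
```
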
1:05 PM <annevk> mayhemer: that helps, I was also interested in the refetch clause earlier
1:06 PM <mayhemer> refetch clause? I'm not familiar with that term
1:06 PM <annevk> "(we refetch on 200)"
1:06 PM <mayhemer> aha!
1:06 PM <mayhemer> on 200 we just throw away the content previously cached
1:07 PM <annevk> but we don't hit the server again?
1:07 PM <mayhemer> and just replace with a whole new content (from the start)
1:07 PM <mayhemer> hm?
1:07 PM <annevk> refetch sounds like we'd do another network request (called fetching elsewhere)
1:07 PM <mayhemer> ok, so, we can have 3 states of the cache (very generally):
1:07 PM <mayhemer> 1. no cached entry
1:08 PM <mayhemer> 2. only a partial entry <0 - N>, N < content-length
1:08 PM <mayhemer> 3. full entry
1:08 PM <mayhemer> in case 2 we definitely have ETag or Last-Modified
1:08 PM <mayhemer> in case 3 we may or may not, for simplicity assume we do
1:08 PM <mayhemer> so, request is made
1:08 PM <mayhemer> we look at the cache
1:09 PM <mayhemer> in case 1 we just go to the server with an ordinary request, only valid response may be 200 OK with the full content entity
1:10 PM <mayhemer> in case 2 we do a request with an If-Range header that is filled with the ETag (or Last-Modified when ETag is not avail) and a Range header with NNN-XXX where NNN is the first byte we want and XXX is the content length (I think..)
1:10 PM <mayhemer> a valid server response may be only 206 where the part we are missing is sent to us
1:11 PM <mayhemer> or a 200 where a whole new complete response content entity is sent to us (we replace any previously cached data)
1:11 PM <mayhemer> in case 3, it's similar, we send a request with If-None-Match: Etag
1:12 PM <mayhemer> and get only one of 304 - we serve the cached content - or 200 - we replace the cached content and download a new resource content entity
1:12 PM <mayhemer> annevk: ^ makes sense?
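
As a rough summary of the three cache states walked through above, the sketch below maps each state to the request headers sent and the statuses treated as valid. `CacheState`, `request_headers` and `EXPECTED_STATUSES` are hypothetical names. The open-ended `bytes=N-` form of the Range header is an assumption made here for simplicity; the log itself is unsure about the exact end value.

```python
from enum import Enum
from typing import Dict


class CacheState(Enum):
    NONE = 1     # case 1: no cached entry
    PARTIAL = 2  # case 2: only a prefix of the body is cached
    FULL = 3     # case 3: the whole body is cached


# Statuses described as valid responses for each state.
EXPECTED_STATUSES = {
    CacheState.NONE: {200},
    CacheState.PARTIAL: {200, 206},
    CacheState.FULL: {200, 304},
}


def request_headers(state: CacheState, validator: str = "", cached_bytes: int = 0) -> Dict[str, str]:
    """Build the cache-related request headers for a given cache state."""
    if state is CacheState.NONE:
        # Ordinary request, nothing to validate.
        return {}
    if state is CacheState.PARTIAL:
        # Ask only for the missing tail, conditional on the entity being unchanged.
        # Assumption: open-ended range starting at the first byte not yet cached.
        return {"If-Range": validator, "Range": f"bytes={cached_bytes}-"}
    # Full entry: plain revalidation. The validator is assumed to be an ETag here.
    return {"If-None-Match": validator}
```

For example, with 1000 bytes cached and an ETag of `"v1"`, the partial case yields `If-Range: "v1"` together with `Range: bytes=1000-`.
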
1:12 PM <annevk> yeah, so we only cache 200 / 206?
1:12 PM <annevk> do we update headers for 304?
1:13 PM <mayhemer> yep, we definitely update headers for 304 and 206 responses
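
The header update on 304 and 206 could be pictured as a simple overwrite of stored values by fresh ones, as in the sketch below. The log does not say which headers are actually updated, so replacing every header the response carries is an assumption. `update_stored_headers` is a hypothetical name.

```python
from typing import Dict, Mapping


def update_stored_headers(stored: Mapping[str, str], fresh: Mapping[str, str]) -> Dict[str, str]:
    """Merge headers from a 304 or 206 response into the stored header set."""
    # Header names are case-insensitive, so normalise them before merging.
    merged = {name.lower(): value for name, value in stored.items()}
    # Assumption: every header present in the new response replaces the stored one.
    # A real cache would need to treat headers such as Content-Length and
    # Content-Range specially for a 206; that is not modelled here.
    merged.update({name.lower(): value for name, value in fresh.items()})
    return merged
```
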
1:13 PM <annevk> and, was part of your simplification above that we actually do have support for multiple independent ranges?
1:13 PM <mayhemer> we don't :(
1:13 PM <annevk> but we might some day?
1:13 PM <mayhemer> there were some requirements but nsHttpChannel cannot work with it right now
1:13 PM <annevk> okay
1:13 PM <mayhemer> we may, one day
1:13 PM <mayhemer> the cache itself can work with it
1:14 PM <annevk> to some extent all this affects fetch() as you may know and this has been hugely helpful
1:14 PM <mayhemer> but levels above (channels) cannot handle it
1:14 PM <mayhemer> annevk: always glad to help :)

---
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/fetch/issues/38#issuecomment-125277789

Received on Monday, 27 July 2015 17:21:33 UTC