Re: Call for Adoption: Cache Digests for HTTP/2

Hi Kazuho and Amos,

Thank you for taking the time to look at this report!

Chrome is my default browser, but I couldn't find an efficient way to get an
inventory of its cache contents, so these numbers represent a day of typical
browsing with Firefox. They were obtained by repeatedly scanning the
"about:cache" page reported by Firefox. Firefox uses two caches, an
in-memory one and an on-disk one, and contents move between the two
according to certain heuristics. Therefore, to get a good estimate it is
better to scan the "about:cache" contents multiple times and unify the
entries before counting them. The whole procedure is in this notebook:
https://github.com/shimmercat/internet_fonden_report/blob/master/source/AnalyzingAFirefoxCache.ipynb
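
As a rough illustration of the scan-and-unify step described above, here is
a minimal Python sketch. It is not the notebook's actual code: the function
names, the snapshot format (one list of URLs per scan) and the example URLs
are assumptions made only for this illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def unify_snapshots(snapshots):
    """Merge several cache listings into one set of distinct URLs."""
    seen = set()
    for snapshot in snapshots:
        seen.update(snapshot)
    return seen


def entries_per_site(urls):
    """Count distinct cached URLs per host name."""
    return Counter(urlsplit(url).hostname for url in urls)


# Hypothetical snapshots: each list holds the URLs seen in one scan.
snapshots = [
    ["https://a.example/app.js", "https://a.example/style.css"],
    ["https://a.example/app.js", "https://b.example/logo.png"],
]
counts = entries_per_site(unify_snapshots(snapshots))
```

Taking the union before counting means an entry that moved between the
in-memory and on-disk caches between two scans is counted only once.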

Addressing Kazuho's question:

- Yes, they span multiple TCP connections. The main question for us when
doing this little study was whether cache digests are useful in a direct
browser-server scenario, with a typical user who visits a site and
returns a few times.

- To address your question concretely, Amos: these numbers come from the
inventory that Firefox reports on "about:cache" and, as far as I know, that
inventory is what Firefox collects across multiple sessions.

You can find the entire procedure in the GitHub repository; in particular,
the samples were scanned at input 24 of this notebook:
https://github.com/shimmercat/internet_fonden_report/blob/master/source/AnalyzingAFirefoxCache.ipynb
The data itself is also in the same repository, here:
https://github.com/shimmercat/internet_fonden_report/blob/master/data.tar.gz


We do reckon that covering more data would be helpful, but short of
creating an add-on and having some Firefox users install it, we haven't
come up with any good ideas.

One thing we learned is that cache contents vary a lot from site to site,
which is why I think we need more per-site mechanisms to control digest
contents and their extent. In particular, we recommend some scoping
mechanism (besides using different origins) and a way to disable digests
altogether or, even better, to make digests opt-in for sites.
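
To make the link between per-site entry counts and digest size concrete,
here is a hedged Python sketch that counts the bits a Golomb-Rice coded set
of URLs would need. It is only loosely modelled on the idea behind cache
digests and does not reproduce the draft's exact encoding: the function
name, the default p_bits value, hashing the bare URL with SHA-256, and
keeping the low-order bits of the hash are all assumptions of this sketch.

```python
import hashlib
import math


def estimate_digest_bits(urls, p_bits=7):
    """Count the bits a Golomb-Rice coded set of `urls` would need.

    p_bits is log2 of the inverse false-positive probability (assumed).
    """
    n = len(set(urls))
    if n == 0:
        return 0
    # Round the entry count up to a power of two, expressed in bits.
    n_bits = math.ceil(math.log2(n))
    mask = (1 << (n_bits + p_bits)) - 1
    # Hash each URL and keep only n_bits + p_bits bits (assumed truncation).
    hashes = sorted({
        int.from_bytes(hashlib.sha256(u.encode("utf-8")).digest(), "big") & mask
        for u in urls
    })
    bits = 0
    previous = -1
    for value in hashes:
        delta = value - previous - 1
        # Unary-coded quotient, one stop bit, then a p_bits-wide remainder.
        bits += (delta >> p_bits) + 1 + p_bits
        previous = value
    return bits
```

Under these assumptions the result grows roughly linearly with the number
of cached entries, so a site with ten times as many entries needs a digest
about ten times as large.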


Best,

./Alcides.






On Tue, Jul 12, 2016 at 10:02 AM, Amos Jeffries <squid3@treenet.co.nz>
wrote:

> On 8/07/2016 1:35 a.m., Alcides Viamontes E wrote:
> > Our report on cache digests for HTTP/2, partially funded by the Swedish
> > Internet Development fund:
> >
> > https://if-report.shimmercat.com/dirhtml/
> >
> > The main content of the report are some numbers regarding per-site cache
> > size, which can be used to estimate the size of the cache digest.  Hope it
> > can be of some use.
>
> Thank you.
>
> Can you clarify for me whether these numbers are gathered from object
> counts on a single page of the visited sites, from an entire session
> browsing around those sites, or from a scan of all objects the site
> produces that are cacheable?
>
> I am a little worried that they may be from the first two use-case and
> thus under-representing how much a shared cache might accumulate per-site.
>
> Amos
>
>


-- 
Alcides Viamontes
www.shimmercat.com

Received on Tuesday, 12 July 2016 08:28:11 UTC