Chrome is my default browser, but I couldn't find an efficient way to get an
inventory of its cache contents. So these numbers represent a day of typical
browsing with Firefox. They were obtained by repeatedly scanning the
"about:cache" page reported by Firefox. Firefox uses two caches, an
in-memory one and an on-disk one, and contents move between the two
according to certain heuristics. Therefore, to get a good estimate it is
better to scan the "about:cache" contents multiple times and unify the
entries before counting them. The whole procedure is in this file:
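
For readers who only want the gist, the "scan several times, unify, then
count" step amounts to roughly the following. This is a simplified sketch and
not the code from the notebook: it assumes each scan has already been reduced
to a plain list of entry URLs (how "about:cache" is parsed is not shown), and
the names unify_snapshots and count_per_origin are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def unify_snapshots(snapshots):
    """Merge several cache scans into one set of unique entry URLs.

    `snapshots` is assumed to be an iterable of URL lists, one list per
    scan. An entry that moved between the memory and disk caches shows up
    in more than one scan, so a set union counts it only once.
    """
    unified = set()
    for snapshot in snapshots:
        unified.update(snapshot)
    return unified


def count_per_origin(urls):
    """Count unified cache entries per origin (scheme plus host[:port])."""
    counts = Counter(
        f"{parts.scheme}://{parts.netloc}"
        for parts in map(urlsplit, urls)
    )
    return dict(counts)
```

Counting after the union, rather than summing the per-scan counts, is what
keeps an entry seen in two scans from being counted twice.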

Addressing Kazuho's question:

- Yes, they span multiple TCP connections. The main question for us when
doing this little study was whether cache digests were useful in a direct
browser-server scenario, with a typical user who would visit a site and
return a few times.

- Concretely addressing your question, Amos, these numbers come from the
inventory that Firefox reports on "about:cache", and as far as I know, that
inventory reflects what Firefox collects across multiple sessions.

You can find the entire procedure in the GitHub repository; in particular,
the samples were scanned at input 24 in this notebook:

The data itself is also in the same repository, here:

We do reckon that covering more data would be helpful, but short of
creating an add-on and having some Firefox users install it, we haven't
come up with any good ideas.

One thing we learned is that cache contents vary a lot from site to site,
which is why I think we need more per-site mechanisms to control digest
contents and their extent. In particular, we recommend some scoping
mechanism (besides using different origins), and a way to disable digests
altogether or, even better, to make digests opt-in for sites.



