
[whatwg] HTML resource packages

From: Aryeh Gregor <Simetrical+w3c@gmail.com>
Date: Mon, 9 Aug 2010 12:47:35 -0400
Message-ID: <AANLkTinDyHrmrAUE1W+-Ake=ekRkQRSsSR4d8Enhp0fX@mail.gmail.com>
On Fri, Aug 6, 2010 at 7:40 PM, Justin Lebar <justin.lebar at gmail.com> wrote:
> I think this is a fair point.  But I'd suggest we consider the following:
>
> * It might be confusing for resources from a resource package to show
> up on a page which doesn't "opt-in" to resource packages in general or
> to that specific resource package.

Only if the resource package contains a different file from the real
one.  I suggest we treat this as a pathological case and accept that
it will be broken and confusing -- or at least we consider how many
extra optimizations we could make if we did accept that, before
deciding whether the extra performance is worth the confusion.

> * There's no easy way to opt out of this behavior.  That is, if I
> explicitly *don't* want to load content cached from a resource
> package, I have to name that content differently.

Why would you want that, if the files are the same anyway?

> * The avatars-on-a-forum use case is less convincing the more I think
> about it.  Certainly you'd want each page which displays many avatars
> to package up all the avatars into a single package.  So you wouldn't
> benefit from the suggested caching changes on those pages.

I don't see why not.  If UAs can assume that files with the same path
are the same regardless of whether they came from a resource package,
or from which one, then when they have all but a couple of the files
cached, they could request the missing ones directly instead of
fetching the resource package, even if a resource package is
specified.  So suppose twenty different people post on the page, and
you've been browsing for a while and have eighteen of their avatars
(this will be common, since a handful of people tend to account for
most posts in a given forum):

1) With no resource packages, you fetch two separate avatars (but on
earlier page views you suffered).

2) With resource packages as you suggest, you fetch a whole resource
package, 90% of which you don't need.  In fact, you have to fetch a
resource package even if you have 100% of the avatars on the page!  No
two pages are likely to have the same resource package, so you can't
share cache between them at all.

3) With resource packages as I suggest, you fetch only two separate
avatars, *and* you got the benefits of resource packages on earlier
pages.  The UA gets to guess whether using resource packages would be
a win on a case-by-case basis, so in particular, it should be able to
perform strictly better than either (1) or (2), given decent
heuristics.  E.g., the heuristic "fetch the resource package if I need
at least two files, fetch the file if I only need one" will perform
better than either (1) or (2) in any reasonable circumstance.
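
To make the example heuristic above concrete, here is a minimal sketch
in Python.  The function name, the return values, and the
package_threshold parameter are all hypothetical, chosen only for
illustration; nothing here is taken from a real UA or from the
resource packages proposal itself.  It assumes, as above, that files
with the same path are identical wherever they came from.

```python
def choose_fetch(needed, cached, package_threshold=2):
    """Decide how a UA might fetch the files a page needs.

    needed            -- list of file paths the page references
    cached            -- set of file paths already in the UA's cache
    package_threshold -- hypothetical tuning knob: fetch the resource
                         package once at least this many files are missing

    Returns a (strategy, missing) tuple, where strategy is one of
    "none", "individual", or "package".
    """
    # Files with the same path are assumed identical, so a cache hit
    # counts no matter where the cached copy originally came from.
    missing = [path for path in needed if path not in cached]

    if not missing:
        # Everything is cached: no request at all, not even for the package.
        return ("none", [])
    if len(missing) >= package_threshold:
        # Enough files are missing that one package request is assumed
        # to beat several separate requests.
        return ("package", missing)
    # Only a few files are missing: request them directly.
    return ("individual", missing)


# The twenty-avatar page, with eighteen avatars already cached.
avatars = [f"/avatars/{i}.png" for i in range(20)]
strategy, missing = choose_fetch(avatars, set(avatars[:18]))
print(strategy, len(missing))
```

With the default threshold of two, the two missing avatars trigger a
package fetch; a UA that would rather fetch those two separately would
simply use a higher threshold.  The cutoff is exactly the kind of thing
the UA gets to tune.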

I think this sort of situation will be fairly common.  Has anyone
looked at a bunch of different types of web pages and done a breakdown
of how many assets they have, and how they're reused across pages?  If
we're talking about assets that are used only on one page (image
search) or all pages (logos, shared scripts), your approach works
fine, but not if they're used on a random mix of pages.  I think a lot
of files will wind up being used on only particular subsets of pages.

> In general, I think we need something like SPDY to really address the
> problem of duplicated downloads.  I don't think resource packages can
> fix it with any caching policy.

Certainly there are limits to what resource packages can do, but we
can wind up closer to the limits or farther from them depending on the
implementation details.
Received on Monday, 9 August 2010 09:47:35 UTC
