- From: Justin Lebar <justin.lebar@gmail.com>
- Date: Mon, 9 Aug 2010 22:44:18 -0700
The files I used for the rough benchmarks are available in a tarball at
[1]. Live pages are at [2] and [3].

[1] http://people.mozilla.org/~jlebar/respkg/test/benchmark_files.tgz
[2] http://people.mozilla.org/~jlebar/respkg/test/test-pkg.html
[3] http://people.mozilla.org/~jlebar/respkg/test/test-nopkg.html

-Justin

On Mon, Aug 9, 2010 at 1:40 PM, Justin Lebar <justin.lebar at gmail.com> wrote:
>> Can you provide the content of the page which you used in your whitepaper?
>> (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)
>
> I'll post this to the bug when I get home tonight. But your comments
> are astute -- the page I used is a pretty bad benchmark for a variety
> of reasons. It sounds like you probably could hack up a much better
> one.
>
>> a) Looks like pages were loaded exactly once, as per your notes? How
>> hard is it to run the tests long enough to get to a 95% confidence interval?
>
> Since I was running on a simulated network with no random parameters
> (e.g. no packet loss), there was very little variance in load time
> across runs.
>
>> d) What did you do about subdomains in the test? I assume your test
>> loaded from one subdomain?
>
> That's correct.
>
>> I'm betting time-to-paint goes through the roof with resource bundles :-)
>
> It does right now because we don't support incremental extraction,
> which is why I didn't bother measuring time-to-paint. The hope is
> that with incremental extraction, we won't take too much of a hit.
>
> -Justin
>
> On Mon, Aug 9, 2010 at 1:30 PM, Mike Belshe <mike at belshe.com> wrote:
>> Justin -
>>
>> Can you provide the content of the page which you used in your whitepaper?
>> (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)
>>
>> I have a few concerns about the benchmark:
>>
>> a) Looks like pages were loaded exactly once, as per your notes? How
>> hard is it to run the tests long enough to get to a 95% confidence interval?
>> b) As you note in the report, slow start will kill you. I've verified
>> this so many times it makes me sick. If you try more combinations, I
>> believe you'll see this.
>>
>> c) The 1.3MB of subresources in a single bundle seems unrealistic to me.
>> On one hand you say that it's similar to CNN, but note that CNN has
>> JS/CSS/images, not just thumbnails like your test. Further, note that CNN
>> pulls these resources from multiple domains; combining them into one domain
>> may work, but certainly makes the test content very different from CNN. So
>> the claim that it is somehow representative seems incorrect. For more
>> accurate data on what websites look like, see
>> http://code.google.com/speed/articles/web-metrics.html
>>
>> d) What did you do about subdomains in the test? I assume your test
>> loaded from one subdomain?
>>
>> e) There is more to a browser than page-load-time. Time-to-first-paint
>> is critical as well. For instance, in WebKit and Chrome, we have specific
>> heuristics which optimize for time-to-render instead of total page load.
>> CNN is always cited as a "bad page", but it's really not - it just has a
>> lot of content, both below and above the fold. When the user can interact
>> with the page successfully, the user is happy. In other words, I know I can
>> make WebKit's PLT much faster by removing a couple of throttles. But I also
>> know that doing so worsens the user experience by delaying the time to first
>> paint. So - is it possible to measure both times? I'm betting
>> time-to-paint goes through the roof with resource bundles :-)
>>
>> If you provide the content, I'll try to run some tests. It will take a few
>> days.
>> Mike
>>
>> On Mon, Aug 9, 2010 at 9:52 AM, Justin Lebar <justin.lebar at gmail.com> wrote:
>>>
>>> On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor <Simetrical+w3c at gmail.com> wrote:
>>> > If UAs can assume that files with the same path
>>> > are the same regardless of whether they came from a resource package
>>> > or which one, and they have all but a couple of the files cached, they
>>> > could request those directly instead of from the resource package,
>>> > even if a resource package is specified.
>>>
>>> These kinds of heuristics are far beyond the scope of resource
>>> packages as we're planning to implement them. Again, I think this
>>> type of behavior is the domain of a large change to the networking
>>> stack, such as SPDY, not a small hack like resource packages.
>>>
>>> -Justin
>>>
>>> On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor <Simetrical+w3c at gmail.com> wrote:
>>> > On Fri, Aug 6, 2010 at 7:40 PM, Justin Lebar <justin.lebar at gmail.com> wrote:
>>> >> I think this is a fair point. But I'd suggest we consider the
>>> >> following:
>>> >>
>>> >> * It might be confusing for resources from a resource package to show
>>> >> up on a page which doesn't "opt-in" to resource packages in general or
>>> >> to that specific resource package.
>>> >
>>> > Only if the resource package contains a different file from the real
>>> > one. I suggest we treat this as a pathological case and accept that
>>> > it will be broken and confusing -- or at least we consider how many
>>> > extra optimizations we could make if we did accept that, before
>>> > deciding whether the extra performance is worth the confusion.
>>> >
>>> >> * There's no easy way to opt out of this behavior. That is, if I
>>> >> explicitly *don't* want to load content cached from a resource
>>> >> package, I have to name that content differently.
>>> >
>>> > Why would you want that, if the files are the same anyway?
>>> >
>>> >> * The avatars-on-a-forum use case is less convincing the more I think
>>> >> about it. Certainly you'd want each page which displays many avatars
>>> >> to package up all the avatars into a single package. So you wouldn't
>>> >> benefit from the suggested caching changes on those pages.
>>> >
>>> > I don't see why not. If UAs can assume that files with the same path
>>> > are the same regardless of whether they came from a resource package
>>> > or which one, and they have all but a couple of the files cached, they
>>> > could request those directly instead of from the resource package,
>>> > even if a resource package is specified. So if twenty different
>>> > people post on the page, and you've been browsing for a while and have
>>> > eighteen of their avatars (this will be common, since a handful of people
>>> > tend to account for most posts in a given forum):
>>> >
>>> > 1) With no resource packages, you fetch two separate avatars (but on
>>> > earlier page views you suffered).
>>> >
>>> > 2) With resource packages as you suggest, you fetch a whole resource
>>> > package, 90% of which you don't need. In fact, you have to fetch a
>>> > resource package even if you have 100% of the avatars on the page! No
>>> > two pages are likely to have the same resource package, so you
>>> > can't share cache at all.
>>> >
>>> > 3) With resource packages as I suggest, you fetch only two separate
>>> > avatars, *and* you get the benefits of resource packages on earlier
>>> > pages. The UA gets to guess whether using resource packages would be
>>> > a win on a case-by-case basis, so in particular, it should be able to
>>> > perform strictly better than either (1) or (2), given decent
>>> > heuristics. E.g., the heuristic "fetch the resource package if I need
>>> > at least two files, fetch the file if I only need one" will perform
>>> > better than either (1) or (2) in any reasonable circumstance.
>>> >
>>> > I think this sort of situation will be fairly common.
>>> > Has anyone
>>> > looked at a bunch of different types of web pages and done a breakdown
>>> > of how many assets they have, and how they're reused across pages? If
>>> > we're talking about assets that are used only on one page (image
>>> > search) or on all pages (logos, shared scripts), your approach works
>>> > fine, but not if they're used on a random mix of pages. I think a lot
>>> > of files will wind up being used only on particular subsets of pages.
>>> >
>>> >> In general, I think we need something like SPDY to really address the
>>> >> problem of duplicated downloads. I don't think resource packages can
>>> >> fix it with any caching policy.
>>> >
>>> > Certainly there are limits to what resource packages can do, but we
>>> > can wind up closer to the limits or farther from them depending on the
>>> > implementation details.
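[Editor's note: Aryeh's threshold heuristic quoted in the thread above ("fetch the resource package if I need at least two files, fetch the file if I only need one") is concrete enough to sketch. The following is an illustrative sketch only; the function and parameter names are invented for this example, and no browser exposes its cache state through an API like this.]

```python
# Sketch of the UA-side heuristic described in the thread: fetch the
# resource package only when at least two of the files it contains are
# missing from the cache; otherwise fetch the stragglers individually.
# All names here are invented for illustration.

def choose_fetch_plan(needed, cached, package_contents):
    """Decide what to fetch for one page load.

    needed           -- paths the page references
    cached           -- paths already in the HTTP cache (assumed identical
                        to the packaged copies, per the proposal above)
    package_contents -- paths inside the page's declared resource package
    """
    missing = [p for p in needed if p not in cached]
    packaged_missing = [p for p in missing if p in package_contents]
    if len(packaged_missing) >= 2:
        # The package pays for itself; fetch it, plus anything it lacks.
        individual = [p for p in missing if p not in package_contents]
        return {"fetch_package": True, "individual": individual}
    # Zero or one packaged file missing: individual requests are cheaper.
    return {"fetch_package": False, "individual": missing}


# The forum-avatar scenario from the thread: 20 avatars in the package,
# 19 already cached -> skip the package, fetch the one straggler directly.
avatars = ["avatar%d.png" % i for i in range(20)]
plan = choose_fetch_plan(avatars, set(avatars[:19]), set(avatars))
print(plan)  # {'fetch_package': False, 'individual': ['avatar19.png']}
```

A real UA would presumably weigh bytes and round trips rather than a fixed file count, but the fixed threshold of two is what the message proposes.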
Received on Monday, 9 August 2010 22:44:18 UTC