RE: "Packing on the Web" -- performance use cases / implications

Reminds me of:
<html manifest="/lib/manifest">

…in that you get a list of resources to cache for the application. Not quite the same, but conceptually similar. Perhaps we could avoid creating a new separate concept, and reuse/extend this manifest? I’m sure someone else has probably already considered this—apologies for coming in late to the discussion.

From: Alex Russell [mailto:slightlyoff@google.com]
Sent: Thursday, January 15, 2015 3:47 PM
To: Ilya Grigorik
Cc: Mark Nottingham; Yoav Weiss; public-web-perf; www-tag@w3.org List; Jeni Tennison
Subject: Re: "Packing on the Web" -- performance use cases / implications

Ilya and I had a chance to chat this afternoon and he had a brilliant idea: what if there were a preamble section that allowed the package to simply be a hint to the UA to start fetching a list of (not-included) resources?

This would let you invoke one with:

    <link rel="package" href="/lib/brand.pack">

Note the lack of a "scope" attribute.

The contents of "brand.pack" wouldn't be resources, but instead a list of URLs to request. This would let a site reduce the number (and repetition) of <link rel="prefetch"> tags in the first (crucial) bytes. This could be done by using the preamble section <http://www.w3.org/Protocols/rfc1341/7_2_Multipart.html> of the package to include a structured list of URLs to preflight.
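
As a rough illustration of the preamble idea above, a UA-side reader might look something like the following. This is only a sketch under assumptions the thread does not settle: the "structured list" is taken to be one URL per line, and `preamble_urls` and the boundary string `sep` are hypothetical names. In multipart terms, the preamble is whatever precedes the first boundary delimiter line.

```python
def preamble_urls(package_body: str, boundary: str) -> list[str]:
    """Return the URLs listed in a multipart preamble.

    Assumption (not from the thread): the preamble holds one URL per line.
    The preamble ends at the first line that starts with "--<boundary>".
    """
    delimiter = "--" + boundary
    urls = []
    for line in package_body.splitlines():
        if line.startswith(delimiter):
            break  # first boundary delimiter reached: the preamble is over
        line = line.strip()
        if line:
            urls.append(line)
    return urls


# Hypothetical hint-only package: a preamble and no included parts.
EXAMPLE_BODY = "/lib/brand.css\n/lib/brand.js\n\n--sep--\n"
```

A UA could then issue (or skip) a fetch for each returned URL using whatever heuristics it already has, which keeps prioritization and caching at the transport layer.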

Thoughts?




On Wed, Jan 14, 2015 at 2:19 PM, Ilya Grigorik <igrigorik@google.com> wrote:
On Tue, Jan 13, 2015 at 3:35 PM, Alex Russell <slightlyoff@google.com> wrote:
On Tue, Jan 13, 2015 at 2:18 PM, Ilya Grigorik <igrigorik@google.com> wrote:
On Wed, Jan 7, 2015 at 8:25 AM, Mark Nottingham <mnotting@akamai.com> wrote:
This doc:
  http://w3ctag.github.io/packaging-on-the-web/

says a number of things about how a Web packaging format could improve Web performance; e.g., for cache population, bundling packages to distribute to servers, etc.

tl;dr: I think it's introducing perf anti-patterns and is going against the general direction we want developers to head. Transport optimization should be left at the transport layer and we already have much better (available today!) solutions for this.

I'm going to leave comments inline below, but I think your read of this is far too harsh, forecloses meaningful opportunities for developers and UAs, and in general isn't trying to be as collaborative as I think those of us who have worked on the design would hope for.

Apologies if it came across as overly negative. Mark asked for perf-related feedback and that's what I'm trying to provide, much of which I've shared previously in other threads and chats. I do think there are interesting use cases here that are worth resolving, but I'm just not convinced that a new package streaming format is the right approach: lots of potential pitfalls, duplicated functionality, etc. My comments shouldn't rule out use cases which are not perf sensitive, but I do think it's worth considering the perf implications for cases where it may end up being (ab)used.

---- some notes as I'm reading through the latest draft:

(a) It's not clear to me how packages are updated after the initial fetch. In 2.1.1 you download the .pack with a CSS file but then request the CSS independently later... But what about the .pack? Wouldn't the browser revalidate it, detect that the package has changed (since the CSS has been updated), and be forced to download the entire bundle all over again? Now we have duplicate downloads on top of unnecessary fetches.

The presence of the package file is a hint. It's designed to be compatible with legacy UAs which may issue requests for each resource, which the UA is *absolutely allowed to do in this case*. It can implement whatever heuristic or fetch is best.

That doesn't address my question though. How does my app rev the package and take advantage of granular downloads, without incurring unnecessary fetches and duplicate bytes? I'm with you on heuristics. I guess I'm asking for some documented examples of how this could/should work:

a) disregard packages: what we have today: granular downloads and caching, but some queuing limitations with http/1.
b) always fetch packages: you incur unnecessary bytes and fetches whenever a single resource is updated.
c) how do I combine packages and granular updates? Wouldn't you always incur unnecessary and/or duplicate downloads?
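
To put rough numbers on options (a) and (b) above, here is a back-of-the-envelope sketch. The resource names and sizes are made up for illustration, `bytes_transferred` is a hypothetical helper, and the model ignores request/response header overhead and revalidation round trips.

```python
def bytes_transferred(sizes: dict[str, int], changed: set[str], strategy: str) -> int:
    """Body bytes downloaded on a repeat visit after `changed` resources were updated.

    "package":  any change invalidates the bundle, so everything is re-fetched.
    "granular": only the changed resources are re-fetched.
    """
    if strategy == "package":
        return sum(sizes.values()) if changed else 0
    if strategy == "granular":
        return sum(sizes[name] for name in changed)
    raise ValueError(f"unknown strategy: {strategy}")


# Illustrative sizes in bytes (assumptions, not measurements).
SIZES = {"app.css": 20_000, "app.js": 150_000, "logo.png": 30_000}

package_cost = bytes_transferred(SIZES, {"app.css"}, "package")    # 200000
granular_cost = bytes_transferred(SIZES, {"app.css"}, "granular")  # 20000
```

With these made-up numbers, a change to the CSS alone costs ten times more bytes under the always-fetch-the-package strategy. Option (c) would land somewhere in between, depending on how the UA decides when to refetch the package.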

In general, all bundling strategies suffer from one huge flaw: a single byte update in any of its subresources forces a full fetch of the entire file.
Assuming, as you mistakenly have, that fetching the package is the only way to address the resource.

I didn't assume that it is. I understand that the proposed method is "backwards compatible" and that the UA can request granular updates for updating resources, but this takes us back to the previous point -- is this only useful for the initial fetch? I'd love to see a good walkthrough of how the initial fetch + granular update cycle would work here.

(b) Packages introduce another HoL bottleneck: the spec talks about ordering recommendations, but there is still a strict ordering during delivery (e.g. if the package is not a static resource then a single slow resource blocks delivery of all resources behind it).

Is the critique -- seriously -- that doing dumb things is dumb?

I'm questioning why we would be enabling features that have all of the highlighted pitfalls, while we have an existing solution that doesn't suffer from the same issues. That, and I'm wondering if we can meet the desired use cases without introducing these gotchas -- e.g. do we need the streaming package at all vs. some form of manifest-like thing that defers fetching optimizations to the transport layer.

(c) Packages break granular prioritization:

Only assuming that your server doesn't do something smarter.

One of the great things about these packages is that they can cooperate with HTTP/2: you can pre-fill caches with granular resources and entirely avoid serving packages to clients that are savvy to them.

Can you elaborate on the full end-to-end flow of how this would work: initial package fetch for prefill, followed by...?

Would the UA unpack all the resources from a package into individual cache entries? Does it retain the package file itself? What's the process for revalidating a package? Or is that a moot question given that everything is unpacked and the package itself is not retained? But then, how does the UA know when to refetch the package?
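
One possible reading of "unpack into individual cache entries" can be sketched as follows. This assumes each part of the multipart package carries a Content-Location header naming the resource it stands in for, which is my understanding of the draft's format; `unpack_package` and the sample bytes are hypothetical. It says nothing about revalidation, which is exactly the open question.

```python
from email import message_from_bytes


def unpack_package(raw: bytes) -> dict[str, bytes]:
    """Split a multipart package into {Content-Location: body} cache entries.

    `raw` must include the Content-Type header carrying the boundary.
    Parts without a Content-Location are skipped.
    """
    msg = message_from_bytes(raw)
    cache: dict[str, bytes] = {}
    for part in msg.walk():
        if part.is_multipart():
            continue  # the container itself holds no resource body
        location = part.get("Content-Location")
        if location:
            cache[location] = part.get_payload(decode=True)
    return cache


# Hypothetical two-resource package.
SAMPLE = (
    b'Content-Type: multipart/package; boundary="sep"\n'
    b"\n"
    b"--sep\n"
    b"Content-Location: /lib/brand.css\n"
    b"Content-Type: text/css\n"
    b"\n"
    b"body{}\n"
    b"--sep\n"
    b"Content-Location: /lib/brand.js\n"
    b"Content-Type: application/javascript\n"
    b"\n"
    b"void 0;\n"
    b"--sep--\n"
)

entries = unpack_package(SAMPLE)
```

Once the entries are stored individually like this, the package file itself has no cache entry to revalidate, so the UA would need some other signal to know when to refetch it.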

As an aside: cache prefill is definitely an interesting use case and comes with lots of gotchas... With http/2 we have the push strategy, and the client has the ability to disable it entirely; opt out of specific pushed resources (send a RST on any stream - e.g. already in cache); control how much is pushed (via the initial flow window)... because we had a lot of concerns over servers pushing a lot of unnecessary content and eating up users' BW/data. With packages the UA can only make a binary decision of fetch or no fetch, which is a lot less flexible.

Your server can even consume packages as an ordered set of resources to prioritize the sending of (and respond with no-op packages to clients for which the package wouldn't be useful).

Does this offer anything extra over simply delivering individual resources with granular caching and prioritization available in http/2?

From what I can tell, the primary feature is that the client doesn't necessarily know what all the resources it may need to download are... For which we have two solutions: http/2 push, or we teach the client to learn what those resource URIs are and initiate the requests from the client (regardless of http version).

ig

Received on Friday, 16 January 2015 01:05:36 UTC