Re: "Packing on the Web" -- performance use cases / implications

On Thu, 15 Jan 2015, Ilya Grigorik wrote:

> A bit of handwaving on pros/cons of a ~manifest like approach:
>
> + Single URL to represent a bundle of resources (sharing, embedding, etc)
> + Fetching is uncoupled from manifest: granular caching, revalidation,
> updates, prioritization.. all of my earlier issues are addressed.
> + You can make integrity assertions about the manifest and each subresource
> within it (via SRI)
> + No complications or competition with HTTP/2: you get the best of both
> worlds
> + Can be enhanced with http/2 push where request for manifest becomes the
> parent stream against which (same origin) subresources are pushed

Well, HTTP/2 uses a dependency graph now, so how about making this manifest a
serialized version of it? It could help with prioritization when an HTTP/1.1
client talks to an HTTP/1.1->HTTP/2 gateway/cache.
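
To make that a bit more concrete, here is a rough sketch of what such a
serialization could look like. Everything in it is an assumption made up for
illustration: the one-line-per-resource "url parent weight" format, the
function names, and the example URLs and weights are not specified anywhere.
It also simplifies HTTP/2 weights (which describe proportional sharing among
siblings) into a strict "heavier first" order, and assumes URLs contain no
spaces.

```javascript
// Hypothetical manifest entries: each resource names the resource it depends
// on (parent, or null for the root) and a weight in 1..256, mirroring the
// dependency/weight pair carried by HTTP/2 stream priorities.
const entries = [
  { url: "/lib/brand.pack", parent: null, weight: 16 },
  { url: "/style.css", parent: "/lib/brand.pack", weight: 32 },
  { url: "/logo.png", parent: "/lib/brand.pack", weight: 8 },
  { url: "/module-thing.js", parent: "/style.css", weight: 16 },
];

// Serialize the graph as one line per resource: "url parent weight".
// A root resource (no parent) is written with "-" in the parent column.
function serializeManifest(list) {
  return list
    .map((e) => `${e.url} ${e.parent === null ? "-" : e.parent} ${e.weight}`)
    .join("\n");
}

// Parse the text form back into entries.
function parseManifest(text) {
  return text
    .split("\n")
    .filter(Boolean)
    .map((line) => {
      const [url, parent, weight] = line.split(" ");
      return { url, parent: parent === "-" ? null : parent, weight: Number(weight) };
    });
}

// Derive a request order a gateway could use: parents before children
// (breadth-first), and among siblings, heavier weights first.
function fetchOrder(list) {
  const children = new Map();
  for (const e of list) {
    if (!children.has(e.parent)) children.set(e.parent, []);
    children.get(e.parent).push(e);
  }
  const order = [];
  const queue = [null]; // start from the (virtual) root
  while (queue.length > 0) {
    const parent = queue.shift();
    const kids = (children.get(parent) || [])
      .slice()
      .sort((a, b) => b.weight - a.weight);
    for (const k of kids) {
      order.push(k.url);
      queue.push(k.url);
    }
  }
  return order;
}

console.log(fetchOrder(parseManifest(serializeManifest(entries))));
```

A gateway that understands this format could map the parent and weight
columns onto HTTP/2 stream priorities on its upstream side, while an HTTP/1.1
client would simply issue its requests in the derived order.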

But one use case of the package format was to be able to send the whole 
package instead of the first URL. In your proposal you still have to 
generate all requests.

> + Works fine with HTTP/1 but subject to regular HTTP/1 HoL concerns: use
> sharding, etc.. all existing http/1 optimizations apply.
> + Compatible out of the gate with old servers, new servers can do smart
> things with it (e.g. CDN can fetch and preload assets to edge)
>
> Also, Alex I asked you this earlier, but I don't recall why we ruled it
> out... Wouldn't rel=import address this? E.g...
>
> <link rel="import" href="/lib/brand.pack">
>
> --- contents of brand.pack ---
> <link rel=preload as=image href=logo.png integrity={fingerprint} />
> <link rel=preload as=stylesheet href=style.css integrity={fingerprint} />
> <link rel=preload as=javascript href=module-thing.js />
> ...
> <link rel=preload as=javascript href=https://my.cdn.com/framework.js />
>
> <script>
> if (someDynamicClientConditionIsMet()) {
>  var res = document.createElement("link");
>  res.rel = "preload";
>  res.href = "/custom-thing";
>  document.head.appendChild(res);
> }
> </script>
> ------
>
> It feels like we already have all the necessary components to compose the
> desired behaviors... and more (e.g. dynamic fetches in above example).
>
> ig
>
> On Thu, Jan 15, 2015 at 5:20 PM, Alex Russell <slightlyoff@google.com>
> wrote:
>
>> That had occurred to me too. Maybe once major impls rip out AppCache
>> support....
>>
>> On Thu, Jan 15, 2015 at 5:05 PM, Travis Leithead <
>> travis.leithead@microsoft.com> wrote:
>>
>>> Reminds me of:
>>>
>>> <html manifest="/lib/manifest">
>>>
>>> ...in that you get a list of resources to cache for the application. Not
>>> quite the same, but conceptually similar. Perhaps we could avoid creating a
>>> new separate concept, and reuse/extend this manifest? I'm sure someone else
>>> has probably already considered this -- apologies for coming in late to the
>>> discussion.
>>>
>>>
>>>
>>> *From:* Alex Russell [mailto:slightlyoff@google.com]
>>> *Sent:* Thursday, January 15, 2015 3:47 PM
>>> *To:* Ilya Grigorik
>>> *Cc:* Mark Nottingham; Yoav Weiss; public-web-perf; www-tag@w3.org List;
>>> Jeni Tennison
>>> *Subject:* Re: "Packing on the Web" -- performance use cases /
>>> implications
>>>
>>>
>>>
>>> Ilya and I had a chance to chat this afternoon and he had a brilliant
>>> idea: what if there were a preamble section that allowed the package to
>>> simply be a hint to the UA to start fetching a list of (not-included)
>>> resources?
>>>
>>>
>>>
>>> This would let you invoke one with:
>>>
>>>
>>>
>>>     <link rel="package" href="/lib/brand.pack">
>>>
>>>
>>>
>>> Note the lack of a "scope" attribute.
>>>
>>>
>>>
>>> The contents of "brand.pack" wouldn't be resources, but instead a
>>> list of URLs to request. This would let a site reduce the number (and
>>> repetition) of <link rel="prefetch"> tags in the first (crucial) bytes.
>>> This could be done by using the preamble section
>>> <http://www.w3.org/Protocols/rfc1341/7_2_Multipart.html> of the package
>>> to include a structured list of URLs to preflight.
>>>
>>>
>>>
>>> Thoughts?
>>>
>>> On Wed, Jan 14, 2015 at 2:19 PM, Ilya Grigorik <igrigorik@google.com>
>>> wrote:
>>>
>>>   On Tue, Jan 13, 2015 at 3:35 PM, Alex Russell <slightlyoff@google.com>
>>> wrote:
>>>
>>>   On Tue, Jan 13, 2015 at 2:18 PM, Ilya Grigorik <igrigorik@google.com>
>>> wrote:
>>>
>>>  On Wed, Jan 7, 2015 at 8:25 AM, Mark Nottingham <mnotting@akamai.com>
>>> wrote:
>>>
>>>  This doc:
>>>   http://w3ctag.github.io/packaging-on-the-web/
>>> says a number of things about how a Web packaging format could
>>> improve Web performance; e.g., for cache population, bundling packages to
>>> distribute to servers, etc.
>>>
>>>
>>>
>>> tl;dr: I think it's introducing perf anti-patterns and is going against
>>> the general direction we want developers to head. Transport optimization
>>> should be left at transport layer and we already have much better
>>> (available today!) solutions for this.
>>>
>>>
>>>
>>> I'm going to leave comments inline below, but I think your read of this
>>> is far too harsh, forecloses meaningful opportunities for developers and
>>> UAs, and in general isn't trying to be as collaborative as I think those of
>>> us who have worked on the design would hope for.
>>>
>>>
>>>
>>> Apologies if it came across as overly negative. Mark asked for
>>> perf-related feedback and that's what I'm trying to provide.. much of which
>>> I've shared previously in other threads and chats. I do think there are
>>> interesting use cases here that are worth resolving, but I'm just not
>>> convinced that a new package streaming format is the right approach: lots
>>> of potential pitfalls, duplicated functionality, etc. My comments shouldn't
>>> rule out use cases which are not perf sensitive, but I do think it's worth
>>> considering the perf implications for cases where it may end up being
>>> (ab)used.
>>>
>>>
>>>
>>>    ---- some notes as I'm reading through the latest draft:
>>>
>>>
>>>
>>> (a) It's not clear to me how packages are updated after the initial
>>> fetch. In 2.1.1. you download the .pack with a CSS file but then request
>>> the CSS independently later... But what about the .pack? Wouldn't the
>>> browser revalidate it, detect that the package has changed (since CSS has
>>> been updated), and be forced to download the entire bundle once over? Now
>>> we have duplicate downloads on top of unnecessary fetches.
>>>
>>>
>>>
>>> The presence of the package file is a hint. It's designed to be
>>> compatible with legacy UAs which may issue requests for each resource,
>>> which the UA is *absolutely allowed to do in this case*. It can implement
>>> whatever heuristic or fetch is best.
>>>
>>>
>>>
>>> That doesn't address my question though. How does my app rev the package
>>> and take advantage of granular downloads, without incurring unnecessary
>>> fetches and duplicate bytes? I'm with you on heuristics.. I guess I'm
>>> asking for some documented examples of how this could/should work:
>>>
>>>
>>>
>>> a) disregard packages: what we have today.. granular downloads and
>>> caching, but some queuing limitations with http/1.
>>>
>>> b) always fetch packages: you incur unnecessary bytes and fetches
>>> whenever a single resource is updated.
>>>
>>> c) how do I combine packages and granular updates? Wouldn't you always
>>> incur unnecessary and/or duplicate downloads?
>>>
>>>
>>>
>>>    In general, all bundling strategies suffer from one huge flaw: a
>>> single byte update in any of its subresources forces a full fetch of the
>>> entire file.
>>>
>>>  Assuming, as you mistakenly have, that fetching the package is the only
>>> way to address the resource.
>>>
>>>
>>>
>>> I didn't assume that it is, I understand that the proposed method is
>>> "backwards compatible" and that the UA can request granular updates for
>>> updating resources.. but this takes us back to the previous point -- is
>>> this only useful for the initial fetch? I'd love to see a good walkthrough
>>> of how the initial fetch + granular update cycle would work here.
>>>
>>>
>>>
>>>    (b) Packages introduce another HoL bottleneck: spec talks about
>>> ordering recommendations, but there is still a strict ordering during
>>> delivery (e.g. if the package is not a static resource then a single slow
>>> resource blocks delivery of all resources behind it).
>>>
>>>
>>>
>>> Is the critique -- seriously -- that doing dumb things is dumb?
>>>
>>>
>>>
>>> I'm questioning why we would be enabling features that have all of the
>>> highlighted pitfalls, while we have an existing solution that doesn't
>>> suffer from the same issues. That, and I'm wondering if we can meet the
>>> desired use cases without introducing these gotchas -- e.g. do we need the
>>> streaming package at all vs. some form of manifest~like thing that defers
>>> fetching optimizations to the transport layer.
>>>
>>>
>>>
>>>    (c) Packages break granular prioritization:
>>>
>>>
>>>
>>> Only assuming that your server doesn't do something smarter.
>>>
>>>
>>>
>>> One of the great things about these packages is that they can *cooperate* with
>>> HTTP/2: you can pre-fill caches with granular resources and entirely avoid
>>> serving packages to clients that are savvy to them.
>>>
>>>
>>>
>>> Can you elaborate on the full end-to-end flow of how this would work:
>>> initial package fetch for prefill, followed by...?
>>>
>>>
>>>
>>> Would the UA unpack all the resources from a package into individual
>>> cache entries? Does it retain the package file itself? What's the process
>>> for revalidating a package? Or is that a moot question given that
>>> everything is unpacked and the package itself is not retained? But then,
>>> how does the UA know when to refetch the package?
>>>
>>>
>>>
>>> As an aside: cache prefill is definitely an interesting use case and
>>> comes with lots of gotchas... With http/2 we have the push strategy, and the
>>> client has the ability to disable it entirely; opt out of specific pushed
>>> resources (send a RST on any stream - e.g. already in cache); control how
>>> much is pushed (via initial flow window)... because we had a lot of
>>> concerns over servers pushing a lot of unnecessary content and eating up
>>> users' BW/data. With packages the UA can only make a binary decision of
>>> fetch or no fetch, which is a lot less flexible.
>>>
>>>
>>>
>>>   Your server can even consume packages as an ordered set of resources
>>> to prioritize the sending of (and respond with no-op packages to clients
>>> for which the package wouldn't be useful).
>>>
>>>
>>>
>>> Does this offer anything extra over simply delivering individual
>>> resources with granular caching and prioritization available in http/2?
>>>
>>>
>>>
>>> From what I can tell, the primary feature is that the client doesn't
>>> necessarily know what all the resources it may need to download are... For
>>> which we have two solutions: http/2 push, or we teach the client to learn
>>> what those resource URIs are and initiate the requests from the client
>>> (regardless of http version).
>>>
>>>
>>>
>>> ig
>>>
>>>
>>>
>>
>>
>

-- 
Baroula que barouleras, au tiéu toujou t'entourneras.

         ~~Yves

Received on Wednesday, 21 January 2015 10:07:59 UTC