Re: "Packing on the Web" -- performance use cases / implications from Alex Russell on 2015-01-15 (public-web-perf@w3.org from January 2015)

From: Alex Russell <slightlyoff@google.com>
Date: Thu, 15 Jan 2015 15:46:37 -0800
To: Ilya Grigorik <igrigorik@google.com>
Cc: Mark Nottingham <mnotting@akamai.com>, Yoav Weiss <yoav@yoav.ws>, public-web-perf <public-web-perf@w3.org>, "www-tag@w3.org List" <www-tag@w3.org>, Jeni Tennison <jeni@jenitennison.com>
Message-ID: <CANr5HFXgRx+dcjXLL_RN5QMp9S3G5tzB56gXnSa4DhS8Cco1jw@mail.gmail.com>
Ilya and I had a chance to chat this afternoon and he had a brilliant idea:
what if there were a preamble section that allowed the package to simply be
a hint to the UA to start fetching a list of (not-included) resources?

This would let you invoke one with:

    <link rel="package" href="/lib/brand.pack">

Note the lack of a "scope" attribute.

The contents of "brand.pack" wouldn't be resources, but instead a list
of URLs to request. This would let a site reduce the number (and
repetition) of <link rel="prefetch"> tags in the first (crucial) bytes.
This could be done by using the preamble section
<http://www.w3.org/Protocols/rfc1341/7_2_Multipart.html> of the package to
include a structured list of URLs to preflight.
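
A minimal sketch of how a UA might read such a list, assuming (purely for
illustration) that the preamble carries one URL per line; the idea above does
not define a concrete syntax. The function name, the "pack-boundary" boundary
string, and the example.com URLs are all hypothetical. Per the multipart
format, the preamble is whatever precedes the first boundary delimiter line.

```python
from urllib.parse import urljoin


def preamble_urls(package_text: str, boundary: str, base_url: str) -> list[str]:
    """Return absolute URLs listed in the multipart preamble.

    Assumption: one URL per line in the preamble (hypothetical syntax).
    """
    delimiter = "--" + boundary
    urls = []
    for line in package_text.splitlines():
        stripped = line.strip()
        # The preamble ends at the first boundary (or closing boundary) line.
        if stripped in (delimiter, delimiter + "--"):
            break
        if stripped:
            # Resolve relative references against the package's own URL.
            urls.append(urljoin(base_url, stripped))
    return urls


# Hypothetical contents of /lib/brand.pack: a URL list and no included parts.
example_pack = (
    "/lib/brand.css\n"
    "/lib/brand.js\n"
    "img/logo.png\n"
    "--pack-boundary--\n"
)

urls = preamble_urls(
    example_pack, "pack-boundary", "https://example.com/lib/brand.pack"
)
for url in urls:
    print(url)
```

The UA would then be free to fetch (or skip) each listed URL individually,
which keeps per-resource caching and prioritization intact.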

Thoughts?




On Wed, Jan 14, 2015 at 2:19 PM, Ilya Grigorik <igrigorik@google.com> wrote:

> On Tue, Jan 13, 2015 at 3:35 PM, Alex Russell <slightlyoff@google.com>
> wrote:
>
>> On Tue, Jan 13, 2015 at 2:18 PM, Ilya Grigorik <igrigorik@google.com>
>> wrote:
>>
>>> On Wed, Jan 7, 2015 at 8:25 AM, Mark Nottingham <mnotting@akamai.com>
>>>  wrote:
>>>
>>>> This doc:
>>>>   http://w3ctag.github.io/packaging-on-the-web/
>>>> says a number of things about how a Web packaging format could
>>>> improve Web performance; e.g., for cache population, bundling packages to
>>>> distribute to servers, etc.
>>>>
>>>
>>> tl;dr: I think it's introducing perf anti-patterns and is going against
>>> the general direction we want developers to head. Transport optimization
>>> should be left to the transport layer, and we already have much better
>>> (available today!) solutions for this.
>>>
>>
>> I'm going to leave comments inline below, but I think your read of this
>> is far too harsh, forecloses meaningful opportunities for developers and
>> UAs, and in general isn't trying to be as collaborative as I think those of
>> us who have worked on the design would hope for.
>>
>
> Apologies if it came across as overly negative. Mark asked for
> perf-related feedback and that's what I'm trying to provide.. much of which
> I've shared previously in other threads and chats. I do think there are
> interesting use cases here that are worth resolving, but I'm just not
> convinced that a new package streaming format is the right approach: lots
> of potential pitfalls, duplicated functionality, etc. My comments shouldn't
> rule out use cases which are not perf sensitive, but I do think it's worth
> considering the perf implications for cases where it may end up being
> (ab)used.
>
>
>>> ---- some notes as I'm reading through the latest draft:
>>>
>>> (a) It's not clear to me how packages are updated after the initial
>>> fetch. In 2.1.1. you download the .pack with a CSS file but then request
>>> the CSS independently later... But what about the .pack? Wouldn't the
>>> browser revalidate it, detect that the package has changed (since CSS has
>>> been updated), and be forced to download the entire bundle again? Now
>>> we have duplicate downloads on top of unnecessary fetches.
>>>
>>
>> The presence of the package file is a hint. It's designed to be
>> compatible with legacy UAs which may issue requests for each resource,
>> which the UA is *absolutely allowed to do in this case*. It can implement
>> whatever heuristic or fetch is best.
>>
>
> That doesn't address my question though. How does my app rev the package
> and take advantage of granular downloads, without incurring unnecessary
> fetches and duplicate bytes? I'm with you on heuristics.. I guess I'm
> asking for some documented examples of how this could/should work:
>
> a) disregard packages: what we have today.. granular downloads and
> caching, but some queuing limitations with http/1.
> b) always fetch packages: you incur unnecessary bytes and fetches whenever
> a single resource is updated.
> c) how do I combine packages and granular updates? Wouldn't you always
> incur unnecessary and/or duplicate downloads?
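
To make the (a)/(b)/(c) comparison concrete, here is a rough sketch of the
bytes transferred after a single resource changes. The resource names and
sizes are made up, and case (c) models only the worst case raised in this
thread (the UA refetches the changed package and also fetches the changed
resource on its own); headers, compression, and revalidation round trips are
ignored.

```python
def bytes_transferred(sizes: dict[str, int], changed: set[str], strategy: str) -> int:
    """Estimate payload bytes fetched after `changed` resources are updated."""
    changed_bytes = sum(sizes[name] for name in changed)
    # A package is treated as stale as soon as any resource inside it changes.
    package_bytes = sum(sizes.values()) if changed else 0

    if strategy == "granular":   # (a) disregard packages
        return changed_bytes
    if strategy == "package":    # (b) always fetch packages
        return package_bytes
    if strategy == "both":       # (c) worst case: package plus granular fetch
        return package_bytes + changed_bytes
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical package contents (sizes in bytes).
sizes = {"brand.css": 20_000, "brand.js": 80_000, "logo.png": 50_000}
changed = {"brand.css"}

for strategy in ("granular", "package", "both"):
    print(strategy, bytes_transferred(sizes, changed, strategy))
```

Under these assumptions a 20 kB CSS change costs 20 kB with granular fetches,
150 kB when the whole package is refetched, and 170 kB in the duplicated case.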
>
>>> In general, all bundling strategies suffer from one huge flaw: a single
>>> byte update in any of its subresources forces a full fetch of the entire
>>> file.
>>>
>> Assuming, as you mistakenly have, that fetching the package is the only
>> way to address the resource.
>>
>
> I didn't assume that it is. I understand that the proposed method is
> "backwards compatible" and that the UA can request granular updates for
> individual resources, but this takes us back to the previous point: is
> this only useful for the initial fetch? I'd love to see a good walkthrough
> of how the initial fetch + granular update cycle would work here.
>
>
>>> (b) Packages introduce another HoL bottleneck: spec talks about ordering
>>> recommendations, but there is still a strict ordering during delivery (e.g.
>>> if the package is not a static resource then a single slow resource blocks
>>> delivery of all resources behind it).
>>>
>>
>> Is the critique -- seriously -- that doing dumb things is dumb?
>>
>
> I'm questioning why we would be enabling features that have all of the
> highlighted pitfalls, while we have an existing solution that doesn't
> suffer from the same issues. That, and I'm wondering if we can meet the
> desired use cases without introducing these gotchas, e.g. do we need the
> streaming package at all vs. some form of manifest-like thing that defers
> fetching optimizations to the transport layer?
>
>
>>> (c) Packages break granular prioritization:
>>>
>>
>> Only assuming that your server doesn't do something smarter.
>>
>> One of the great things about these packages is that they can *cooperate* with
>> HTTP/2: you can pre-fill caches with granular resources and entirely avoid
>> serving packages to clients that are savvy to them.
>>
>
> Can you elaborate on the full end-to-end flow of how this would work:
> initial package fetch for prefill, followed by...?
>
> Would the UA unpack all the resources from a package into individual cache
> entries? Does it retain the package file itself? What's the process for
> revalidating a package? Or is that a moot question given that everything is
> unpacked and the package itself is not retained? But then, how does the UA
> know when to refetch the package?
>
> As an aside: cache prefill is definitely an interesting use case and comes
> with lots of gotchas... With http/2 we have the push strategy, and the
> client has the ability to disable it entirely; opt out of specific pushed
> resources (send a RST on any stream - e.g. already in cache); control how
> much is pushed (via initial flow window)... because we had a lot of
> concerns over servers pushing a lot of unnecessary content and eating up
> users' BW/data. With packages the UA can only make a binary decision of
> fetch or no fetch, which is a lot less flexible.
>
>
>> Your server can even consume packages as an ordered set of resources to
>> prioritize the sending of (and respond with no-op packages to clients for
>> which the package wouldn't be useful).
>>
>
> Does this offer anything extra over simply delivering individual resources
> with granular caching and prioritization available in http/2?
>
> From what I can tell, the primary feature is that the client doesn't
> necessarily know what all the resources it may need to download are... For
> which we have two solutions: http/2 push, or we teach the client to learn
> what those resource URIs are and initiate the requests from the client
> (regardless of http version).
>
> ig
>
Received on Thursday, 15 January 2015 23:47:36 UTC