Re: [resource-hints] first spec draft from 陈智昌 on 2014-07-16 (public-web-perf@w3.org from July 2014)

From: 陈智昌 <willchan@chromium.org>
Date: Tue, 15 Jul 2014 17:29:48 -0700
To: Peter Lepeska <bizzbyster@gmail.com>
Cc: Ilya Grigorik <igrigorik@google.com>, "Nottingham, Mark" <mnotting@akamai.com>, "Podjarny, Guy" <gpodjarn@akamai.com>, public-web-perf <public-web-perf@w3.org>
Message-ID: <CAA4WUYhO0p1iDNM+PAo+_ttCQBTe7gRyViBzxzcjSfb6oXmW6Q@mail.gmail.com>

On Tue, Jul 15, 2014 at 1:22 PM, <bizzbyster@gmail.com> wrote:

> Comments inline…
>
> On Jul 15, 2014, at 1:07 PM, William Chan (陈智昌) <willchan@chromium.org>
> wrote:
>
> On Tue, Jul 15, 2014 at 9:55 AM, <bizzbyster@gmail.com> wrote:
>
>> "
>>>
>>>
>>>    - In order to make optimal use of a connection pool, it's valuable
>>>    to know the size of a resource ahead of time. Is it possible to include an
>>>    expected-size attribute?
>>>
>>> *If* we were to consider introducing this, I don't think it's limited to
>> hints. As such, I think this is a separate discussion. "
>>
>> I think expected size is also important because it allows the UA to make
>> an informed decision about whether or not it wants to incur the possible
>> cost of preloading. For instance, if the UA detects that the network is
>> severely bandwidth constrained, it might decide to not issue preload
>> requests for very large objects due to the possibility that the resource
>> may never be requested, for instance if the user clicks off the page
>> before it is fully loaded.
>>
>
> Can you illustrate with more examples? If the user clicks off the page
> before it is fully loaded, then the rendering engine cancels the resource
> requests, which will cancel the HTTP fetch. This takes a roundtrip to get
> origin servers not to send any more data.
>
>
> Yes. Look at this page:
> http://caffeinatetheweb.com/with_hints.html?max=100. The page consists of
> 100 prefetchable javascript files and 100 unprefetchable image files. It is
> constructed in such a way that each javascript/image pair is requested
> serially. After all of those are requested the page requests a large (~600
> KB) image file. Over a dialup connection, the 80% visible time for the page
> would be faster if the browser decided NOT to prefetch the large image
> file. Does that make sense?
>

Got it. Just to be clear, this is a completely different example than your
former case, since the former involved clicking off the page, whereas this
one involves bandwidth contention from resources on the same page. Is there
a reason we don't solve this case with flow control (TCP or HTTP/2)? Let
the flow control window shrink to 0 so the sender stops hogging the
bandwidth.
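
As a rough illustration of the flow-control idea, here is a toy Python model (not HTTP/2 itself; the class and function names are hypothetical, and the byte counts are made up) in which the receiver only refills the window of the stream it currently cares about, so the large response stalls once its window reaches 0:

```python
from dataclasses import dataclass


@dataclass
class Stream:
    """Toy model of one response stream; all names here are hypothetical."""
    name: str
    remaining: int   # bytes the sender still has to send
    window: int      # bytes the receiver currently allows (flow-control credit)
    received: int = 0


def send_round(streams, link_budget):
    """Send up to link_budget bytes in total, never exceeding a stream's window."""
    for s in streams:
        chunk = min(s.remaining, s.window, link_budget)
        s.remaining -= chunk
        s.window -= chunk
        s.received += chunk
        link_budget -= chunk


def grant_credit(stream, nbytes):
    """Receiver-side window update: allow the sender nbytes more on this stream."""
    stream.window += nbytes


def run_demo():
    big_image = Stream("large-image", remaining=600_000, window=16_000)
    script = Stream("script", remaining=30_000, window=16_000)
    streams = [big_image, script]
    for _ in range(5):
        send_round(streams, link_budget=20_000)
        # The receiver refills only the stream it wants right now; the image's
        # window drains to 0 after the first round and the sender stops sending it.
        grant_credit(script, 16_000)
    return big_image, script
```

In this sketch the image gets its initial window's worth of data and then nothing more, while the script finishes; no size hint is needed for that outcome.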


>
>> Similar to the argument Ilya made about the UA being in the best position
>> to decide the number of pre-connects for a given host (though I don't think
>> that is strictly true for pre-connects), the UA is surely in the best position
>> to decide whether or not the potential cost of prefetching outweighs the
>> benefit due to last mile bandwidth limitations. And "expected size" gives
>> the UA the information needed to make this determination.
>>
>
> Can you explain why bandwidth contention is better solved with clients
> holding back large requests rather than holding back low priority requests?
> jQuery is large, but we download it anyway because it's critical to
> rendering.
>
>
> I agree with you that priority is key. Unfortunately right now resource
> hint priority is completely based on the type of the object. This doesn't
> take into consideration size or where the object is on the page. For
> instance, as in this case, the large image is only visible after scrolling
> all the way to the bottom of the page and so it should have a lower
> priority than an identical image that is rendered at the top of the page.
>

I agree that priority is suboptimal. We should fix that, and we are (we
demote priority for images below the viewport once we know what is actually
in the viewport, i.e. once we have laid out the page). You give a good
example of how priority can be broken. But how does expected-size do better
in your example?
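
To make the demotion step concrete, here is a minimal Python sketch of the idea, assuming layout has already produced a vertical offset for each image. The names (`ImageRequest`, `demote_below_viewport`) and the priority labels are hypothetical and are not how any browser actually represents this:

```python
from dataclasses import dataclass


@dataclass
class ImageRequest:
    url: str
    top: int               # y offset of the image's top edge after layout, in px
    priority: str = "low"  # hypothetical default priority for images


def demote_below_viewport(requests, viewport_height):
    """After layout, demote images whose top edge is below the viewport."""
    for req in requests:
        if req.top >= viewport_height:
            req.priority = "lowest"
    return requests
```

Note that nothing here depends on the size of the image, only on where it ends up on the page.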


>
>
>
>>
>> Ilya,
>>
>> Where do you think it's best to have this discussion? I think it's really
>> important.
>>
>> Thanks,
>>
>> Peter
>>
>> On Jul 10, 2014, at 5:41 PM, Ilya Grigorik <igrigorik@google.com> wrote:
>>
>> Peter, Guy, Mark, thanks for the feedback! Comments inline.
>>
>> On Tue, Jul 8, 2014 at 8:08 PM, Nottingham, Mark <mnotting@akamai.com>
>>  wrote:
>>>
>>> 2) The "Caching Grace Period" section seems a bit iffy to me... I
>>> wouldn't express this as being an override of the caching policy, but
>>> rather of it being applied *after* the cache; i.e. the rendering engine
>>> itself effectively caches it for next use.
>>>
>>
>> Yep, great point. Will suggested the same and I've taken a run at
>> reworking that section. Let me know how this looks:
>> https://github.com/igrigorik/resource-hints/issues/5
>>
>>
>> On Thu, Jul 10, 2014 at 7:27 AM, <bizzbyster@gmail.com> wrote:
>>>
>>> On Jul 9, 2014, at 9:20 AM, "Podjarny, Guy" <gpodjarn@akamai.com> wrote:
>>>
>>>  Section 2.1 (preconnect):
>>>
>>>    - One delta between preconnect and dns-prefetch is cost to the UA.
>>>    My understanding is that establishing a connection is also more expensive in
>>>    resource utilization than a DNS prefetch. Therefore, dns-prefetch may be
>>>    used more lightly, and perhaps it’s a reason to not think of dns-prefetch
>>>    as a preconnect.
>>>
>>> Fair enough. That said, I think this is something worth experimenting
>> with.. All modern browsers are already doing preconnects anyway (without
>> the hint and based on own heuristics), so I'm not sure that the
>> dns-prefetch upgrade would change the picture by much. I'll see if I can
>> get some data from HTTP Archive on how often dns-prefetch is used in the
>> wild.
>>
>>>
>>>    - For some domains (e.g. Your CDN domain), it’ll actually be helpful
>>>    to open multiple connections, not just one (assuming no SPDY/HTTP2).
>>>    Specifying a number sounds too wrong, but does it make sense to put a
>>>    weight factor on the preconnect? Maybe a “primary” vs “secondary” domain?
>>>    Could be getting into the diminishing returns space.
>>>
>>> I agree that it is helpful to open multiple connections for many domains
>>> but don't see a problem with specifying a number. The draft argues that the
>>> UA "is in the best position to determine the optimal number" of
>>> connections per domain. But this is not always the case. If the server were
>>> able to receive and leverage feedback from browsers ("past request
>>> patterns" in the draft) then it could know more about the capabilities of
>>> various domains. For instance, we see some servers allow a large number of
>>> concurrent connections and others enforce strict low limits. I think it
>>> makes sense to include a suggested number of connections in the pre-connect
>>> hint. The UA is free to ignore that suggestion.
>>>
>>
>> I understand the motivation, but I still think this exposes knobs that
>> should be left to the user agent. The number of connections will vary by
>> user's connection type, time of day, protocol, and so on, all of which are
>> dynamic. Browsers already track this kind of information and adapt their
>> logic to take this into account - e.g. chrome://dns/ (see "Expected
>> Connects" column). With HTTP/2 this is also unnecessary (and I say that
>> with awareness of your recent thread on http-wg on the subject :)).
>>
>>> Section 2.2 (preload):
>>>
>>>    - With today’s implementations, double downloading of preloaded
>>>    resources is a major issue. Would be good to make some explicit definitions
>>>    about how to handle a resource that has been requested as a preload
>>>    resource already and is now seen on the page. An obvious rule should be to
>>>    not double download, but others may be more complex (e.g. What if we
>>>    communicated a low prio via SPDY/HTTP2?)
>>>
>>> - Matching retained responses with requests:
>> https://igrigorik.github.io/resource-hints/#matching-request
>> - (Re)prioritization:
>> https://github.com/igrigorik/resource-hints/issues/1
>>
>>>
>>>    - Content type as text sounds a bit error prone. Would
>>>    “text/javascript” cover “x-application/javascript” too? Is there a way to
>>>    normalize content types?
>>>
>>> Jake proposed using "context" instead, which I really like, but need to
>> do some more digging on:
>> https://github.com/igrigorik/resource-hints/issues/6
>>
>>>
>>>    - Should preload resources delay unload? (my vote is no)
>>>
>>> Preload hints are for the *current* page. As a result they are cancelled
>> as part of onunload. If you need the request to span across navigations,
>> you should be using prefetch, which is used to load resources for next
>> navigation.
>>
>>>
>>>    - What should preload (and prefetch) do in case the resource request
>>>    got a temp error (e.g. 502)?
>>>    - How should the UA handle cookies set in a response to a preload
>>>    (or prefetch)?
>>>
>>> Hint-initiated requests are not special in any way: think of any request
>> initiated by the preload scanner today, all the same behaviors here. If the
>> request fails, it fails.. it sends the same cookies, and so on.
>>
>>
>>> A few more to add to this list:
>>>
>>>    - How to handle the fact that cookies may have changed between the
>>>    requesting of the preload resource and the requesting of the resource by
>>>    the renderer either due to a Set-Cookie or a locally executed javascript?
>>>    Perhaps we could add an attribute that indicates that the resource is
>>>    cookie-sensitive or not?
>>>
>>> We don't. You have the same race condition today with the preload
>> scanner sending early fetches for JS/CSS.
>>
>>>
>>>    - The earlier spec discussed preload hints URLs that were generated
>>>    via javascript. I think this is very powerful as it allows the browser to
>>>    preload dynamically generated URLs. For instance, URLs with sessionID or
>>>    Date or rand appended to them.
>>>
>>> Yes, this is what preconnect covers. We don't know the URL, but if we
>> know the origin we can at least complete the handshake:
>> https://igrigorik.github.io/resource-hints/#preconnect (example 1)
>>
>>>
>>>    - By viewing past behavior, especially with resource priorities in
>>>    HTTP/2, the server can actually know quite a bit more than the browser
>>>    about resource priorities. It would be great if the server could provide
>>>    the UA with a hint as to the importance of the object. Is it a serializing
>>>    resource? Does visual completeness depend on it? This information could be
>>>    communicated via a simple low, medium, high type scheme and the spec could
>>>    provide description as to what these values mean.
>>>
>>> That's what type/context is meant to provide. Today we don't have a more
>> fine grained mechanism to communicate priorities. Perhaps we should, but I
>> think that's a separate discussion.
>>
>>>
>>>    - In order to make optimal use of a connection pool, it's valuable
>>>    to know the size of a resource ahead of time. Is it possible to include an
>>>    expected-size attribute?
>>>
>>> *If* we were to consider introducing this, I don't think it's limited to
>> hints. As such, I think this is a separate discussion.
>>
>>>
>>>    - As per our other email thread, we should include object version
>>>    information in the preload hint so as to minimize cache re-validation
>>>    requests.
>>>
>>> I still think that subresource integrity is the right place to discuss
>> this: http://www.w3.org/TR/SRI/#caching-optional-1
>>
>>> Section 3.4 (Caching grace period):
>>>
>>>    - Since these hints are explicitly added, I think we can be a bit
>>>    more strict in what we require.
>>>    - My vote would be:
>>>       - For preload, do the same thing the preparser does, which I
>>>       believe means use the resource regardless of whether it’s cacheable, as
>>>       long as you’re in the midst of loading the current page (may require some
>>>       definition of when the page has finished loading, which I suspect the
>>>       preloader deals with too). If you go to another page, revert to the cache
>>>       instructions on the resource.
>>>
>>> +1
>>>
>>
>> Yes, this should be covered by the new "matching requests" section.
>>
>>>
>>>    - For prefetch & prerender, use the cache instructions (no grace
>>>    period or changes)
>>>
>>> This would cripple prefetch and prerender because most dynamic content
>> is marked as non cacheable. Think of prerender as opening a background tab
>> (or middle click, if you prefer), except that the tab is invisible and is
>> then instantly swapped-in on navigation as long as it hasn't expired
>> (insert reasonable TTL here.. Chrome uses 300 seconds).
>>
>>> A couple of additional questions:
>>>
>>>    - How would these hints, and specifically preload, interact with
>>>    Client Hints?
>>>
>>> Assuming CH is implemented, the relevant hints would be advertised on
>> the outbound request. Nothing special.
>>
>>>
>>>    - What about srcset and the picture element (e.g. Native conditional
>>>    loading mechanisms)?
>>>
>>> I don't see any concerns here. If you have conditional loading then you
>> must evaluate those conditions.. With native <picture> those conditions
>> will be executed by the preparser (yay) if the main doc parser is
>> blocked... Yes, you may not be able to stick a Link header hint or put a
>> <link> hint in the head of the doc, but such is the cost of conditional
>> fetches. On the other hand, if you *know* you need a specific file
>> regardless, feel free to hint it.
>>
>> ig
>>
>>
>>
>
>
Received on Wednesday, 16 July 2014 00:30:17 UTC