Re: [resource-hints] first spec draft

More comments…

On Jul 15, 2014, at 8:29 PM, William Chan (陈智昌) <willchan@chromium.org> wrote:

> On Tue, Jul 15, 2014 at 1:22 PM, <bizzbyster@gmail.com> wrote:
> Comments inline…
> 
> On Jul 15, 2014, at 1:07 PM, William Chan (陈智昌) <willchan@chromium.org> wrote:
> 
>> On Tue, Jul 15, 2014 at 9:55 AM, <bizzbyster@gmail.com> wrote:
>> "
>> In order to make optimal use of a connection pool, it's valuable to know the size of a resource ahead of time. Is it possible to include an expected-size attribute?
>> *If* we were to consider introducing this, I don't think it's limited to hints. As such, I think this is a separate discussion. "
>> 
>> I think expected size is also important because it allows the UA to make an informed decision about whether or not it wants to incur the possible cost of preloading. For instance, if the UA detects that the network is severely bandwidth constrained, it might decide not to issue preload requests for very large objects, because the resource may never be requested (for example, if the user clicks off the page before it is fully loaded).
>> 
>> Can you illustrate with more examples? If the user clicks off the page before it is fully loaded, then the rendering engine cancels the resource requests, which will cancel the HTTP fetch. This takes a roundtrip to get origin servers not to send any more data.
> 
> Yes. Look at this page: http://caffeinatetheweb.com/with_hints.html?max=100. The page consists of 100 prefetchable javascript files and 100 unprefetchable image files. It is constructed in such a way that each javascript/image pair is requested serially. After all of those are requested the page requests a large (~600 KB) image file. Over a dialup connection, the 80% visible time for the page would be faster if the browser decided NOT to prefetch the large image file. Does that make sense?
> 
> Got it. Just to be clear, this is a completely different example than your former case, since the former involved clicking off the page, whereas this one involves bandwidth contention from resources on the same page. Is there a reason we don't solve this case with flow control (TCP or HTTP/2)? Let the flow control window shrink to 0 so the sender stops hogging the bandwidth.

I am assuming the user clicks away at the 80% visible time, so this isn't really a different example. But it doesn't matter -- there are many scenarios where knowing the expected size of a hinted resource helps the UA make smarter decisions and so load the page faster. Flow control could mitigate the impact, yes, but it's suboptimal because in many cases it acts too late -- we have already filled the pipe for some period of time with data we don't need. Worse still, it takes a round trip to tell the sender to stop. Worse than that, IMO, flow control is complex enough when our primary goal is just to maximize the use of the available bandwidth. Adding realtime prioritization to the mix makes it really easy to create edge case bugs and, as you and Roberto have reported many times, flow control bugs can be very hard to catch and fix.
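
To put a rough number on the round trip cost, here is a back-of-the-envelope sketch in Python. It assumes the sender keeps the bottleneck link full until the stop signal arrives and ignores any data sitting in queues, so treat it as a lower bound. The function name and the link figures (56 kbps with a 300 ms RTT) are illustrative assumptions, not measurements.

```python
def bytes_arriving_after_stop(bandwidth_bps: int, rtt_ms: int, remaining_bytes: int) -> int:
    """Estimate bytes that still arrive after the receiver decides to stop.

    Assumes the sender fills the bottleneck link for one full round trip
    (stop signal going out, in-flight data coming back). Queued data is ignored.
    """
    # bits/s * ms / 1000 gives bits per RTT; / 8 gives bytes. Integer math only.
    in_flight = bandwidth_bps * rtt_ms // 8000
    # We cannot waste more than what is left of the response.
    return min(in_flight, remaining_bytes)


# Hypothetical dialup link: 56 kbps, 300 ms RTT, 600 KB of the response pending.
print(bytes_arriving_after_stop(56_000, 300, 600_000))  # 2100
```

The same estimate grows with the bandwidth-delay product, which is why a stop signal costs more bytes on faster or longer paths.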
> 
> 
>> 
>> This is similar to the argument Ilya made about the UA being in the best position to decide the number of pre-connects for a given host. I don't think that is strictly true for pre-connects, but the UA is surely in the best position to decide whether or not the potential cost of prefetching outweighs the benefit, given last mile bandwidth limitations. And "expected size" gives the UA the information needed to make this determination.
>> 
>> Can you explain why bandwidth contention is better solved with clients holding back large requests rather than holding back low priority requests? jQuery is large, but we download it anyway because it's critical to rendering.
> 
> I agree with you that priority is key. Unfortunately right now resource hint priority is completely based on the type of the object. This doesn't take into consideration size or where the object is on the page. For instance, as in this case, the large image is only visible after scrolling all the way to the bottom of the page and so it should have a lower priority than an identical image that is rendered at the top of the page.
> 
> I agree that priority is suboptimal. We should be, and are, fixing that (we demote priority for images below the viewport once we know what is actually in the viewport, i.e. once we lay out the page). You give a good example of how priority can be broken. But how does expected-size do better in your example?

Prefetching objects based on resource hints is speculative, meaning there is always the risk that the UA will fetch bytes unnecessarily. For that reason we need to provide the UA with information that allows it to make an intelligent decision about whether or not it's worth the risk. On a bandwidth constrained link, cost will be directly proportional to size, so the UA needs size to make this decision. The point I'm making is that priority tells the UA the order in which resources should be fetched to load the page for the best user experience, and it will often be related to the probability that the object will be needed, but it doesn't tell the UA anything about the cost of issuing what could be an unnecessary request.
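
To make the cost argument concrete, here is a rough Python sketch of how a UA might weigh a hint. Everything in it is an assumption for illustration: `expected_size` stands in for the proposed attribute, `probability_needed` is something the UA would have to estimate on its own, and the one-second threshold is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Hint:
    url: str
    expected_size: int         # bytes; the proposed attribute (hypothetical)
    probability_needed: float  # UA's own estimate in [0, 1] (hypothetical)


def wasted_seconds(hint: Hint, bandwidth_bps: float) -> float:
    """Expected link time spent on bytes that end up unused."""
    transfer_seconds = hint.expected_size * 8 / bandwidth_bps
    return transfer_seconds * (1.0 - hint.probability_needed)


def should_prefetch(hint: Hint, bandwidth_bps: float,
                    max_wasted_seconds: float = 1.0) -> bool:
    """Prefetch only if the expected waste stays under the (arbitrary) budget."""
    return wasted_seconds(hint, bandwidth_bps) <= max_wasted_seconds


# A ~600 KB image with an assumed 50% chance of being needed.
big_image = Hint("large.jpg", 600_000, 0.5)
print(should_prefetch(big_image, 56_000))      # assumed dialup link: False
print(should_prefetch(big_image, 50_000_000))  # assumed fast link: True
```

Priority alone cannot produce this answer, because two hints with the same priority can differ by orders of magnitude in what a wrong guess costs.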

Your comment triggers an idea -- it would be great if we could send a boolean flag to indicate that the object will be needed to render the portion of the page within the viewport. Thoughts?

>  
> 
>>  
>> 
>> Ilya,
>> 
>> Where do you think it's best to have this discussion? I think it's really important.
>> 
>> Thanks,
>> 
>> Peter
>> 
>> On Jul 10, 2014, at 5:41 PM, Ilya Grigorik <igrigorik@google.com> wrote:
>> 
>>> Peter, Guy, Mark, thanks for the feedback! Comments inline.
>>> 
>>> On Tue, Jul 8, 2014 at 8:08 PM, Nottingham, Mark <mnotting@akamai.com> wrote:
>>> 2) The "Caching Grace Period" section seems a bit iffy to me... I wouldn't express this as being an override of the caching policy, but rather of it being applied *after* the cache; i.e. the rendering engine itself effectively caches it for next use.
>>> 
>>> Yep, great point. Will suggested the same and I've taken a run at reworking that section. Let me know how this looks:
>>> https://github.com/igrigorik/resource-hints/issues/5
>>> 
>>> 
>>> On Thu, Jul 10, 2014 at 7:27 AM, <bizzbyster@gmail.com> wrote:
>>> On Jul 9, 2014, at 9:20 AM, "Podjarny, Guy" <gpodjarn@akamai.com> wrote:
>>>> Section 2.1 (preconnect):
>>>> One difference between preconnect and dns-prefetch is the cost to the UA. My understanding is that establishing a connection is more expensive in resource utilization than a DNS prefetch. Therefore, dns-prefetch may be used more liberally, and perhaps that’s a reason not to think of dns-prefetch as a preconnect.
>>> 
>>> Fair enough. That said, I think this is something worth experimenting with. All modern browsers are already doing preconnects anyway (without the hint and based on their own heuristics), so I'm not sure that the dns-prefetch upgrade would change the picture by much. I'll see if I can get some data from HTTP Archive on how often dns-prefetch is used in the wild.
>>>> For some domains (e.g. Your CDN domain), it’ll actually be helpful to open multiple connections, not just one (assuming no SPDY/HTTP2). Specifying a number sounds too wrong, but does it make sense to put a weight factor on the preconnect? Maybe a “primary” vs “secondary” domain? Could be getting into the diminishing returns space.
>>> 
>>> I agree that it is helpful to open multiple connections for many domains but don't see a problem with specifying a number. The draft argues that the UA "is in the best position to determine the optimal number" of connections per domain. But this is not always the case. If the server were able to receive and leverage feedback from browsers ("past request patterns" in the draft) then it could know more about the capabilities of various domains. For instance, we see some servers allow a large number of concurrent connections and others enforce strict low limits. I think it makes sense to include a suggested number of connections in the pre-connect hint. The UA is free to ignore that suggestion.
>>> 
>>> I understand the motivation, but I still think this exposes knobs that should be left to the user agent. The number of connections will vary by the user's connection type, time of day, protocol, and so on, all of which are dynamic. Browsers already track this kind of information and adapt their logic to take this into account - e.g. chrome://dns/ (see "Expected Connects" column). With HTTP/2 this is also unnecessary (and I say that with awareness of your recent thread on http-wg on the subject :)).
>>>> Section 2.2 (preload):
>>>> With today’s implementations, double downloading of preloaded resources is a major issue. Would be good to make some explicit definitions about how to handle a resource that has been requested as a preload resource already and is now seen on the page. An obvious rule should be to not double download, but others may be more complex (e.g. What if we communicated a low prio via SPDY/HTTP2?)
>>> 
>>> - Matching retained responses with requests: https://igrigorik.github.io/resource-hints/#matching-request
>>> - (Re)prioritization: https://github.com/igrigorik/resource-hints/issues/1 
>>>> Content type as text sounds a bit error prone. Would “text/javascript” cover “x-application/javascript” too? Is there a way to normalize content types? 
>>> 
>>> Jake proposed using "context" instead, which I really like, but need to do some more digging on:
>>> https://github.com/igrigorik/resource-hints/issues/6
>>>> Should preload resources delay unload? (my vote is no)
>>> 
>>> Preload hints are for the *current* page. As a result they are cancelled as part of onunload. If you need the request to span across navigations, you should be using prefetch, which is used to load resources for next navigation.
>>>> What should preload (and prefetch) do in case the resource request got a temp error (e.g. 502)
>>>> How should the UA handle cookies set in a response to a preload (or prefetch)?
>>> 
>>> Hint-initiated requests are not special in any way: think of any request initiated by the preload scanner today, all the same behaviors apply here. If the request fails, it fails; it sends the same cookies, and so on.
>>>  
>>> A few more to add to this list:
>>> How to handle the fact that cookies may have changed between the requesting of the preload resource and the requesting of the resource by the renderer either due to a Set-Cookie or a locally executed javascript? Perhaps we could add an attribute that indicates that the resource is cookie-sensitive or not?
>>> We don't. You have the same race condition today with the preload scanner sending early fetches for JS/CSS.  
>>> The earlier spec discussed preload hints URLs that were generated via javascript. I think this is very powerful as it allows the browser to preload dynamically generated URLs. For instance, URLs with sessionID or Date or rand appended to them.
>>> Yes, this is what preconnect covers. We don't know the URL, but if we know the origin we can at least complete the handshake:
>>> https://igrigorik.github.io/resource-hints/#preconnect (example 1)
>>> By viewing past behavior, especially with resource priorities in HTTP/2, the server can actually know quite a bit more than the browser about resource priorities. It would be great if the server could provide the UA with a hint as to the importance of the object. Is it a serializing resource? Does visual completeness depend on it? This information could be communicated via a simple low, medium, high type scheme and the spec could provide description as to what these values mean.
>>> That's what type/context is meant to provide. Today we don't have a more fine grained mechanism to communicate priorities. Perhaps we should, but I think that's a separate discussion. 
>>> In order to make optimal use of a connection pool, it's valuable to know the size of a resource ahead of time. Is it possible to include an expected-size attribute?
>>> *If* we were to consider introducing this, I don't think it's limited to hints. As such, I think this is a separate discussion.
>>> As per our other email thread, we should include object version information in the preload hint so as to minimize cache re-validation requests.
>>> I still think that subresource integrity is the right place to discuss this: http://www.w3.org/TR/SRI/#caching-optional-1
>>>> Section 3.4 (Caching grace period):
>>>> Since these hints are explicitly added, I think we can be a bit more strict in what we require.
>>>> My vote would be:
>>>> For preload, do the same thing the preparser does, which I believe means use the resource regardless of whether it’s cacheable, as long as you’re in the midst of loading the current page (may require some definition of when has the page finished loading, which I suspect the preloader deals with too). If you go to another page, revert to the cache instructions on the resource.
>>> +1
>>> 
>>> Yes, this should be covered by the new "matching requests" section.
>>>> For prefetch & prerender, use the cache instructions (no grace period or changes)
>>> 
>>> This would cripple prefetch and prerender because most dynamic content is marked as non-cacheable. Think of prerender as opening a background tab (or middle click, if you prefer), except that the tab is invisible and is then instantly swapped in on navigation as long as it hasn't expired (insert reasonable TTL here; Chrome uses 300 seconds).
>>>> A couple of additional questions:
>>>> How would these hints, and specifically preload, interact with Client Hints? 
>>> 
>>> Assuming CH is implemented, the relevant hints would be advertised on the outbound request. Nothing special.
>>>> What about srcset and the picture element (e.g. Native conditional loading mechanisms)?
>>> 
>>> I don't see any concerns here. If you have conditional loading then you must evaluate those conditions. With native <picture> those conditions will be executed by the preparser (yay) if the main doc parser is blocked... Yes, you may not be able to stick a Link header hint or put a <link> hint in the head of the doc, but such is the cost of conditional fetches. On the other hand, if you *know* you need a specific file regardless, feel free to hint it.
>>> 
>>> ig
>> 
>> 
> 
> 

Received on Wednesday, 16 July 2014 13:56:49 UTC