- From: Ilya Grigorik <igrigorik@google.com>
- Date: Tue, 23 Dec 2014 15:16:54 -0500
- To: Steve Souders <steve@souders.org>
- Cc: Patrick Meenan <pmeenan@webpagetest.org>, Peter Lepeska <bizzbyster@gmail.com>, Nic Jansma <nic@nicj.net>, Yoav Weiss <yoav@yoav.ws>, public-web-perf <public-web-perf@w3.org>
- Message-ID: <CADXXVKpN8FNpJLV6y2J3sqWa-tcwbSa0fMuuKzFa_bMpuX0N_w@mail.gmail.com>
On Mon, Dec 22, 2014 at 3:22 PM, Steve Souders <steve@souders.org> wrote:

>> Sure, but that's mostly an educational and easily fixable problem on
>> their end... Short of (2b) case.
>
> It's more than an educational problem. Developers typically look at code
> and object properties before documentation and tutorials. "Duration" is
> short and encompassing. It'll be the first choice. The people I've seen who
> have already made this mistake come from smart, webperf cutting edge
> organizations, as evidenced by the fact that they're using Resource Timing
> in production systems. If the cutting edge gurus make the mistake it's
> likely that we need more than education.

I disagree with this. I think the current definition of "duration" is
correct and, in fact, is exactly what applications should be measuring:
time from the moment you requested the resource to when it is available.
This includes time to check the appropriate caches, which is non-zero and
can be in the tens or hundreds of milliseconds, connection setup time,
blocking time due to head-of-line blocking (an HTTP/1 artifact), and the
actual transfer times. If you want to exclusively measure the "network
transfer time" and exclude cache and blocking overhead, then you should do
that as a separate metric... I think what you're pointing out here is that
most people assume that cache lookups are effectively free, and HTTP/1 HoL
is not a problem... and that, to me, is an education problem, not a metric
problem.

>> Could we instrument HTTP Archive to log blocking time for each resource?
>
> I accept pull requests. ;-) But given that the average website has 50+
> resources on a single hostname
> <http://httparchive.org/trends.php#numDomains&maxDomainReqs> that's 44
> requests that have blocking time.

But not all of them are dispatched simultaneously either: some are delayed
because they're declared later in the document, some have to wait for
layout (e.g. CSS spec'ed resources), and others may be scheduled via JS,
etc.
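
As an illustration of what logging per-resource blocking time could look like, here is a minimal sketch against HAR data. It assumes a HAR 1.2 log in which each entry may carry a `timings.blocked` value in milliseconds, with -1 meaning "does not apply". The helper name `summarizeBlocked` and the sample log are hypothetical and not taken from HTTP Archive or WPT.

```javascript
// Hypothetical helper: summarize queueing ("blocked") time across the
// entries of a HAR 1.2 log object.
function summarizeBlocked(har) {
  const entries = (har.log && har.log.entries) || [];
  // Keep only entries that report a usable blocked time; HAR uses -1
  // when the timing does not apply to the request.
  const blocked = entries
    .map((e) => (e.timings ? e.timings.blocked : undefined))
    .filter((ms) => typeof ms === "number" && ms >= 0);
  const totalBlockedMs = blocked.reduce((sum, ms) => sum + ms, 0);
  return {
    requests: entries.length,
    requestsBlocked: blocked.filter((ms) => ms > 0).length,
    totalBlockedMs,
    maxBlockedMs: blocked.length ? Math.max(...blocked) : 0,
  };
}

// Made-up sample data, for illustration only.
const sampleHar = {
  log: {
    entries: [
      { request: { url: "https://example.com/a.js" }, timings: { blocked: 0, wait: 40, receive: 10 } },
      { request: { url: "https://example.com/b.js" }, timings: { blocked: 120, wait: 35, receive: 12 } },
      { request: { url: "https://example.com/c.png" }, timings: { blocked: -1, wait: 20, receive: 5 } },
    ],
  },
};

console.log(summarizeBlocked(sampleHar));
```

The summary only counts time the exporter chose to report as "blocked"; delays from late discovery in the document or from JS scheduling would not show up in this field.
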
It'd be good to understand how this looks in the real world. The good news
is that HAR already appears to capture this in "timings: { blocked: ...}":
http://www.softwareishard.com/blog/har-12-spec/#timings. I verified that
both WPT and Chrome HAR export report the metric. So, we already have the
data in the raw WPT results... "just" need to pull it out ;)

>> But isn't this the same problem in a different disguise?
>
> Yes, but not as significant.

We have research showing that even flash I/O can be very expensive [1],
and it mirrors some of the metrics we've gathered in the past in Chrome.
Plus, in addition to slow I/O we also have thread hopping, etc., all of
which adds non-trivial overhead. I'm not convinced we can just sweep this
under the rug.

ig

[1] http://dl.acm.org/citation.cfm?id=2385607

> On 12/22/14 9:25 AM, Ilya Grigorik wrote:
>
> On Wed, Dec 17, 2014 at 9:53 PM, Steve Souders <steve@souders.org> wrote:
>
>> The use cases were CDNs, RUM providers, and website owners using Resource
>> Timing's duration to measure (what they thought was) download time of
>> resources. In fact, one of the RUM providers (Buddy from SOASTA) did a
>> preso at WebPerfDays showing code to track "duration" and captured it in a
>> property called "downloadtime" - so everyone in that audience now thinks
>> "duration" means "download time". Bummer!
>
> Sure, but that's mostly an educational and easily fixable problem on
> their end... Short of (2b) case.
>
>> For the (2b) case (different origin & you don't control it so can't add
>> TAO header), you're right that sometimes there's no action the website
>> owner can take. For example, if the Twitter widget loads other scripts &
>> images dynamically, there's not much the website owner can do. But there
>> are *numerous* situations where the timing of (2b) content IS actionable.
>> If the website owner was able to distinguish blocking time from download
>> time they'd be able to make the right decision and take action.
>> For example:
>>    - fonts - These are blocking the page from rendering. If it's because
>>    the fonts are slow to download, then I might want to switch font
>>    providers. If it's because of blocking, then I might want to preload
>>    or prefetch the fonts.
>>    - ads - I moved the ad in my page and clickthroughs dropped off
>>    significantly. Is that because the ad content is blocked or slow? Or
>>    something else?
>>    - JS libs - I might want to find out if
>>    https://code.jquery.com/jquery-2.1.2.min.js is loading slowly on my
>>    site because it's blocked or just slow to download. Again, there are
>>    many actions the website owner can take - load it async, prefetch it,
>>    host it locally, get it from Google CDN.
>
> As an aside... I'm wondering if we can gather some data on how often
> this is actually a problem? Could we instrument HTTP Archive to log
> blocking time for each resource?
>
>> Choosing a name is hard because I assume we do NOT want to reveal whether
>> the object was read from cache for cross-origin resources. Thus,
>> "networkDuration" could actually not involve any network requests at all. I
>> thought about calling it "loadtime" since that covers loading it over the
>> network or from cache. Again, I'm not insistent on "networkDuration" and
>> would love better name brainstorming.
>
> But isn't this the same problem in a different disguise? I thought I was
> measuring the latency of my CDN, but I'm actually measuring latency of my
> cache lookup plus the CDN fetch, where the former can easily take tens if
> not hundreds of milliseconds... and, crazily enough, be higher than the
> actual network fetch.
>
> ig
>
>> On 12/4/14 9:13 AM, Ilya Grigorik wrote:
>>
>> On Mon, Nov 24, 2014 at 4:34 PM, Steve Souders <steve@souders.org>
>> wrote:
>>
>>> LONG: A few weeks ago I discovered that "duration" includes blocking
>>> time, so "duration" is greater than the actual network time needed to
>>> download the resource.
>>> Since then I've been at Velocity and WebPerfDays
>>> where many people have shown their Resource Timing code. Everyone I spoke
>>> to (~5 different teams) assumed that "duration" was just the network time.
>>> When I explained that it also includes blocking they were surprised,
>>> admitted they hadn't known that, and agreed it is NOT the metric they
>>> were trying to capture.
>>
>> Steve, can you elaborate on the use case a bit more? Who's measuring
>> what here, and for what purpose? Are we benchmarking CDN performance?
>>
>> In terms of getting access to the data, we have the following cases:
>> 1) same origin resources: full access to timing data.
>> 2) different origin:
>>    a) if you control it, add TAO header for full access to timing data.
>>    b) if you don't control it, you only have "duration"
>>
>> For (1) and (2a), I can see why you may want or need to get low-level
>> "network duration" data: you want to track your provider's DNS performance,
>> latency to your CDN, TTFB, total response time, and so on. You care about
>> this because this is something *you can affect*. However, for (2b)... this
>> same data falls into the interesting but not actionable bucket? Further, it
>> seems like if you are actually interested in benchmarking your CDN, then
>> you really should be looking deeper than just total time: you want to
>> decompose DNS, TCP, TLS, HTTP req>resp cycles. At which point... you need
>> the full timing object anyway.
>>
>>> I propose we add a new property to Resource Timing that reflects the
>>> time to actually load the resource excluding blocking time. I'm flexible
>>> about the name but for purposes of this discussion let's call it
>>> "networkDuration". The important piece of this proposal is that
>>> "networkDuration" should be available for all resources, similar to
>>> "duration".
>>> In other words, it should be available for same origin as well
>>> as cross origin resources as part of the PerformanceEntry
>>> <http://www.w3.org/TR/performance-timeline/#performanceentry> interface.
>>
>> Note that "blocking time" is a thing of the past for SPDY and HTTP/2,
>> as this demo demonstrates really well: http://www.httpvshttps.com/
>>
>> I'm skeptical of above definition: if you want "network duration", you
>> should also exclude cache time; it's a computed metric that you can access
>> today with TAO and a redundant one with http/2; if you really care about
>> "network duration" you should probably decompose it further, but at that
>> point it becomes a conversation about removing the TAO restriction.
>>
>> ig
>>
>> P.S. "networkDuration = dns + tcp + waiting + content" ... don't forget
>> the https handshake!
>>
>> On Wed, Nov 26, 2014 at 9:01 AM, Patrick Meenan <pmeenan@webpagetest.org>
>> wrote:
>>
>>> Would be great to see it either as a high-level duration or as an
>>> unblocking of the redirectStart time for cross-origin (though it may still
>>> not be clear to people that that is the time they really care about).
>>>
>>> I expect the current logic was the easiest and didn't require any
>>> privacy reviews because it's quite literally the exact same detail that you
>>> get if you do it manually in javascript by creating an element and
>>> listening to the onload. Even if the more-granular detail doesn't really
>>> expose anything you couldn't figure out before it does provide additional
>>> detail that wouldn't otherwise be measurable and is probably going to
>>> require reviews by privacy and security teams.
>>>
>>> On Wed, Nov 26, 2014 at 9:36 AM, Peter Lepeska <bizzbyster@gmail.com>
>>> wrote:
>>>
>>>> +1
>>>>
>>>> On Tue, Nov 25, 2014 at 12:31 PM, Nic Jansma <nic@nicj.net> wrote:
>>>>
>>>>> Good point! Hadn't considered that, so yes I would agree it's a
>>>>> very valuable addition to consider.
>>>>>
>>>>> As far as what interface to put it on, I'm not sure networkDuration
>>>>> would make sense for UserTiming, for example. While it could sit on
>>>>> PerformanceEntry and just be "0" for interfaces that aren't applicable, we
>>>>> could also create a PerformanceNetworkEntry interface (with
>>>>> networkDuration) that PerformanceResourceTiming inherits from, while
>>>>> PerformanceUserTiming only inherits from PerformanceEntry.
>>>>>
>>>>> That's all minor details though. Really depends on the browser
>>>>> privacy teams OK'ing the addition.
>>>>>
>>>>> - Nic
>>>>> http://nicj.net/
>>>>> @NicJ
>>>>>
>>>>> On 11/25/2014 12:16 PM, Steve Souders wrote:
>>>>>
>>>>> Nic -
>>>>>
>>>>> You can *not* calculate networkDuration from other attributes for
>>>>> *cross-origin* resources. That's why I'm suggesting adding this to
>>>>> PerformanceEntry (rather than PerformanceResourceTiming).
>>>>>
>>>>> And as mentioned, about 50% of resources are cross-origin so it's
>>>>> important to provide a means for *accurate* download time measurements.
>>>>>
>>>>> -Steve
>>>>>
>>>>>
>>>>> On 11/25/14, 8:02 AM, Nic Jansma wrote:
>>>>>
>>>>> Steve,
>>>>>
>>>>> The only downside I see is that we're adding a new attribute that can
>>>>> be entirely calculated via other attributes.
>>>>>
>>>>> One alternate (or additional thing) would be to highlight this point
>>>>> in the description for "duration" in the spec.
>>>>>
>>>>> - Nic
>>>>> http://nicj.net/
>>>>> @NicJ
>>>>>
>>>>> On 11/25/2014 3:04 AM, Yoav Weiss wrote:
>>>>>
>>>>>
>>>>> On Tue, Nov 25, 2014 at 1:34 AM, Steve Souders <steve@souders.org>
>>>>> wrote:
>>>>>
>>>>>> SHORT: I propose we add the "networkDuration" property to
>>>>>> PerformanceEntry
>>>>>> <http://www.w3.org/TR/performance-timeline/#performanceentry>
>>>>>> objects.
>>>>>>
>>>>>> LONG: A few weeks ago I discovered that "duration" includes blocking
>>>>>> time, so "duration" is greater than the actual network time needed to
>>>>>> download the resource.
>>>>>> Since then I've been at Velocity and WebPerfDays
>>>>>> where many people have shown their Resource Timing code. Everyone I spoke
>>>>>> to (~5 different teams) assumed that "duration" was just the network time.
>>>>>> When I explained that it also includes blocking they were surprised,
>>>>>> admitted they hadn't known that, and agreed it is NOT the metric they
>>>>>> were trying to capture.
>>>>>>
>>>>>> I propose we add a new property to Resource Timing that reflects the
>>>>>> time to actually load the resource excluding blocking time. I'm flexible
>>>>>> about the name but for purposes of this discussion let's call it
>>>>>> "networkDuration". The important piece of this proposal is that
>>>>>> "networkDuration" should be available for all resources, similar to
>>>>>> "duration". In other words, it should be available for same origin as well
>>>>>> as cross origin resources as part of the PerformanceEntry
>>>>>> <http://www.w3.org/TR/performance-timeline/#performanceentry>
>>>>>> interface.
>>>>>>
>>>>>> Same origin resources can calculate "networkDuration" as follows
>>>>>> (assume "r" is a PerformanceResourceTiming object):
>>>>>>
>>>>>>    dns = r.domainLookupEnd - r.domainLookupStart;
>>>>>>    tcp = r.connectEnd - r.connectStart; // includes ssl negotiation
>>>>>>    waiting = r.responseStart - r.requestStart; // aka "TTFB"
>>>>>>    content = r.responseEnd - r.responseStart;
>>>>>>    networkDuration = dns + tcp + waiting + content;
>>>>>>
>>>>>> I've discussed this with a few people and the only concern I've heard
>>>>>> is with regard to privacy along the lines of "if we exclude blocking we've
>>>>>> added the ability to distinguish cache reads from network fetches". This
>>>>>> isn't an issue for two reasons:
>>>>>>
>>>>>>    1. Even with the exclusion of blocking time, it's still possible
>>>>>>    for "networkDuration" to have a non-zero value for resources read from
>>>>>>    cache due to disk access time, etc. Therefore, excluding blocking time
>>>>>>    does not necessarily provide a clear means of determining resources
>>>>>>    read from cache.
>>>>>>    2. This concern assumes that adding "networkDuration" lessens
>>>>>>    privacy because removing blocking time provides additional information
>>>>>>    that is not available today. However, it's possible to exclude blocking
>>>>>>    time today by loading a cross-origin resource after window.onload, when
>>>>>>    there is no blocking contention.
>>>>>>
>>>>>> Therefore, individuals who have JavaScript access to a page and can
>>>>>> measure duration also have enough access to load resources after
>>>>>> window.onload and can thus determine the duration excluding blocking time.
>>>>>> Adding "networkDuration" does not give these individuals additional
>>>>>> information beyond what is measurable today.
>>>>>>
>>>>>> What "networkDuration" provides is additional information for the
>>>>>> normal case of resources that are loaded as part of the main page when
>>>>>> blocking contention may occur. This will give current web developers the
>>>>>> metric they want for cross-origin resources, and will provide it more
>>>>>> simply for same origin resources.
>>>>>
>>>>> Assuming that the privacy concerns are in fact non-existent, a big
>>>>> +1.
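
The formula from the original proposal can be sketched as a small function. This is a minimal sketch, not a proposed implementation: the helper name `networkDuration` and the sample entries are hypothetical, and it assumes that a cross-origin entry without Timing-Allow-Origin reports its detailed timestamps as 0, in which case the breakdown cannot be computed.

```javascript
// Hypothetical helper applying the proposal's formula to a
// PerformanceResourceTiming-like object.
function networkDuration(r) {
  // Restricted cross-origin entries (no Timing-Allow-Origin) expose the
  // detailed timestamps as 0, so there is nothing to add up.
  if (!r.requestStart || !r.responseStart) return null;
  const dns = r.domainLookupEnd - r.domainLookupStart;
  const tcp = r.connectEnd - r.connectStart; // includes TLS negotiation
  const waiting = r.responseStart - r.requestStart; // aka "TTFB"
  const content = r.responseEnd - r.responseStart;
  return dns + tcp + waiting + content;
}

// Made-up sample entries, for illustration only (times in milliseconds).
const sameOriginEntry = {
  startTime: 100, duration: 400,
  domainLookupStart: 250, domainLookupEnd: 270,
  connectStart: 270, connectEnd: 330,
  requestStart: 330, responseStart: 430, responseEnd: 500,
};
const crossOriginEntry = {
  startTime: 100, duration: 400,
  domainLookupStart: 0, domainLookupEnd: 0,
  connectStart: 0, connectEnd: 0,
  requestStart: 0, responseStart: 0, responseEnd: 500,
};

const net = networkDuration(sameOriginEntry);
// Whatever "duration" covers beyond the network phases: queueing, cache
// lookup, and other non-network time, so only a rough blocking estimate.
const nonNetwork = sameOriginEntry.duration - net;
console.log({ net, nonNetwork, restricted: networkDuration(crossOriginEntry) });
```

In a browser the same function could be applied to the entries returned by `performance.getEntriesByType("resource")`. The `null` result for the restricted entry is the (2b) case from the thread, where only "duration" is available.
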
Received on Tuesday, 23 December 2014 20:18:02 UTC