- From: Steve Souders <steve@souders.org>
- Date: Tue, 23 Dec 2014 12:25:14 -0800
- To: Ilya Grigorik <igrigorik@google.com>
- CC: Patrick Meenan <pmeenan@webpagetest.org>, Peter Lepeska <bizzbyster@gmail.com>, Nic Jansma <nic@nicj.net>, Yoav Weiss <yoav@yoav.ws>, public-web-perf <public-web-perf@w3.org>
- Message-ID: <5499CFAA.1070004@souders.org>
> I think the current definition of "duration" is correct
I've never claimed that the definition of "duration" is incorrect.
Instead, I'm suggesting that we add a new metric called something like
"networkDuration".
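
To make that concrete, here is a minimal sketch of the value I have in mind, using the same-origin formula from my original proposal further down this thread. The function name and the sample timestamps are made up for illustration; nothing here is part of any spec.

```javascript
// Sketch: compute the proposed "networkDuration" for an entry that exposes
// full timing detail (same origin, or cross-origin with Timing-Allow-Origin).
// The function name is hypothetical; it is not part of Resource Timing.
function computeNetworkDuration(r) {
  const dns = r.domainLookupEnd - r.domainLookupStart;
  const tcp = r.connectEnd - r.connectStart; // includes TLS negotiation
  const waiting = r.responseStart - r.requestStart; // aka "TTFB"
  const content = r.responseEnd - r.responseStart;
  return dns + tcp + waiting + content;
}

// Made-up entry (milliseconds): the request sat idle for 250 ms before DNS.
const sample = {
  startTime: 100, fetchStart: 100,
  domainLookupStart: 350, domainLookupEnd: 370,
  connectStart: 370, connectEnd: 420,
  requestStart: 420, responseStart: 500, responseEnd: 560,
  duration: 460, // responseEnd - startTime
};

console.log(computeNetworkDuration(sample)); // 210
console.log(sample.duration); // 460
```

In this made-up case "duration" is more than double the network time, which is the gap people are not expecting.
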
> If you want to exclusively measure the "network transfer time" and
exclude cache and blocking overhead, then you should do that as a
separate metric
Yes, that's it exactly.
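
A rough sketch of what "separate metric" looks like with today's API, and where it stops working. This assumes that an entry restricted by the Timing-Allow-Origin check reports 0 for its detailed timestamps; the function name and sample values are hypothetical.

```javascript
// Sketch: split "duration" into a network portion and everything else
// (blocking, cache lookup, and so on). Returns null when the entry hides its
// detailed timestamps, i.e. cross-origin without a Timing-Allow-Origin header.
// The function name is hypothetical.
function splitDuration(r) {
  // Assumption: restricted entries report 0 for requestStart and responseStart.
  if (r.requestStart === 0 && r.responseStart === 0) {
    return null; // only "duration" is available for this entry
  }
  const network =
    (r.domainLookupEnd - r.domainLookupStart) +
    (r.connectEnd - r.connectStart) +
    (r.responseStart - r.requestStart) +
    (r.responseEnd - r.responseStart);
  return { network, other: r.duration - network };
}

// Made-up entries (milliseconds) describing the same fetch.
const fullDetail = {
  startTime: 100, fetchStart: 100,
  domainLookupStart: 350, domainLookupEnd: 370,
  connectStart: 370, connectEnd: 420,
  requestStart: 420, responseStart: 500, responseEnd: 560,
  duration: 460,
};
const restricted = {
  startTime: 100, fetchStart: 100,
  domainLookupStart: 0, domainLookupEnd: 0,
  connectStart: 0, connectEnd: 0,
  requestStart: 0, responseStart: 0, responseEnd: 560,
  duration: 460,
};

console.log(splitDuration(fullDetail)); // { network: 210, other: 250 }
console.log(splitDuration(restricted)); // null
```

The second case is the one I care about: the split cannot be computed in script, so it would have to be exposed by the browser.
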
-Steve
On 12/23/14 12:16 PM, Ilya Grigorik wrote:
> On Mon, Dec 22, 2014 at 3:22 PM, Steve Souders <steve@souders.org
> <mailto:steve@souders.org>> wrote:
>
> > Sure, but that's mostly an educational and easily fixable
> problem on their end... Short of (2b) case.
>
> It's more than an educational problem. Developers typically look
> at code and object properties before documentation and tutorials.
> "Duration" is short and encompassing. It'll be the first choice.
> The people I've seen who have already made this mistake come from
> smart, webperf cutting edge organizations, as evidenced by the
> fact that they're using Resource Timing in production systems. If
> the cutting edge gurus make the mistake it's likely that we need
> more than education.
>
>
> I disagree with this. I think the current definition of "duration" is
> correct and, in fact, is exactly what applications should be
> measuring: time from the moment you requested the resource to when it
> is available. This includes time to check the appropriate caches,
> which is non-zero and can be in tens and hundreds of milliseconds,
> connection setup time, blocking time due to head-of-line blocking
> (http/1 artifact), and the actual transfer times.
>
> If you want to exclusively measure the "network transfer time" and
> exclude cache and blocking overhead, then you should do that as a
> separate metric... I think what you're pointing out here is that most
> people assume that cache lookups are effectively free, and http/1 HoL
> is not a problem... and that, to me, is an education problem, not a
> metric problem.
>
> > Could we instrument HTTP Archive to log blocking time for each
> resource?
> I accept pull requests. ;-) But given that the average website has
> 50+ resources on a single hostname
> <http://httparchive.org/trends.php#numDomains&maxDomainReqs>
> that's 44 requests that have blocking time.
>
>
> But not all of them are dispatched simultaneously either: some are
> delayed because they're declared later in the document, some have to
> wait for layout (e.g. CSS spec'ed resources), and others may be
> scheduled via JS, etc. It'd be good to understand how this looks in
> the real world.
>
> Good news is, looks like HAR already captures this in "timings: {
> blocked: ...}":
> http://www.softwareishard.com/blog/har-12-spec/#timings. I verified
> that both WPT and Chrome HAR export report the metric. So, we already
> have the data in the raw WPT results... "just" need to pull it out ;)
>
> > But isn't this the same problem in a different disguise?
> Yes, but not as significant.
>
>
> We have research showing that even flash I/O can be very expensive
> [1], and it mirrors some of the metrics we've gathered in the past in
> Chrome. Plus, in addition to slow I/O we also have thread hopping,
> etc., all of which add non-trivial overhead. I'm not convinced we can
> just sweep this under the rug.
>
> ig
>
> [1] http://dl.acm.org/citation.cfm?id=2385607
>
> On 12/22/14 9:25 AM, Ilya Grigorik wrote:
>> On Wed, Dec 17, 2014 at 9:53 PM, Steve Souders <steve@souders.org
>> <mailto:steve@souders.org>> wrote:
>>
>> The use cases were CDNs, RUM providers, and website owners
>> using Resource Timing's duration to measure (what they
>> thought was) download time of resources. In fact, one of the
>> RUM providers (Buddy from SOASTA) did a preso at WebPerfdays
>> showing code to track "duration" and captured it in a
>> property called "downloadtime" - so everyone in that audience
>> now thinks "duration" means "download time". Bummer!
>>
>>
>> Sure, but that's mostly an educational and easily fixable problem
>> on their end... Short of (2b) case.
>>
>> For the (2b) case (different origin & you don't control it so
>> can't add TAO header), you're right that sometimes there's no
>> action the website owner can take. For example, if the
>> Twitter widget loads other scripts & images dynamically,
>> there's not much the website owner can do. But there are
>> *numerous* situations where the timing of (2b) content IS
>> actionable. If the website owner was able to distinguish
>> blocking time from download time they'd be able to make the
>> right decision and take action. For example:
>> - fonts - These are blocking the page from rendering. If
>> it's because the fonts are slow to download, then I might
>> want to switch font providers. If it's because of blocking,
>> then I might want to preload or prefetch the fonts.
>> - ads - I moved the ad in my page and clickthroughs
>> dropped off significantly. Is that because the ad content is
>> blocked or slow? Or something else?
>> - JS libs - I might want to find out if
>> https://code.jquery.com/jquery-2.1.2.min.js is loading slow
>> on my site because it's blocked or just slow to download.
>> Again, there are many actions the website owner can take -
>> load it async, prefetch it, host it locally, get it from
>> Google CDN.
>>
>>
>> As an aside... I'm wondering if we can gather some data on how
>> often this is actually a problem? Could we instrument HTTP
>> Archive to log blocking time for each resource?
>>
>> Choosing a name is hard because I assume we do NOT want to
>> reveal whether the object was read from cache for
>> cross-origin resources. Thus, "networkDuration" could
>> actually not involve any network requests at all. I thought
>> about calling it "loadtime" since that covers loading it over
>> the network or from cache. Again, I'm not insistent on
>> "networkDuration" and would love better name brainstorming.
>>
>>
>> But isn't this the same problem in a different disguise? I
>> thought I was measuring the latency of my CDN, but I'm actually
>> measuring latency of my cache lookup plus the CDN fetch, where
>> the former can easily take tens if not hundreds of milliseconds
>> and, crazily enough, be higher than the actual network fetch.
>>
>> ig
>>
>> On 12/4/14 9:13 AM, Ilya Grigorik wrote:
>>> On Mon, Nov 24, 2014 at 4:34 PM, Steve Souders
>>> <steve@souders.org <mailto:steve@souders.org>> wrote:
>>>
>>> LONG: A few weeks ago I discovered that "duration"
>>> includes blocking time, so "duration" is greater than
>>> the actual network time needed to download the resource.
>>> Since then I've been at Velocity and WebPerfDays where
>>> many people have shown their Resource Timing code.
>>> Everyone I spoke to (~5 different teams) assumed that
>>> "duration" was just the network time. When I explained
>>> that it also includes blocking they were surprised,
>>> admitted they hadn't known that, and agreed it is NOT
>>> the metric they were trying to capture.
>>>
>>>
>>> Steve, can you elaborate on the use case a bit more? Who's
>>> measuring what here, and for what purpose? Are we
>>> benchmarking CDN performance?
>>>
>>> In terms of getting access to the data, we have the
>>> following cases:
>>> 1) same origin resources: full access to timing data.
>>> 2) different origin:
>>> a) if you control it, add TAO header for full access to
>>> timing data.
>>> b) if you don't control it, you only have "duration"
>>>
>>> For (1) and (2a), I can see why you may want or need to get
>>> low-level "network duration" data: you want to track your
>>> provider's DNS performance, latency to your CDN, TTFB, total
>>> response time, and so on. You care about this because this
>>> is something *you can affect*. However, for (2b)... this
>>> same data falls into the "interesting but not actionable" bucket?
>>> Further, it seems like if you are actually interested in
>>> benchmarking your CDN, then you really should be looking
>>> deeper than just total time: you want to decompose DNS, TCP,
>>> TLS, HTTP req>resp cycles. At which point you need the
>>> full timing object anyway.
>>>
>>> I propose we add a new property to Resource Timing that
>>> reflects the time to actually load the resource
>>> excluding blocking time. I'm flexible about the name but
>>> for purposes of this discussion let's call it
>>> "networkDuration". The important piece of this proposal
>>> is that "networkDuration" should be available for all
>>> resources, similar to "duration". In other words, it
>>> should be available for same origin as well as cross
>>> origin resources as part of the PerformanceEntry
>>> <http://www.w3.org/TR/performance-timeline/#performanceentry> interface.
>>>
>>>
>>> Note that "blocking time" is a thing of the past for SPDY
>>> and HTTP/2, as this demo demonstrates really well:
>>> http://www.httpvshttps.com/
>>>
>>> I'm skeptical of the above definition: if you want "network
>>> duration", you should also exclude cache time; it's a
>>> computed metric that you can access today with TAO and a
>>> redundant one with http/2; if you really care about "network
>>> duration" you should probably decompose it further, but at
>>> that point it becomes a conversation about removing the TAO
>>> restriction.
>>>
>>> ig
>>>
>>> P.S. "networkDuration = dns + tcp + waiting + content" ...
>>> don't forget the https handshake!
>>>
>>> On Wed, Nov 26, 2014 at 9:01 AM, Patrick Meenan
>>> <pmeenan@webpagetest.org <mailto:pmeenan@webpagetest.org>>
>>> wrote:
>>>
>>> Would be great to see it either as a high-level duration
>>> or as an unblocking of the redirectStart time for
>>> cross-origin (though it may still not be clear to people
>>> that that is the time they really care about).
>>>
>>> I expect the current logic was the easiest and didn't
>>> require any privacy reviews because it's quite literally
>>> the exact same detail that you get if you do it manually
>>> in javascript by creating an element and listening to
>>> the onload. Even if the more-granular detail doesn't
>>> really expose anything you couldn't figure out before it
>>> does provide additional detail that wouldn't otherwise
>>> be measurable and is probably going to require reviews
>>> by privacy and security teams.
>>>
>>> On Wed, Nov 26, 2014 at 9:36 AM, Peter Lepeska
>>> <bizzbyster@gmail.com <mailto:bizzbyster@gmail.com>> wrote:
>>>
>>> +1
>>>
>>> On Tue, Nov 25, 2014 at 12:31 PM, Nic Jansma
>>> <nic@nicj.net <mailto:nic@nicj.net>> wrote:
>>>
>>> Good point! Hadn't considered that, so yes I
>>> would agree it's a very valuable addition to
>>> consider.
>>>
>>> As far as what interface to put it on, I'm not
>>> sure networkDuration would make sense for
>>> UserTiming, for example. While it could sit on
>>> PerformanceEntry and just be "0" for interfaces
>>> that aren't applicable, we could also create a
>>> PerformanceNetworkEntry interface (with
>>> networkDuration) that PerformanceResourceTiming
>>> inherits from, while PerformanceUserTiming only
>>> inherits from PerformanceEntry.
>>>
>>> That's all minor details though. Really depends
>>> on the browser privacy teams OK'ing the addition.
>>>
>>> - Nic
>>> http://nicj.net/
>>> @NicJ
>>>
>>> On 11/25/2014 12:16 PM, Steve Souders wrote:
>>>> Nic -
>>>>
>>>> You can *not* calculate networkDuration from
>>>> other attributes for *cross-origin* resources.
>>>> That's why I'm suggesting adding this to
>>>> PerformanceEntry (rather than
>>>> PerformanceResourceTiming).
>>>>
>>>> And as mentioned, about 50% of resources are
>>>> cross-origin so it's important to provide a
>>>> means for *accurate* download time measurements.
>>>>
>>>> -Steve
>>>>
>>>>
>>>> On 11/25/14, 8:02 AM, Nic Jansma wrote:
>>>>> Steve,
>>>>>
>>>>> The only downside I see is that we're adding a
>>>>> new attribute that can be entirely calculated
>>>>> via other attributes.
>>>>>
>>>>> One alternate (or additional thing) would be
>>>>> to highlight this point in the description for
>>>>> "duration" in the spec.
>>>>> - Nic
>>>>> http://nicj.net/
>>>>> @NicJ
>>>>> On 11/25/2014 3:04 AM, Yoav Weiss wrote:
>>>>>>
>>>>>> On Tue, Nov 25, 2014 at 1:34 AM, Steve
>>>>>> Souders <steve@souders.org
>>>>>> <mailto:steve@souders.org>> wrote:
>>>>>>
>>>>>> SHORT: I propose we add the
>>>>>> "networkDuration" property to
>>>>>> PerformanceEntry
>>>>>> <http://www.w3.org/TR/performance-timeline/#performanceentry>
>>>>>> objects.
>>>>>>
>>>>>> LONG: A few weeks ago I discovered that
>>>>>> "duration" includes blocking time, so
>>>>>> "duration" is greater than the actual
>>>>>> network time needed to download the
>>>>>> resource. Since then I've been at
>>>>>> Velocity and WebPerfDays where many
>>>>>> people have shown their Resource Timing
>>>>>> code. Everyone I spoke to (~5 different
>>>>>> teams) assumed that "duration" was just
>>>>>> the network time. When I explained that it
>>>>>> also includes blocking they were
>>>>>> surprised, admitted they hadn't known
>>>>>> that, and agreed it is NOT the metric
>>>>>> they were trying to capture.
>>>>>>
>>>>>> I propose we add a new property to
>>>>>> Resource Timing that reflects the time to
>>>>>> actually load the resource excluding
>>>>>> blocking time. I'm flexible about the
>>>>>> name but for purposes of this discussion
>>>>>> let's call it "networkDuration". The
>>>>>> important piece of this proposal is that
>>>>>> "networkDuration" should be available for
>>>>>> all resources, similar to "duration". In
>>>>>> other words, it should be available for
>>>>>> same origin as well as cross origin
>>>>>> resources as part of the PerformanceEntry
>>>>>> <http://www.w3.org/TR/performance-timeline/#performanceentry>
>>>>>> interface.
>>>>>>
>>>>>> Same origin resources can calculate
>>>>>> "networkDuration" as follows (assume "r"
>>>>>> is a PerformanceResourceTiming
>>>>>> object):
>>>>>>
>>>>>> dns = r.domainLookupEnd - r.domainLookupStart;
>>>>>> tcp = r.connectEnd - r.connectStart; // includes ssl negotiation
>>>>>> waiting = r.responseStart - r.requestStart; // aka "TTFB"
>>>>>> content = r.responseEnd - r.responseStart;
>>>>>> networkDuration = dns + tcp + waiting + content;
>>>>>>
>>>>>> I've discussed this with a few people and
>>>>>> the only concern I've heard is with
>>>>>> regard to privacy along the lines of "if
>>>>>> we exclude blocking we've added the
>>>>>> ability to distinguish cache reads from
>>>>>> network fetches". This isn't an issue for
>>>>>> two reasons:
>>>>>>
>>>>>> 1. Even with the exclusion of blocking
>>>>>> time, it's still possible for
>>>>>> "networkDuration" to have a non-zero
>>>>>> value for resources read from cache
>>>>>> due to disk access time, etc.
>>>>>> Therefore, excluding blocking time
>>>>>> does not necessarily provide a clear
>>>>>> means of determining resources read
>>>>>> from cache.
>>>>>> 2. This concern assumes that adding
>>>>>> "networkDuration" lessens privacy
>>>>>> because removing blocking time
>>>>>> provides additional information that
>>>>>> is not available today. However, it's
>>>>>> possible to exclude blocking time
>>>>>> today by loading a cross-origin
>>>>>> resource after window.onload, when
>>>>>> there is no blocking contention.
>>>>>>
>>>>>> Therefore, individuals who have
>>>>>> JavaScript access to a page and can
>>>>>> measure duration also have enough access
>>>>>> to load resources after window.onload and
>>>>>> can thus determine the duration excluding
>>>>>> blocking time. Adding "networkDuration"
>>>>>> does not give these individuals
>>>>>> additional information beyond what is
>>>>>> measurable today.
>>>>>>
>>>>>> What "networkDuration" provides is
>>>>>> additional information for the normal
>>>>>> case of resources that are loaded as part
>>>>>> of the main page when blocking contention
>>>>>> may occur. This will give current web
>>>>>> developers the metric they want for
>>>>>> cross-origin resources, and will provide
>>>>>> it more simply for same origin resources.
>>>>>>
>>>>>>
>>>>>> Assuming that the privacy concerns are in
>>>>>> fact non-existent, a big +1.
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
Received on Tuesday, 23 December 2014 20:25:51 UTC