- From: Kyle Simpson <getify@gmail.com>
- Date: Wed, 23 Mar 2011 12:59:04 -0500
- To: "Nic Jansma" <Nic.Jansma@microsoft.com>, <public-web-perf@w3.org>
> 1) Resources that are already in the browser's disk cache, for example,
> from loading the page yesterday: *would* be included in the RT arrays.
> Examples id="4" and id="5" below show this.

Agreed, these definitely need to be included. Will there be some flag that indicates where the resource came from ("cache", "network", etc.)? I think there definitely should be.

> 2) Resources that are referred to multiple times in the same page ("1a",
> "1b", "1c"): Our current thoughts are that these resources *should not*
> be included, as my understanding is that all current browsers optimize
> this case and do not initiate network requests for duplicate resource URIs
> (eg, for "1b" and "1c", the browser wouldn't get 1.jpg again).
> Additionally, these types of resources are not shown in browsers' Net
> panels, or from a network sniffer.

Actually, I think this assumption is not entirely correct. I have a JavaScript loader called LABjs which, in some browsers, uses a hacky "cache preloading" technique: it requests a script in a way that is guaranteed to download it but *not* execute it (either by using a fake mime-type, or by using an <object> or Image container). Then, when appropriate, a second, proper script element request is made for the same URL, on the assumption of course that the first request successfully cached it. Because this second request comes from a proper container/type, it then executes. (A rough sketch of this pattern appears below.)

The point is that in that scenario, in almost all browsers (IE9, Firefox, Chrome, etc.), I see *both* requests logged. That's because the browser still has to pull the second request from the browser cache.

So the fine distinction you must be making is whether the second "duplicate" request happened to overlap with the first one while it was still loading. If they overlap, I guess you're saying that browsers don't make a second/duplicate request. But if the two don't overlap, as in the "cache preload" technique I describe, then the browser and tools clearly do indicate a second load, even though it's "on the same page".

So, from a consistency standpoint, since there's essentially a race condition as to when the second request gets initiated, I think it would be a bad idea not to include *all* resource requests: sometimes the list would have those "duplicates" and sometimes it wouldn't, which will look like non-deterministic, race-condition-y behavior to most people who don't know all the finer details.

Also, even if the browser does have short-circuit logic to avoid making the second request, AFAIK it still must make some base assumptions about the contents of the resource, which can affect the browser's timing behavior for other actions. For instance, if I have 3 script elements spread across my DOM, all asking for the same script resource, the browser has to assume that the resource may contain a `document.write()`, in which case it has to block everything else in the DOM/page from rendering, for *each* script element, until it runs that script. So, even though only one actual "request" may have gone beyond the HTML parser layer, the presence of 3 requesting containers is valuable information that in fact affects timings.

> I could see including them for other reasons -- completeness of listing
> all of the resources, whether or not they were retrieved from the network.
> But to me, it seems like listing every resource on the page would include
> a lot of redundant data, many of them without network latencies.
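For concreteness, here is a rough sketch of the cache-preload pattern I described above. This is my own simplified illustration, not actual LABjs source; the URL, the delay, and the fake mime-type value are placeholders, and some browsers need the <object>/Image variant instead of the fake-type trick.

    function preloadScript(url) {
      // Request the script with a bogus type so the browser downloads it
      // (into cache) but does not execute it. Browser behavior varies;
      // this is just the simplest form of the trick.
      var s = document.createElement("script");
      s.type = "text/cache"; // fake mime-type: fetch, don't run
      s.src = url;
      document.getElementsByTagName("head")[0].appendChild(s);
    }

    function executeScript(url) {
      // Later, a proper script element for the same URL. Assuming the
      // first response was cacheable, this one is served from the browser
      // cache and then executed -- and both requests show up in the
      // network logs.
      var s = document.createElement("script");
      s.src = url;
      document.getElementsByTagName("head")[0].appendChild(s);
    }

    preloadScript("http://example.com/lib.js"); // request #1: download only
    setTimeout(function () {
      executeScript("http://example.com/lib.js"); // request #2: cache hit + execute
    }, 5000);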
Perhaps the reason for my slightly different thinking on this is that I'm assuming "Resource Timing" means more than just "network request layer timing" -- that is, the overall, end-to-end timing from requesting a resource through to when that request is fulfilled. In that broader definition, even duplicate requests "cost" some time, and should therefore be logged/accounted for in some respect.

> I would agree with you that the HTTP status code of the resource should
> not exclude it from the RT array. 404/500/etc should all be included. If
> the browser "initiates" a request, whether or not it was completed, we
> should include it in the array.

Yes, and furthermore, I think (mostly for data filtering purposes) it would be important to have the status code actually in the data structure -- for instance, so a tool analyzing this data can filter out all the 404s, etc. The same goes, as mentioned above, for an indicator of where the resource came from ("network", "cache", "cache-revalidated", "cache-duplicate", etc.). (See the sketch appended below.)

--Kyle
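For illustration only, here is the kind of filtering a tool might do. The `status` and `source` fields are hypothetical -- they are what I'm arguing for, not anything in the current draft -- and `resourceEntries` simply stands in for whatever the RT array ends up being called.

    // Keep only entries that actually hit the network and came back
    // successfully, so a tool can focus on real network latencies.
    function successfulNetworkFetches(resourceEntries) {
      var out = [];
      for (var i = 0; i < resourceEntries.length; i++) {
        var entry = resourceEntries[i];
        if (entry.source === "network" && entry.status < 400) {
          out.push(entry);
        }
      }
      return out;
    }

    // Or the other way around: pull out just the 404s/500s for an
    // error report.
    function failedFetches(resourceEntries) {
      var out = [];
      for (var i = 0; i < resourceEntries.length; i++) {
        if (resourceEntries[i].status >= 400) {
          out.push(resourceEntries[i]);
        }
      }
      return out;
    }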
Received on Wednesday, 23 March 2011 17:59:45 UTC