Re: ResourceTiming API & byte-size

I agree that byte size is important information which in some cases is not available on the server. The following scenarios come to my mind:

  *   CDNs
  *   Third Parties
  *   Content delivered via mobile providers, which sometimes modify the actual content
  *   Content modified by WAN optimization solutions

When we talk about size, we have to distinguish between transport size and actual size. Ultimately, these are two different metrics, and I think you want to see both. The main concern I see with exposing this information is privacy. We have already had several discussions on privacy issues around Resource Timing. If it does not become part of Resource Timing, this information could become part of the new diagnostics interface we are planning to invest in.
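
To make the distinction between the two metrics concrete, here is a minimal sketch. It is not part of any draft discussed in this thread: the `SizedResource` shape and its `transportBytes` / `actualBytes` fields are hypothetical names for transport size and actual size, and the 0.9 threshold is an arbitrary assumption.

```typescript
// Hypothetical record: neither field name comes from the Resource Timing draft.
interface SizedResource {
  name: string;
  transportBytes: number; // bytes that crossed the network (after compression)
  actualBytes: number;    // bytes of the resource once decoded
}

// Ratio of transport size to actual size; 1 means nothing was saved on the wire.
function transportRatio(r: SizedResource): number {
  if (r.actualBytes <= 0) return 1;
  return r.transportBytes / r.actualBytes;
}

// Flag resources whose transport size is nearly as large as the actual size.
// The default threshold is an assumption chosen for illustration only.
function looksUncompressed(r: SizedResource, threshold = 0.9): boolean {
  return transportRatio(r) >= threshold;
}

const samples: SizedResource[] = [
  { name: "app.js", transportBytes: 30000, actualBytes: 120000 },
  { name: "legacy.css", transportBytes: 48000, actualBytes: 48000 },
];

// Names of the resources that appear to have been sent without compression.
const flagged = samples.filter((r) => looksUncompressed(r)).map((r) => r.name);
console.log(flagged);
```

With both values available, a script could spot a text resource that was delivered uncompressed, which neither metric reveals on its own.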

Bandwidth estimation is – conceptually – another problem. As you have pointed out, most bandwidth measurement today is done by downloading pictures of increasing size and then extrapolating the user's bandwidth. This is a true performance issue, especially when doing this over a mobile connection. I am however not sure whether resource size would really be helpful here. It will make things easier but if this is the main purpose there must be better ways to get this information.

// Alois


From: Yoav Weiss <yoav@yoav.ws>
Date: Friday, February 1, 2013 9:56 AM
To: James Simonsen <simonjam@chromium.org>
Cc: Mark Tomlinson <mtomlins@westevergreen.com>, Arvind Jain <arvind@google.com>, Christian Biesinger <cbiesinger@gmail.com>, Jonas Sicking <jonas@sicking.cc>, "public-web-perf@w3.org" <public-web-perf@w3.org>
Subject: [dynatrace.com] Re: ResourceTiming API & byte-size
Resent-From: Alois Reitbauer <alois.reitbauer@dynatrace.com>, "public-web-perf@w3.org" <public-web-perf@w3.org>
Resent-Date: Friday, February 1, 2013 9:57 AM

Hi James,

You are assuming that the bytesize data is necessarily something that we'd want to send from client to server.
The use cases I have for the addition of byte size are for using it on the client:
* Client side waterfall charts & HAR files - as demonstrated at [1]. A real life use case can be client side detection of compression related performance issues.
* Client side heuristic bandwidth measurements - measuring bandwidth in the client is a hard problem [2]. Adding bytesize info to the Resource Timing API will enable the Web developer community to participate in trying to resolve that problem. In this case, the information is needed on the client, no need to send it to the server. Leaving byte size out of the Resource Timing API may result in developers *downloading* a map of resource-bytesize in order to perform bandwidth estimations.
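
As a rough illustration of the client-side heuristic described above, here is a minimal sketch. It assumes a per-resource byte size were exposed: the `TransferSample` shape and its `bytes` field are hypothetical, while the two timestamps mirror the existing `responseStart` and `responseEnd` attributes.

```typescript
// Hypothetical sample: `bytes` stands for the byte size proposed in this thread;
// the two timestamps mirror responseStart/responseEnd, in milliseconds.
interface TransferSample {
  bytes: number;
  responseStart: number;
  responseEnd: number;
}

// Rough throughput in kilobits per second across all samples.
// Ignores slow-start, contention, and overlapping downloads, so it is only a heuristic.
function estimateKbps(samples: TransferSample[]): number | null {
  let totalBytes = 0;
  let totalMs = 0;
  for (const s of samples) {
    const ms = s.responseEnd - s.responseStart;
    if (ms <= 0 || s.bytes <= 0) continue; // skip cached or unusable entries
    totalBytes += s.bytes;
    totalMs += ms;
  }
  if (totalMs === 0) return null; // nothing usable to measure
  return (totalBytes * 8) / totalMs; // bits per millisecond equals kilobits per second
}

const estimate = estimateKbps([
  { bytes: 125000, responseStart: 100, responseEnd: 1100 },
  { bytes: 4000, responseStart: 500, responseEnd: 500 }, // zero duration, skipped
]);
console.log(estimate);
```

Everything here runs on the client from data the browser already holds, so no size map has to be downloaded and nothing has to be reported back to the server.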

Also, I'd like to add that the server is not always aware of intermediate proxies that may further compress its resources, especially on mobile networks (disclosure: I spend most of my time working on such an intermediate proxy). Therefore, the byte size information that the server has, even if it aggregates information from lower layers and CDNs, is not always accurate.

Yoav

[1] http://calendar.perfplanet.com/2012/an-introduction-to-the-resource-timing-api/
[2] http://lists.w3.org/Archives/Public/public-device-apis/2013Jan/0071.html


On Thu, Jan 31, 2013 at 11:51 PM, James Simonsen <simonjam@chromium.org> wrote:
I have the same issue with this as with including the protocol, which is being discussed on a separate thread. (I've CC'd them here.)

These new signals are useful for debugging periodically, but not worth burdening all of the hundreds of millions of clients performing billions of page loads every day. The timing data we put in Resource Timing and Navigation Timing are worth the cost, because they provide useful aggregate data that vary with the user population and collecting them in the client is the only way. Protocol and byte size don't meet these standards.

Also, relaying information from your CDN, through your thousands/millions of users, back to your server is wasteful. Many users are on metered connections, especially mobile. It's better to get the info directly from the CDN. Certainly the CDN can tell you how many bytes it sent and which protocol was used.

James


On Thu, Jan 31, 2013 at 10:44 AM, Yoav Weiss <yoav@yoav.ws> wrote:
Thanks Mark!

I just want to point out that Christian's point is valid for cross-origin resources, not same-origin resources. Therefore, as far as I'm concerned, we're discussing adding "bytesize"/"compressedBytesize" attributes for same-origin resources, not cross-origin ones.

Also, to stress my point even further, another case in which the server is not aware of resource sizes is when these resources (usually images) are optimized in the CDN. While CDNs will be considered cross-origin in most (all?) cases, it would be possible to add the CDN hosts as part of the "Timing-Allow-Origin" header value.
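
The opt-in idea can be sketched as follows. This is a simplified illustration, not the specification's algorithm: it assumes the CDN response carries a "Timing-Allow-Origin" header whose value is either "*" or a comma-separated list of origins, and the `timingAllowed` helper name is hypothetical.

```typescript
// Returns true when `pageOrigin` is permitted by a Timing-Allow-Origin header value.
// Simplified sketch: the value is treated as "*" or a comma-separated origin list.
function timingAllowed(headerValue: string | null, pageOrigin: string): boolean {
  if (headerValue === null) return false; // no header, no opt-in
  const entries = headerValue.split(",").map((v) => v.trim());
  return entries.some((v) => v === "*" || v === pageOrigin);
}

// Size attributes for a CDN-hosted resource would only be exposed on opt-in.
const page = "https://www.example.com";
console.log(timingAllowed("https://www.example.com, https://m.example.com", page));
console.log(timingAllowed(null, page));
```

Under this model a cross-origin resource exposes its size only when its host explicitly allows it, which leaves the same-origin restriction intact by default.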

Yoav



On Thu, Jan 31, 2013 at 7:27 PM, Mark Tomlinson <mtomlins@westevergreen.com> wrote:
Just to chime-in (been lurking a while)...

I think things have changed over time... Yoav has a good point in reply to James.  A few years ago it was "easy" to determine and measure the size of a resource request from the server side of the equation.  When I state "easy" I mean:
     - there was a small group of IT engineers that would be studying performance...with specialized, focused skill
     - with a small controlled group, the security access to the server logs or console or admin interface was manageable
     - the identification of a resource was more simplistic, finding the object or base folder was more static or direct
     - other tools to measure (e.g. "sniff") resource payload in-between servers or client-server were allowed and

In current times I am observing:
    - there is an ever-growing population of diverse IT engineers (and non-engineers) getting involved in performance measurement, analysis and determination (not a bad thing, but the tools must evolve to help non-performance folks get it right)
    - exposing network and server information to this larger population of people is counter to the trend for infosec
    - as such the access to the logs or admin console on the server is more limited...which is good, but hinders access to size measurements from the server side
    - in more cases now, access to the server can be physically or legally denied (e.g. hosted on another platform, to which you may never, ever obtain access - but you are still pressured to measure performance and "size")
    - more so, the identification of the resource is more complex than before - dynamic content from multiple sources, combining resources from different parts of the pipeline or even on the client itself

Christian is right also that size is often used to determine "is user logged in to site X" - depending on the resource.  Perhaps that's not a discussion for this group - about spoofing or obfuscating size on secured resources?

Cheers and thanks,

-mt



On Thu, Jan 31, 2013 at 3:33 AM, Yoav Weiss <yoav@yoav.ws> wrote:
(Sorry for double posting, premature "send" :) )

In yesterday's meeting I saw that byte-size was mentioned, and James is opposed to adding that value:
"I feel that the interface should only include novel information that isn't easily available today. The server serving the resources already knows the size of the images."

Resource size is often *not* easily available to the server's logic (e.g. mod_pagespeed automatically optimizes images, and as far as the server logic is concerned, it is a lower layer. The same goes for gzipped JS/CSS.)
Resource size information would enable:
* Creating full fledged waterfall charts in the browser
* Possibly estimating bandwidth [1] in JS, which can bring in many more people to solve the bandwidth estimation problem.

I agree that at least in some cases these applications can be done by having the server send over a manifest that includes a resource size map, but that is not always the case, and would require an extra download, while the browser already has that information.

Since we can agree that there's no security risk in adding that attribute for same-origin hosts, I don't see why the fact that this information may be available elsewhere prevents it from being added to the Resource Timing API.

Thanks,
Yoav

[1] http://lists.w3.org/Archives/Public/public-device-apis/2013Jan/0086.html


On Thu, Jan 31, 2013 at 9:27 AM, Yoav Weiss <yoav@yoav.ws> wrote:
In yesterday's meeting I saw that byte-size was mentioned, and James is opposed to adding that value:
"I feel that the interface should only include novel information that isn't easily available today. The server serving the resources already knows the size of the images."

Image size is often *not* easily available to the server's logic (e.g. mod_pagespeed automatically optimized images, and as far as the server logic is concerned, it is a lower layer)
Image size information would enable:
* Creating full fledged waterfall charts in the browser
* Possibly estimating bandwidth [1] in JS, which can bring in many more people to solve the bandwidth estimation problem.

I agree that at least in some cases these applications can be done by having the server send over a manifest that includes a resource size map, but that is not always the case, a


[1] http://lists.w3.org/Archives/Public/public-device-apis/2013Jan/0086.html


On Wed, Jan 30, 2013 at 2:11 AM, Arvind Jain <arvind@google.com> wrote:
I looked through the archives for previous discussion on this, and I found only one thread:
http://lists.w3.org/Archives/Public/public-web-perf/2011Jul/0059.html

I suppose we could add the size field in the next revision. We'll keep track of this for Resource Timing v2.


On Tue, Jan 29, 2013 at 3:49 PM, Christian Biesinger <cbiesinger@gmail.com> wrote:
Nvm, missed the "same origin" part of the question.

-christian


On Tue, Jan 29, 2013 at 3:48 PM, Christian Biesinger <cbiesinger@gmail.com> wrote:
Why do you say this poses no security risk? Byte size can reveal things like "is user logged in to site X", which is not something we should expose.

-christian


On Tue, Jan 29, 2013 at 3:39 PM, Yoav Weiss <yoav@yoav.ws> wrote:
Since it poses no security risk and can be extremely useful, can byte size information be added to same origin resources in a future version of the Resource Timing API?

Thanks,
Yoav


On Thu, Jan 3, 2013 at 12:23 AM, Jonas Sicking <jonas@sicking.cc> wrote:
On Wed, Jan 2, 2013 at 6:09 AM, Yoav Weiss <yoav@yoav.ws> wrote:
>
> I'm wondering if there's any reason the resource's byte-size & compressed
> byte-size were not added to the ResourceTiming API.
>
> The use cases I see for adding this information to the API would be:
> 1. Enabling Web apps access to the full information required to create
> complete waterfall charts or HAR files that are equivalent to the browser's
> own Web Inspector/developer tools. Current demos creating a waterfall chart
> [1] or HAR files [2] either ignore the file size altogether, or use the IE
> only property of "fileSize" on image resources.
> 2. Detecting compression issues using RUM scripts (text resources that
> were not GZIPed, automated image compression regressions).
> 3. Enabling Web applications to get a notion of the average download
> bandwidth each resource had. Even though such a measurement may not be
> accurate (because of slow-start, contention, packet losses, different hosts
> per resource, etc), it may be useful information either for RUM or for
> progressive enhancement purposes.

The byte size and the compressed byte size can't be exposed for
cross-origin loads. Other than that I don't see any security issues
with this.

/ Jonas









--
Mark Tomlinson | Independent Performance Consultant | +1.215-520-5450 | mtomlins@westevergreen.com | mtomlins.blogspot.com | @mtomlins




Received on Monday, 4 February 2013 10:34:02 UTC