Re: ResourceTiming API & byte-size from Yoav Weiss on 2013-02-01 (public-web-perf@w3.org from February 2013)

From: Yoav Weiss <yoav@yoav.ws>
Date: Fri, 1 Feb 2013 09:56:45 +0100
To: James Simonsen <simonjam@chromium.org>
Cc: mtomlins@westevergreen.com, Arvind Jain <arvind@google.com>, Christian Biesinger <cbiesinger@gmail.com>, Jonas Sicking <jonas@sicking.cc>, public-web-perf <public-web-perf@w3.org>
Message-ID: <CACj=BEhz7rg4obJYv1q=jvksEmEZGChKhmhOwY_Go4qChpMOyA@mail.gmail.com>

Hi James,

You are assuming that the bytesize data is necessarily something that we'd
want to send from client to server.
The use cases I have for the addition of byte size are for using it on the
client:
* Client side waterfall charts & HAR files - as demonstrated at [1]. A
real-life use case is client-side detection of compression-related
performance issues.
* Client side heuristic bandwidth measurements - measuring bandwidth in the
client is a hard problem [2]. Adding bytesize info to the Resource Timing
API will enable the Web developer community to participate in trying to
resolve that problem. In this case, the information is needed on the
client, no need to send it to the server. Leaving byte size out of the
Resource Timing API may result in developers *downloading* a map of
resource-bytesize in order to perform bandwidth estimations.
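
To make the two use cases above concrete, here is a minimal TypeScript
sketch. It assumes each entry carries hypothetical `byteSize` and
`compressedByteSize` fields next to the existing Resource Timing
timestamps; neither field exists in the API today, and both names are
only illustrative. The throughput figure is a rough heuristic, not a real
bandwidth measurement.

```typescript
// Hypothetical entry shape: existing Resource Timing timestamps plus the
// byte-size fields proposed in this thread (assumed names, not a real API).
interface SizedEntry {
  name: string;               // resource URL
  responseStart: number;      // ms, as in Resource Timing
  responseEnd: number;        // ms, as in Resource Timing
  byteSize: number;           // assumption: decoded size in bytes
  compressedByteSize: number; // assumption: bytes received on the wire
}

// Rough per-resource throughput in kbit/s (bits per millisecond).
// Ignores slow-start, contention and packet loss.
function estimateKbps(entry: SizedEntry): number | null {
  const ms = entry.responseEnd - entry.responseStart;
  if (ms <= 0 || entry.compressedByteSize <= 0) return null;
  return (entry.compressedByteSize * 8) / ms;
}

// Median of the per-resource estimates, to dampen outliers such as
// tiny resources that finish within a single round trip.
function medianKbps(entries: SizedEntry[]): number | null {
  const samples = entries
    .map(estimateKbps)
    .filter((v): v is number => v !== null)
    .sort((a, b) => a - b);
  if (samples.length === 0) return null;
  const mid = Math.floor(samples.length / 2);
  return samples.length % 2 === 1
    ? samples[mid]
    : (samples[mid - 1] + samples[mid]) / 2;
}

// Flags text resources whose wire size is not smaller than their decoded
// size, which suggests they were served without compression.
function looksUncompressed(entry: SizedEntry): boolean {
  const isText = /\.(js|css|html)(\?|$)/.test(entry.name);
  return isText && entry.compressedByteSize >= entry.byteSize;
}
```

Both functions run entirely on the client, so nothing has to be reported
back to the server.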

Also, I'd like to add that the server is not always aware of intermediate
proxies that may further compress its resources, especially on mobile
networks (disclosure: I spend most of my time working on such an
intermediate proxy). Therefore, the byte size information that the server
has, even if it aggregates information from lower layers and CDNs, is not
always accurate.

Yoav

[1]
http://calendar.perfplanet.com/2012/an-introduction-to-the-resource-timing-api/
[2] http://lists.w3.org/Archives/Public/public-device-apis/2013Jan/0071.html


On Thu, Jan 31, 2013 at 11:51 PM, James Simonsen <simonjam@chromium.org> wrote:

> I have the same issue with this as with including the protocol, which is
> being discussed on a separate thread. (I've CC'd them here.)
>
> These new signals are useful for debugging periodically, but not worth
> burdening all of the hundreds of millions of clients performing billions of
> page loads every day. The timing data we put in Resource Timing and
> Navigation Timing are worth the cost, because they provide useful aggregate
> data that vary with the user population and collecting them in the client
> is the only way. Protocol and byte size don't meet these standards.
>
> Also, relaying information from your CDN, through your thousands/millions
> of users, back to your server is wasteful. Many users are on metered
> connections, especially mobile. It's better to get the info directly from
> the CDN. Certainly the CDN can tell you how many bytes it sent and which
> protocol was used.
>
> James
>
>
> On Thu, Jan 31, 2013 at 10:44 AM, Yoav Weiss <yoav@yoav.ws> wrote:
>
>> Thanks Mark!
>>
>> I just want to point out that Christian's point is valid for cross-origin
>> resources, not same-origin resources. Therefore, as far as I'm concerned,
>> we're discussing adding "bytesize"/"compressedBytesize" attributes for
>> same-origin resources, not cross-origin ones.
>>
>> Also, to stress my point even further, another case in which the server
>> is not aware of resource sizes is when these resources (usually images) are
>> optimized in the CDN. While CDNs will be considered cross-origin in most
>> (all?) cases, it would be possible to add the CDN hosts as part of the
>> "Timing-Allow-Origin" header value.
>>
>> Yoav
>>
>>
>>
>> On Thu, Jan 31, 2013 at 7:27 PM, Mark Tomlinson <
>> mtomlins@westevergreen.com> wrote:
>>
>>> Just to chime-in (been lurking a while)...
>>>
>>> I think things have changed over time...Yoav has a good point in reply
>>> to James.  A few years ago it was "easy" to determine and measure the size
>>> for a resource request from the server side of the equation.  When I
>>> state "easy" I mean:
>>>      - there was a small group of IT engineers that would be studying
>>> performance...with specialized, focused skill
>>>      - with a small controlled group, the security access to the server
>>> logs or console or admin interface was manageable
>>>      - the identification of a resource was more simplistic, finding the
>>> object or base folder was more static or direct
>>>      - other tools to measure (e.g. "sniff") resource payload in-between
>>> servers or client-server were allowed and
>>>
>>> In current times I am observing:
>>>     - there is an ever-growing population of diverse IT engineers (and
>>> non-engineers) getting involved in performance measurement, analysis and
>>> determination (not a bad thing, but the tools must evolve to help
>>> non-performance folks get it right)
>>>     - exposing network and server information to this larger population
>>> of people is counter to the trend for infosec
>>>     - as such the access to the logs or admin console on the server is
>>> more limited...which is good, but hinders access to size measurements from
>>> the server side
>>>     - in more cases now, access to the server can be physically or
>>> legally denied (e.g. hosted on another platform, which you may never, ever
>>> obtain access - but you are still pressured to measure performance and
>>> "size")
>>>     - more so, the identification of the resource is more complex than
>>> before - dynamic content from multiple sources, combining resources from
>>> different parts of the pipeline or even on the client itself
>>>
>>> Christian is right also that size is often used to determine "is user
>>> logged in to site X" - depending on the resource.  Perhaps that's not a
>>> discussion for this group - about spoofing or obfuscating size on secured
>>> resources?
>>>
>>> Cheers and thanks,
>>>
>>> -mt
>>>
>>>
>>>
>>> On Thu, Jan 31, 2013 at 3:33 AM, Yoav Weiss <yoav@yoav.ws> wrote:
>>>
>>>> (Sorry for double posting, premature "send" :) )
>>>>
>>>> In yesterday's meeting I saw that byte-size was mentioned, and James is
>>>> opposed to adding that value:
>>>> "I feel that the interface should only include novel information that
>>>> isn't easily available today. The server serving the resources already
>>>> knows the size of the images."
>>>>
>>>> Resource size is often *not* easily available to the server's logic
>>>> (e.g. mod_pagespeed automatically optimized images, and as far as the
>>>> server logic is concerned, it is a lower layer. Same for gzipped JS/CSS.)
>>>> Resource size information would enable:
>>>> * Creating full-fledged waterfall charts in the browser
>>>> * Possibly estimating bandwidth [1] in JS, which can bring in many more
>>>> people to solve the bandwidth estimation problem.
>>>>
>>>> I agree that at least in some cases these applications can be done by
>>>> having the server send over a manifest that includes a resource size map,
>>>> but that is not always the case, and would require an extra download, while
>>>> the browser already has that information.
>>>>
>>>> Since we can agree that there's no security risk in adding that
>>>> attribute for same-origin hosts, I don't see why the fact that this
>>>> information may be available elsewhere prevents it from being added to the
>>>> Resource Timing API.
>>>>
>>>> Thanks,
>>>> Yoav
>>>>
>>>> [1]
>>>> http://lists.w3.org/Archives/Public/public-device-apis/2013Jan/0086.html
>>>>
>>>>
>>>> On Thu, Jan 31, 2013 at 9:27 AM, Yoav Weiss <yoav@yoav.ws> wrote:
>>>>
>>>>> In yesterday's meeting I saw that byte-size was mentioned, and James
>>>>> is opposed to adding that value:
>>>>> "I feel that the interface should only include novel information that
>>>>> isn't easily available today. The server serving the resources already
>>>>> knows the size of the images."
>>>>>
>>>>> Image size is often *not* easily available to the server's logic (e.g.
>>>>> mod_pagespeed automatically optimized images, and as far as the server
>>>>> logic is concerned, it is a lower layer)
>>>>> Image size information would enable:
>>>>> * Creating full-fledged waterfall charts in the browser
>>>>> * Possibly estimating bandwidth [1] in JS, which can bring in many
>>>>> more people to solve the bandwidth estimation problem.
>>>>>
>>>>> I agree that at least in some cases these applications can be done by
>>>>> having the server send over a manifest that includes a resource size map,
>>>>> but that is not always the case, a
>>>>>
>>>>>
>>>>> [1]
>>>>> http://lists.w3.org/Archives/Public/public-device-apis/2013Jan/0086.html
>>>>>
>>>>>
>>>>> On Wed, Jan 30, 2013 at 2:11 AM, Arvind Jain <arvind@google.com> wrote:
>>>>>
>>>>>> I looked through the archives for previous discussion on this, and I
>>>>>> found only one thread:
>>>>>> http://lists.w3.org/Archives/Public/public-web-perf/2011Jul/0059.html
>>>>>>
>>>>>> I suppose we could add the size field in next revision. We'll keep
>>>>>> track of this for Resource Timing v2.
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 29, 2013 at 3:49 PM, Christian Biesinger <
>>>>>> cbiesinger@gmail.com> wrote:
>>>>>>
>>>>>>> Nvm, missed the "same origin" part of the question.
>>>>>>>
>>>>>>> -christian
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jan 29, 2013 at 3:48 PM, Christian Biesinger <
>>>>>>> cbiesinger@gmail.com> wrote:
>>>>>>>
>>>>>>>> Why do you say this poses no security risk? Byte size can reveal
>>>>>>>> things like "is user logged in to site X", which is not something we should
>>>>>>>> expose.
>>>>>>>>
>>>>>>>> -christian
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jan 29, 2013 at 3:39 PM, Yoav Weiss <yoav@yoav.ws> wrote:
>>>>>>>>
>>>>>>>>> Since it poses no security risk and can be extremely useful, can
>>>>>>>>> byte size information be added to same origin resources in a future version
>>>>>>>>> of the Resource Timing API?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Yoav
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Jan 3, 2013 at 12:23 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>>>>>>>>
>>>>>>>>>> On Wed, Jan 2, 2013 at 6:09 AM, Yoav Weiss <yoav@yoav.ws> wrote:
>>>>>>>>>> >
>>>>>>>>>> > I'm wondering if there's any reason the resource's byte-size &
>>>>>>>>>> compressed
>>>>>>>>>> > byte-size were not added to the ResourceTiming API.
>>>>>>>>>> >
>>>>>>>>>> > The use cases I see for adding this information to the API
>>>>>>>>>> would be:
>>>>>>>>>> > 1. Enabling Web apps access to the full information required to
>>>>>>>>>> create
>>>>>>>>>> > complete waterfall charts or HAR files that are equivalent to
>>>>>>>>>> the browser's
>>>>>>>>>> > own Web Inspector/developer tools. Current demos creating a
>>>>>>>>>> waterfall chart
>>>>>>>>>> > [1] or HAR files [2] either ignore the file size altogether, or
>>>>>>>>>> use the IE
>>>>>>>>>> > only property of "fileSize" on image resources.
>>>>>>>>>> > 2. Detecting compression issues using RUM scripts (text
>>>>>>>>>> resources that
>>>>>>>>>> > were not GZIPed, automated image compression regressions).
>>>>>>>>>> > 3. Enabling Web applications to get a notion of the average
>>>>>>>>>> download
>>>>>>>>>> > bandwidth each resource had. Even though such a measurement may
>>>>>>>>>> not be
>>>>>>>>>> > accurate (because of slow-start, contention, packet losses,
>>>>>>>>>> different hosts
>>>>>>>>>> > per resource, etc), it may be useful information either for RUM
>>>>>>>>>> or for
>>>>>>>>>> > progressive enhancement purposes.
>>>>>>>>>>
>>>>>>>>>> The byte size and the compressed byte size can't be exposed for
>>>>>>>>>> cross-origin loads. Other than that I don't see any security
>>>>>>>>>> issues
>>>>>>>>>> with this.
>>>>>>>>>>
>>>>>>>>>> / Jonas
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Mark Tomlinson | Independent Performance Consultant | +1.215-520-5450 |
>>> mtomlins@westevergreen.com  | mtomlins.blogspot.com | @mtomlins<http://twitter.com/mtomlins>
>>>
>>
>>
>
Received on Friday, 1 February 2013 08:57:13 UTC