Re: WebPerfWG call - March 4th @ 10am PST from Yoav Weiss on 2021-03-09 (public-web-perf@w3.org from March 2021)

From: Yoav Weiss <yoav@yoav.ws>
Date: Tue, 9 Mar 2021 13:01:46 +0100
To: public-web-perf <public-web-perf@w3.org>
Message-ID: <CACj=BEhagYCRiej_Ji2VVfNjgpneQ=D=OCRAQizhi-xFT_1DnQ@mail.gmail.com>
Minutes
<https://w3c.github.io/web-performance/meetings/2021/2021-03-04/index.html>
and recording <https://www.youtube.com/watch?v=IogUBQXK7KA> from last
week's meeting are now available.

Copying them here for convenience.

WebPerfWG call - March 4th 2021
Participants

Nic Jansma, Yoav Weiss, Noam Helfman, Giacomo Zecchini, Noam Rosenthal,
Patrick Meenan, Michal Mocny, Peter Perlepes, Michelle Vu, Sean Feng,
Gilles Dubuc, Nicolás Peña Moreno, Benjamin De Kosnik, Annie Sullivan
Next Meeting

March 18 @ 10am PST / 1pm EST
Topics
Resource Timing Fetch integration - Noam Rosenthal

   - *Noam R*: Up until now, ResourceTiming has specified at which points
   during the network request and response the measurements take place
   - … Creates parallel algorithm to what already exists in Fetch
   - … What we’re trying to do is define in Fetch when metrics should be
   taken
   - … In some cases this might be vague or open to interpretation
   - … Trying to make this more precise
   - … Also some timestamps in HTTP (TLS, DNS, etc)
   - … Having Fetch gather the metrics and pass to ResourceTiming to
   normalize before it’s exposed to web developers
   - *Yoav*: A lot of the processing model and the somewhat conflicting
   “getting” algorithms will move to a structure in Fetch that contains that
   information, and will report that info to the ResourceTiming spec
   - … Design that’s much closer to how implementations currently do that
   (at least for Chromium and WebKit)
   - … Good way to make spec closer to what implementations do
   - … In the process, there’s scrutiny around what they’re exposing.
   Hoping to find ourselves with all the attributes well-defined regarding
   what they expose.
   - *Nic*: NavigationTiming is built on top of Resource Timing. Are we
   planning to also tackle that? Or is that phase 2?
   - *Noam R*: That will be in Phase 2. We’re starting with Resource
   Timing. A lot of the infrastructure would be similar.
   - *Nic*: Is the goal to try to maintain compatibility to how we report
   timing today? Will there be browsers closer to the Fetch-based version than
   others?
   - *Noam R*: I think it will be somewhere in the middle.  We want the
   timestamps to be well defined, to avoid interoperability issues, and we
   want WPTs to reflect that.
   - … If there are parts in implementation that can be reverse-engineered
   to be well-defined, that would be pretty close. In other cases, it might be
   better to change implementations.
   - *Yoav*: We can definitely not break content, but in terms of timing
   there are MAYs or things open to interpretation, that we’d want to clamp
   down.
   - *Nic*: There’s already divergence in implementations. If there’s
   anything we need to redefine, I just want us to be transparent about that,
   e.g. for older browsers vs. future browsers.
   - *Noam R*: A lot of things so far are around Timing Allow Origin, and
   what is exposed.  Or under-defined things like workerStart
   - *Yoav*: workerStart will be discussed soon, as well as whether
   nextHopProtocol should be TAO protected
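
To illustrate the design discussed above, here is a minimal sketch in which a Fetch-side record of raw timestamps is handed to a Resource Timing-side function that normalizes them before exposure. The type names, field names and the `toExposedEntry` helper are hypothetical and only for illustration; they are not taken from either specification.

```typescript
// Hypothetical raw record gathered while fetching (absolute timestamps, ms).
interface RawFetchTiming {
  startTime: number;
  dnsStart: number;
  connectStart: number;
  requestStart: number;
  responseEnd: number;
}

// Hypothetical shape of what would be exposed to web developers.
interface ExposedEntry {
  startTime: number;
  domainLookupStart: number;
  connectStart: number;
  requestStart: number;
  responseEnd: number;
}

// Make timestamps relative to the time origin, and report the detailed
// network timestamps as 0 when the Timing-Allow-Origin check did not pass.
function toExposedEntry(
  raw: RawFetchTiming,
  timeOrigin: number,
  taoPassed: boolean
): ExposedEntry {
  const relative = (t: number): number => t - timeOrigin;
  const gated = (t: number): number => (taoPassed ? relative(t) : 0);
  return {
    startTime: relative(raw.startTime),
    domainLookupStart: gated(raw.dnsStart),
    connectStart: gated(raw.connectStart),
    requestStart: gated(raw.requestStart),
    responseEnd: relative(raw.responseEnd),
  };
}
```

Keeping the gathering step and the exposure step separate is what lets each timestamp be defined once, at the point where it is taken.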

worker start needs to be added to diagram · Issue #128 ·
w3c/navigation-timing
<https://github.com/w3c/navigation-timing/issues/128>
 - slides
<https://docs.google.com/presentation/d/1r3FwT1UTo7lpjZvYe-YV7cNAee8co-qCxIU5SdERalQ/edit?usp=sharing>

   - *Nic*: relevant for both NavTiming and ResourceTiming
   - … Nebulous term capturing when a SW was started
   - … Was meant to measure the costs of starting up a SW
   - … Added to the spec in 2015. Implemented only in Chrome, attribute
   exists elsewhere but always 0
   - … In Chrome it’s always less than or equal to fetchStart
   - … kinda happens ahead of redirects
   - … It’s kinda confusing what it means and what we intend to do with it
   - … It’s missing from the diagram and there’s an open issue on that
   - … I started looking into where it should fit and there are a bunch of
   edge cases where, if it were implemented as specced, it would not be what
   we want
   - … For example, if you have same origin redirects => workerStart will
   always be 0
   - … If we didn’t do that, workerStart would always be for the last
   document
   - … Cross-origin redirects result in time for the cross-origin document
   and not the final document, and that information we want to avoid
   - … Also, what same origin means is different when there are no redirects
   - … There are also some questions on when workerStart should be in the
   diagram - before fetchStart or after it?
   - … fetchStart is after all the redirects but before AppCache
   - … workerStart happens before that, before the redirects - when the
   worker had to start up
   - … But according to the definition, it’s actually the opposite. So some
   definitions indicate that it should be after fetchStart.
   - … General consensus about it happening before fetchStart, but
   inconsistent today.
   - … The original goal: how do I measure how long it took for the sync
   service worker to start up?
   - … The original PR describes that fetchStart - workerStart gives you the
   SW start time (at least in mPulse)
   - … You can’t actually use it that way if there are any redirects, so
   that would also include redirect time
   - … There are also SW scopes, so you could have multiple SW starting if
   you have redirects between different paths on your origin
   - … So there’s a proposal to change this to have a “worker ready”
   timestamp, so you can definitely say how long startup took
   - … Could be confusing if you have multiple scopes
   - … Not using the same words as Resource Timing, so maybe should work
   closer with Noam
   - … There’s also very little test coverage
   - … Definitely want to lock it once we make changes
   - … Today it would be hard for Mozilla or Safari to implement because of
   that
   - … Some of the cases are addressed as well
   - … But still some other cases that are not yet addressed
   - … Would make sense to integrate all that with Noam’s work on Fetch
   integration
   - … Wanted to bring everyone up to speed and see if we need to merge it
   - *Noam R*: Something that came up while you were talking about
   redirects. Everything in Resource Timing has a 1-1 mapping with the
   request. But here you want to count multiple times the worker starts
   - … I wonder if it should be a separate entry. Maybe the worker start is
   not part of the request in a way
   - *Nic*: If all we want to answer is SW start time, maybe we can expose
   that directly.
   - … In RT we collapse redirect reporting. You could have SW start at
   different points in time. So maybe it makes more sense to have a
   workerStartupTotalTime
   - *Noam R*: yeah, otherwise you’re counting time for redirects
   - *Benjamin*: So great to see this effort. Mozilla just started looking
   into Service Worker profiling. We start it after fetchStart. We have
   this thing called FetchEvent, synthesizes response and the beginning of
   AppCache.
   - … Really like the bracketing approach, but there can be different
   things. Taking what Noam says to heart, and wondering if we need a more
   fine-grained approach.
   - *Nic*: Confusion today on where workerStart falls depends on your
   definition of fetchStart.
   - … In some cases it’s the entry point to the Fetch spec, and then
   workerStart is after fetchStart. In Resource Timing, fetchStart is at a
   later point.
   - … Reporting total worker startup time would get around those issues as
   well.
   - *Noam R*: It makes sense that when there are things that are not 1-1
   with requests, we have new entries for them. If we have 3 workers, why not
   have an entry for each one?
   - … Trying to aggregate them into an existing entry makes it complex.
   - *Nic*: Would help with debugging if one of many was slower than
   others, you can figure it out
   - *Noam R*: Can imagine doing this with dedicated workers as well
   - *Nic*: and similarly, same-origin redirects could use the same
   breakdown
   - *Yoav*: We’ve had an open issue about that for same-origin redirects.
   Could be interesting to think about this as a separate problem.
   - … Thought of inviting folks from the Service Workers WG and have a
   joint meeting.
   - <administrative talk about scheduling>
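
As a sketch of the measurement discussed above, the following computes `fetchStart - workerStart` and refuses to report a value when visible redirects would fold redirect time into it. It also sums hypothetical per-worker startup spans, in the spirit of the separate-entry and `workerStartupTotalTime` ideas raised in the discussion. The `TimingLike` and `WorkerStartupSpan` types and both function names are assumptions made for illustration; no such per-worker entries exist in the specs.

```typescript
// Minimal subset of a navigation/resource timing entry used by this sketch.
interface TimingLike {
  workerStart: number;
  fetchStart: number;
  redirectEnd: number;
}

// Returns fetchStart - workerStart, or null when the value is not meaningful:
// workerStart is 0 (nothing reported), or redirects are visible on the entry,
// in which case the delta could also include redirect time.
// Simplification: redirects hidden from the entry cannot be detected here.
function estimateWorkerStartup(entry: TimingLike): number | null {
  if (entry.workerStart === 0) return null;
  if (entry.redirectEnd > 0) return null;
  const delta = entry.fetchStart - entry.workerStart;
  return delta >= 0 ? delta : null;
}

// Hypothetical per-worker record: one span per service worker that started.
interface WorkerStartupSpan {
  scope: string;
  start: number;
  ready: number;
}

// Total startup time across all workers involved in one request.
function totalWorkerStartup(spans: WorkerStartupSpan[]): number {
  let total = 0;
  for (const span of spans) {
    total += span.ready - span.start;
  }
  return total;
}
```

With per-worker spans, a single slow worker among several stays visible instead of being folded into one aggregate number.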

`timeOrigin` is not clamped · Issue #105 · w3c/hr-time
<https://github.com/w3c/hr-time/issues/105>
 & PR#106
<https://github.com/w3c/hr-time/pull/106/>

   - *Yoav*: timeOrigin is not clamped and is exposing the raw time
   - … No particular vulnerability we found with this, but no use-case
   either, so seemed healthier to just clamp it
   - … PR 106
   - … If any objections, please chime in on issue
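
For readers unfamiliar with clamping, here is a minimal sketch of coarsening a timestamp to a fixed resolution. The `coarsen` helper and the resolution values used with it are assumptions for illustration; the actual resolution and any additional mitigation are up to the spec and implementations.

```typescript
// Round a timestamp (in milliseconds) down to a multiple of `resolutionMs`,
// so that the raw, full-precision value is never exposed.
function coarsen(timestampMs: number, resolutionMs: number): number {
  if (!(resolutionMs > 0)) {
    throw new RangeError("resolutionMs must be positive");
  }
  return Math.floor(timestampMs / resolutionMs) * resolutionMs;
}
```

Applying the same coarsening to `timeOrigin` as to other high-resolution timestamps keeps it from being the one value that still carries full precision.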

Make nextHopProtocol TAO protected
<https://github.com/w3c/resource-timing/pull/224>


   - *Yoav*: Better to make it TAO protected
   - … Some issues if we don’t, seems safer to
   - … If objections, please chime in on issue
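
A rough sketch of what TAO protection means for this attribute: the value is only exposed when the Timing-Allow-Origin check passes. The check below is deliberately simplified, both helper names are hypothetical, and returning an empty string for the protected case is an assumption made for this example.

```typescript
// Simplified Timing-Allow-Origin check: passes for same-origin requests,
// for a "*" value, or for an exact origin match in the comma-separated list.
function passesTao(
  taoHeader: string | null,
  requesterOrigin: string,
  resourceOrigin: string
): boolean {
  if (requesterOrigin === resourceOrigin) return true;
  if (taoHeader === null) return false;
  return taoHeader
    .split(",")
    .map((value) => value.trim())
    .some((value) => value === "*" || value === requesterOrigin);
}

// Expose the negotiated protocol only when the check passed.
function exposedNextHopProtocol(actual: string, taoOk: boolean): string {
  return taoOk ? actual : "";
}
```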

Event Timing and Scroll Timing - Nicolás - slides
<https://docs.google.com/presentation/d/1qVdMlqgi9uuyx9imCauzMjLGHQK6TGOIZV_RnlGBKis>

   - *Nicolás*: Wanted to talk about how we’re thinking of improving
   responsiveness metrics
   - … FID brief recap - delta between event timestamp and the time that
   event handlers for it begin running.
   - … Some successes: encourage people to break up their long tasks
   - … Tap delay issue surfaced
   - … A lot of sites have this issue (some sites have fixed using mobile
   viewports)
   - … Chrome is planning on getting rid of the delay for some of the
   existing cases
   - … Want to improve FID: aware of its limitations
   - … Consider more than just first input
   - … Potentially include scroll begin
   - … Evaluate a large chunk of end-to-end latency
   - … For evaluating more than just the first, multiple associated events
   - … Keyboard up/down
   - … Taps or drags with pointer/touch/mouse/[start|end|up|down] + click
   - … For tap interaction, what is the tap latency
   - … For drag, just the drag start/end of the interaction.  Harder to
   think about if they’re moving the content, so just the initial drag.
   - … Scrolls are not necessarily blocked on any event, so they’re a
   little more complicated.  At least on Chrome, you need to have non-passive
   listeners to block scrolling on events.
   - … Measure the initial scroll latency, not the event listeners
   triggered by the user-gesture that caused the scroll
   - … Need to associate events to interactions (keydown-keyup).  Need to
   allow developers to do this if possible.
   - … interactionID so keydown/keyup would have a shared ID
   - … The other problem is there is no way to measure scroll performance.
   Can measure with event handlers but that will introduce latency.
   - … The idea is that scrolling is not blocked on event handlers, so
   impossible to measure this accurately now
   - … One use case is that developers can force scrolling to happen in the
   slow path, and when that happens, they should be made aware of it, and
   that’s likely to be slower
   - … Need a way to measure initial scroll interaction (scroll begin --
   latency between touch move that triggers scroll to the time when the
   browser first reacts to that scroll - paint)
   - … There can also be use cases for subsequent scroll reactions - for long
   scroll operations, subsequent ones are called scroll updates. My initial
   reaction is that they would be better addressed by the Frame Timing API, as
   they represent frame latency.
   - … Two ideas how we could expose this
   - … One - Extend PerformanceEventTiming with new entry, even if it
   doesn’t have an associated event like the other entries do,
   name=”scrollBegin”
   - … Two - a new PerformanceScrollTiming entry, since some of the
   PerformanceEventTiming attributes are not meaningful for this
   (processingEnd, etc)
   - … Happy to hear folks’ ideas on that front
   - … Evaluating end-to-end latency of an interaction
   - … Starts with input event timestamp from user hardware
   - … Then there may be some blocking tasks already in queue to run
   - … Next is processing time of event handlers
   - … A frame may be presented on the screen
   - … After is work kicked off by event handlers (async)
   - … The final frame is presented on the screen
   - … For the current EventTiming, we measure the first yellow star
   (.duration value)
   - … Async work is hard to track
   - … We can improve over FID by looking at next frame presented on screen
   - … In future we’d like to explore looking at what those red tasks
   should be
   - … But ATM mostly thinking of exposing the first yellow star (first
   frame presented on screen), which is currently exposed in Event Timing with
   duration.
   - *Noam H*: Comment on scroll - main idea makes sense because it fits
   nicely with existing frameworks that use event name for tracking
   EventTiming.  OK that it doesn’t map to a specific dom event.
   - … On the other hand it’d be fine to have more elaborate Scroll Timing
   information
   - *Yoav*: Do we see if there are cases where we’d have Scroll specific
   data? Would the data diverge from Events, and have Scroll/Event specific
   attributes? If so, would be better to start with separate entries.
   - *Noam H*: If they diverge we could keep both entries
   - *Yoav*: We could, but having that as a duplicate won’t be ideal.
   - *Nicolás*: May make sense to separate the entries.
   - … Question is what scroll updates would require, and that is harder to
   reason about as a single PerformanceEventTiming entry.
   - … We don’t want to expose all scroll updates, so we want some way to
   aggregate them. It’s not clear how, but they may require different values
   than EventTiming
   - *Noam H*: Updates are more about smoothness and not responsiveness to
   an interaction
   - *Nicolás*: Right. If we needed to have Scroll Update here then we
   should have a new entry.  If we wanted to have part of it in FrameTiming,
   then it could be OK to have the same entry for both. Not sure.
   - *Noam H*: Would frame timing include attribution to events?
   - *Nicolás*: I don’t think so.
   - *Sean*: My understanding is that scrollBegin is the purple area in the
   diagram you are showing
   - *Nicolás*: It should be the same “next frame presented”
   - … For scrollBegin the diagram doesn’t make as much sense: if it
   doesn’t happen in the JS thread, then there shouldn’t be handlers (green
   area).
   - … On mobile, next frame after the touch move that triggered the scroll
   - *Sean*: The two possible scenarios are whether the JS thread/handlers
   would be involved?
   - *Nicolás*: Correct, in some cases the handlers block scrolling and in
   that case you need the green part
   - *Sean*: The diagram is bi-modal in that case
   - *Nicolás*: We could expose processingStart/End if it’s blocking
   scrolling
   - … If it’s not blocked, they’d both be 0
   - … That way developers would know it’s being blocked by handlers
   - *Peter*: Thanks for the presentation, quick question - working with
   real world cases after Web Vitals goes live.
   - … The point where we say work kicked off by event handlers, what I see
   in some websites is people skip Web Vitals from SEO by deferring loading of
   third-party snippets until after the user has just scrolled.
   - *Nicolás*: If it’s on the scroll event, it would probably not even be
   captured by the diagram -- async work out of scope of this
   - *Annie*: If you’re tracking all inputs, and the first input kicks off
   some work, that work should be captured by the later input events.
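
To make the interaction idea above concrete, here is a small sketch that computes the FID-style input delay of one event and groups events that share an interaction ID. The `EventTimingLike` type is a hypothetical subset of an Event Timing entry. The `interactionId` field follows the proposal in the discussion, with 0 assumed to mean "not part of an interaction", and taking the longest duration as an interaction's latency is an assumption made for this example.

```typescript
// Hypothetical subset of an Event Timing entry.
interface EventTimingLike {
  name: string;
  startTime: number;       // event timestamp
  processingStart: number; // when event handlers began running
  duration: number;        // event timestamp to next frame presented
  interactionId: number;   // shared by e.g. keydown/keyup; 0 = none (assumed)
}

// FID-style delay: time between the event timestamp and handler start.
function inputDelay(entry: EventTimingLike): number {
  return entry.processingStart - entry.startTime;
}

// Group entries by interaction and keep the longest duration of each group.
function latencyByInteraction(
  entries: EventTimingLike[]
): Record<number, number> {
  const result: Record<number, number> = {};
  for (const entry of entries) {
    if (entry.interactionId === 0) continue;
    const previous = result[entry.interactionId];
    if (previous === undefined || entry.duration > previous) {
      result[entry.interactionId] = entry.duration;
    }
  }
  return result;
}
```

Grouping by a shared ID is what allows a keydown and its keyup to count as one user interaction instead of two unrelated events.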


On Wed, Mar 3, 2021 at 1:58 PM Yoav Weiss <yoav@yoav.ws> wrote:

> Hey folks!
>
> Let's gather <https://meet.google.com/agz-fbji-spp?authuser=0&hs=122>
> tomorrow to talk about WebPerf!
> On the agenda
> <https://docs.google.com/document/d/10dz_7QM5XCNsGeI63R864lF9gFqlqQD37B4q8Q46LMM/edit?pli=1#heading=h.ouoo9w726e2u>
> we have a short update on Resource Timing Fetch integration, a workerStart
> discussion <https://github.com/w3c/navigation-timing/issues/128>, an
> HR-Time PR <https://github.com/w3c/hr-time/pull/106/> as well as updates
> on Event Timing and Scroll Timing.
>
> As always, the call will be recorded and posted online.
>
> See y'all there!
>
> Cheers :)
> Yoav
>
Received on Tuesday, 9 March 2021 12:02:22 UTC