Re: Specifying the calculation of INP from Event Timing entries from Barry Pollard on 2025-02-06 (public-rumcg@w3.org from February 2025)

From: Barry Pollard <barrypollard@google.com>
Date: Thu, 6 Feb 2025 14:12:15 +0000
To: Andy Davies <andy.davies@speedcurve.com>
Cc: public-rumcg@w3.org
Message-ID: <CAH6JyLQJ8BiVRKLs66+7S4ZCgw3U=98=5J+jFD7PPEZm6s=kfA@mail.gmail.com>
>    - *Given a stream of Event Timing entries how is INP calculated?*

This is conceptually quite simple:

   - Look at every Event Timing where interactionId is non-zero.
   - The longest duration across all those events for your page is your INP
   for that page.
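
As a minimal sketch of those two steps (an illustration only, not the
web-vitals.js implementation): `EventTimingLike` and `naiveInp` are
hypothetical names, and the entries are plain objects standing in for
PerformanceEventTiming entries.

```typescript
// Hypothetical minimal shape: only the Event Timing fields used in this sketch.
interface EventTimingLike {
  interactionId: number; // 0 means the event is not part of an interaction
  duration: number; // milliseconds, as reported by Event Timing
}

// Naive INP: the longest duration among entries with a non-zero interactionId.
// Returns undefined when the page had no eligible entries.
function naiveInp(entries: EventTimingLike[]): number | undefined {
  let worst: number | undefined;
  for (const entry of entries) {
    if (entry.interactionId === 0) continue; // not tied to an interaction
    if (worst === undefined || entry.duration > worst) {
      worst = entry.duration;
    }
  }
  return worst;
}
```

Entries with an interactionId of 0 are ignored however long they took, so a
500ms entry with interactionId 0 never becomes the INP.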

But of course there are caveats:

   - In reality you'll have to limit this to durations >= 16ms (minimum
   duration for event timing) and probably want to go higher than that
   (web-vitals.js uses 40ms by default).
   - You may want to limit this to p98 of the events on the page to avoid
   noise from long-lived pages with very occasional glitches. Calculating p98
   depends on looking at the number of eligible interactions (NOT the
   number of eligible event handlers). There is an interactionCount property
   in Chrome but this is still behind a flag. You can look at the web-vitals
   polyfill
   <https://github.com/GoogleChrome/web-vitals/blob/main/src/lib/polyfills/interactionCountPolyfill.ts>.
   Or you can choose to ignore this and report everything in RUM, safe in the
   knowledge that the value Google measures will be at least as good as yours.
   - You'll likely want to report a percentile/percentiles across pages (we
   recommend p75 as the measure, but other percentiles can be helpful for
   investigation or warning signs).
   - You may want to include first-input events to allow you to still
   report an INP for pages where all events are < 16 ms (or whatever
   threshold you set). Many low INPs can dramatically affect your overall site
   numbers, so excluding those will make your INP look worse than it actually
   is.
   - You may want to reset on bfcache restores (CrUX considers this a new
   "page view").
   - Usual caveats about WHEN you report this (and how reliable that can be
   depending on when in the page lifecycle you choose to report).
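
A rough sketch of how some of these caveats might fit together, under stated
assumptions: every name below (`EventTimingLike`, `interactionLatencies`,
`pageInp`, `percentile`) is hypothetical, the 40ms default simply mirrors the
threshold mentioned above, p98 is approximated by skipping one worst
interaction per 50 interactions, and the cross-page percentile uses the
nearest-rank method (other percentile definitions exist). It deliberately
leaves out first-input handling, bfcache resets and reporting time.

```typescript
// Hypothetical minimal shape: only the Event Timing fields used in this sketch.
interface EventTimingLike {
  interactionId: number;
  duration: number;
}

// One latency per interaction: the longest duration seen for each non-zero
// interactionId, ignoring entries below the chosen threshold.
function interactionLatencies(
  entries: EventTimingLike[],
  minDurationMs = 40,
): number[] {
  const byId = new Map<number, number>();
  for (const e of entries) {
    if (e.interactionId === 0 || e.duration < minDurationMs) continue;
    byId.set(e.interactionId, Math.max(byId.get(e.interactionId) ?? 0, e.duration));
  }
  return Array.from(byId.values());
}

// Per-page INP: skip one worst interaction for every 50 interactions.
// `interactionCount` is the total number of interactions on the page, which
// can exceed latencies.length when fast interactions fell below the threshold.
function pageInp(latencies: number[], interactionCount: number): number | undefined {
  if (latencies.length === 0) return undefined;
  const worstFirst = latencies.slice().sort((a, b) => b - a);
  const skip = Math.min(worstFirst.length - 1, Math.floor(interactionCount / 50));
  return worstFirst[skip];
}

// Nearest-rank percentile across page views, e.g. p = 75.
function percentile(values: number[], p: number): number | undefined {
  if (values.length === 0) return undefined;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}
```

With fewer than 50 interactions nothing is skipped, so `pageInp` returns the
single worst interaction, which matches the simple definition above.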


>    - *Given a stream of Event Timing entries how are the phases of INP
>    calculated?*

This annoyingly is much more complicated as events can come out of order.
Especially so if you want to report intersecting LoAFs (Long Animation
Frames) as well! We're doing some cleanup work to make this a little easier
in future, but the code in web-vitals.js is the best way of doing this at
present as far as we can tell.

At a high level, you'll need to gather all events and group them by the
same (or similar) render time. Discard any groups without any interaction
IDs, then calculate the 3 subparts for each group as you described
previously:

> - Input Delay is the earliest startTime to the earliest processingStart
> across all the Event Timing entries in the frame
> - Processing Time is from the earliest processingStart to the latest
> processingEnd across all the Event Timing entries in the frame
> - Presentation Delay is from the latest processingEnd to the latest
> endTime across all the Event Timing entries in the frame


Where endTime = startTime + duration.
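
The grouping and the three subparts could be sketched roughly as follows.
This is a batch simplification under stated assumptions, not how
web-vitals.js is written: it sorts all entries up front (which sidesteps the
out-of-order problem but would not work on a live stream), treats entries
whose endTime is within 8ms of the first entry in a group as the same frame,
and uses hypothetical names (`EventTimingLike`, `groupByRenderTime`,
`subparts`).

```typescript
// Hypothetical minimal shape: only the Event Timing fields used in this sketch.
interface EventTimingLike {
  interactionId: number;
  startTime: number;
  processingStart: number;
  processingEnd: number;
  duration: number;
}

interface Subparts {
  inputDelay: number;
  processingDuration: number;
  presentationDelay: number;
}

// endTime = startTime + duration, used as the (approximate) render time.
const endTime = (e: EventTimingLike): number => e.startTime + e.duration;

// Group entries whose render times are within `toleranceMs` of the first
// entry in the group, then discard groups without any interaction IDs.
function groupByRenderTime(
  entries: EventTimingLike[],
  toleranceMs = 8,
): EventTimingLike[][] {
  const sorted = entries.slice().sort((a, b) => endTime(a) - endTime(b));
  const groups: EventTimingLike[][] = [];
  for (const e of sorted) {
    const last = groups[groups.length - 1];
    if (last && endTime(e) - endTime(last[0]) <= toleranceMs) {
      last.push(e);
    } else {
      groups.push([e]);
    }
  }
  return groups.filter((g) => g.some((e) => e.interactionId !== 0));
}

// Subparts for one group, following the three definitions quoted above.
function subparts(group: EventTimingLike[]): Subparts {
  const firstStart = Math.min(...group.map((e) => e.startTime));
  const firstProcessingStart = Math.min(...group.map((e) => e.processingStart));
  const lastProcessingEnd = Math.max(...group.map((e) => e.processingEnd));
  const lastEnd = Math.max(...group.map(endTime));
  return {
    inputDelay: firstProcessingStart - firstStart,
    processingDuration: lastProcessingEnd - firstProcessingStart,
    presentationDelay: lastEnd - lastProcessingEnd,
  };
}
```

By construction the three subparts of a group add up to the time from its
earliest startTime to its latest endTime.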

> We should be in the position of being able to point someone, who has no
> idea what an interaction or the main thread is, at a series of steps and
> have them successfully write code to calculate INP etc and that's not
> possible as things stand


That seems an unrealistic goal, especially since the concept of an
interaction (and its link to potentially many event handlers) is so
integral to the concept of *Interaction* to Next Paint.

For those who don't want to think about this level of detail, we offer the
web-vitals.js library to measure it in the most accurate way we think is
possible. And we open source that code so implementers can also learn from it.


On Thu, 6 Feb 2025 at 13:29, Andy Davies <andy.davies@speedcurve.com> wrote:

> Let me rephrase the questions:
>
>    - Given a stream of Event Timing entries how is INP calculated?
>
>
>    - Given a stream of Event Timing entries how are the phases of INP
>    calculated?
>
>
> These are the questions that I think need clearly and concisely documented
> answers.
>
> To me the web.dev / MDN articles are focused on the concept, not the
> mechanics of the calculation and they only mention EventTiming in passing.
>
> And that's OK for those articles
>
> But from an implementer perspective I want something that's concise,
> explicitly references Event Timing (and its properties) and includes the
> steps needed to calculate the output without the cognitive load of
> translating the conceptual to the concrete
>
> We should be in the position of being able to point someone, who has no
> idea what an interaction or the main thread is, at a series of steps and
> have them successfully write code to calculate INP etc and that's not
> possible as things stand
>
>
>
>
> On Wed, Feb 5, 2025 at 6:44 PM Barry Pollard <barrypollard@google.com>
> wrote:
>
>>> Choosing the longest duration Event Timing entry that has an interaction
>>> id, *and where there is more than one the same choosing the one with
>>> the longest duration*, and where there are many events choose the entry
>>> with the 98th percentile longest duration
>>
>>
>> That INP definition is correct, though I'm not sure you need the middle
>> clause (which I've highlighted in italics), as that's covered by the
>> previous clause?
>>
>> It is covered in web.dev/inp <https://web.dev/articles/inp#what-is-inp>
>> in these statements:
>>
>>> An interaction's latency consists of the single longest duration
>>> <https://w3c.github.io/event-timing/#ref-for-dom-performanceentry-duration%E2%91%A1:%7E:text=The%20Event%20Timing%20API%20exposes%20a%20duration%20value%2C%20which%20is%20meant%20to%20be%20the%20time%20from%20when%20user%20interaction%20occurs%20(estimated%20via%20the%20Event%27s%20timeStamp)%20to%20the%20next%20time%20the%20rendering%20of%20the%20Event%27s%20relevant%20global%20object%27s%20associated%20Document%E2%80%99s%20is%20updated> of
>>> a group of event handlers that drive the interaction, from the time the
>>> user begins the interaction to the moment the browser is next able to paint
>>> a frame.
>>
>>
>> and this piece (hidden behind a "Details of how INP is calculated"
>> summary heading):
>>
>>> To give a better measure of the actual responsiveness for pages with a
>>> high number of interactions, we ignore one highest interaction for every 50
>>> interactions. The vast majority of page experiences don't have over 50
>>> interactions, so the worst interaction is most often reported. The 75th
>>> percentile of all page views is then reported as usual, which further
>>> removes outliers to give a value that the vast majority of users experience
>>> or better.
>>
>>
>> (1/50 is the same as 98th percentile).
>>
>> and finally we advise measuring the 75th percentile across all page views:
>>
>>> To ensure you're delivering user experiences with good responsiveness, a
>>> good threshold to measure is the 75th percentile of page loads recorded
>>> in the field, segmented across mobile and desktop devices
>>
>>
>> This is also covered in MDN
>> <https://developer.mozilla.org/en-US/docs/Glossary/Interaction_to_next_paint>
>> :
>>
>>> INP measures the worst length of time (minus some outliers), in
>>> milliseconds, between the user interaction on a web page and the next frame
>>> presentation after that interaction is processed. Scrolling and zooming are
>>> not included in this metric. INP is calculated using the Event Timing
>>> API
>>> <https://developer.mozilla.org/en-US/docs/Web/API/PerformanceEventTiming>.
>>> Asynchronous operations such as network fetches or file reads usually do
>>> not delay INP as painting can occur while such operations are handled.
>>
>>
>>> All eligible interactions throughout the page lifetime are considered.
>>> For highly interactive pages of 50 or more interactions, the 98th
>>> percentile is used to exclude some extreme outliers that are not reflective
>>> of overall page responsiveness.
>>
>>
>> And for the more technical people that wanna see how it's actually
>> calculated in code, rather than prose, we have the web-vitals.js reference
>> implementation as you note.
>>
>> So I'm not sure what more documentation we need to do here on INP?
>>
>>> Where it gets more complicated is that Chrome and web-vitals.js are
>>> coalescing Event Timing entries which presented in the same frame and so
>>> producing different definitions of the INP phases
>>
>>
>> Subparts (FYI, we in Google are trying to unify on "subparts" over
>> "phases") are a separate thing meant to help point developers in the right
>> direction of where to concentrate to resolve INP issues.
>>
>>> Which seems to use the following approach:
>>
>>
>>> - Input Delay is the earliest startTime to the earliest processingStart
>>> across all the Event Timing entries in the frame
>>> - Processing Time is from the earliest processingStart to the latest
>>> processingEnd across all the Event Timing entries in the frame
>>> - Presentation Delay is from the latest processingEnd to the latest
>>> endTime across all the Event Timing entries in the frame
>>
>>
>> Again this is correct (where the same frame is defined as ending at the
>> same time, taking into account coarsening and rounding issues;
>> web-vitals.js uses "within 8ms"). This matches the definition that has been
>> on web.dev/inp diagrams since the beginning:
>>
>> [image: image.png]
>>
>> and, for interactions that span multiple frames this diagram further down:
>>
>> [image: image.png]
>>
>>> This approach leads to some event handlers being classified in Input
>>> Delay for some interactions, and at other times exactly the same event
>>> handler being classified in Processing Time depending on whether the event
>>> handlers all run in the same presentation frame
>>
>>
>> As discussed offline yesterday, the only place this should happen is in
>> DevTools which has a somewhat naive interpretation of subparts (that we are
>> in the process of updating). In effect that's a bug/limitation of the
>> current DevTools implementation. If you have examples where web-vitals.js
>> is doing this incorrectly then please let me know.
>>
>>> I'm clear why I don't want to use the Chrome / web-vitals.js definition
>>> but as it's not documented I suspect others might not even know what the
>>> approach is or how it may affect them
>>
>>
>> That is entirely your prerogative, however in the interests of
>> "standardisation" as per this e-mail thread it would be good to understand
>> your concerns and see if we can agree on something. Or, if not, then
>> perhaps avoid the use of the same "subpart names" (input delay, processing
>> duration, and presentation delay) with different meanings.
>>
>> An "interaction" may be made of multiple event handlers (some of which
>> may be added by your code, some by third-party code) but they are all fired
>> based off of the same "user interaction". Treating event handlers
>> separately is one technical way of looking at it, and does allow you to
>> concentrate on that code if it's the one at issue, but risks losing the
>> total duration visibility of code directly linked to that interaction.
>>
>> For example, at a high level, I advise:
>>
>>    - High Input Delay: Nothing to do with *this* interaction's code.
>>    Look at other code or general busyness of the page.
>>    - High Processing Duration: Look at *this* interaction. Try to repeat
>>    in the lab and fix.
>>    - High Presentation Delay: Likely look at *this* interaction. Try to
>>    repeat in the lab and fix.
>>
>> Looking at event handlers individually narrows that Processing Duration
>> phase, meaning you may assume it's not this interaction more often (when it
>> actually is).
>>
>> But open to hearing your concerns and alternative approach!
>>
>> On Wed, 5 Feb 2025 at 17:59, Andy Davies <andy.davies@speedcurve.com>
>> wrote:
>>
>>> I don't think it needs to be a spec in the same sense of the W3C terms
>>> but I do think it needs to be documented.
>>>
>>> As far as I'm aware INP is calculated by:
>>>
>>> Choosing the longest duration Event Timing entry that has an interaction
>>> id, and where there is more than one the same choosing the one with the
>>> longest duration, and where there are many events choose the entry with the
>>> 98th percentile longest duration
>>>
>>> And the Input Delay, Processing Time & Presentation Delay phases are
>>> defined from the Event Timing entry that was used for INP
>>>
>>> But when I went searching for definitions I couldn't find one anywhere
>>> (other than reading web-vitals.js code and that's open to error)
>>>
>>>
>>> Where it gets more complicated is that Chrome and web-vitals.js are
>>> coalescing Event Timing entries which presented in the same frame and so
>>> producing different definitions of the INP phases
>>>
>>> Which seems to use the following approach:
>>>
>>> - Input Delay is the earliest startTime to the earliest processingStart
>>> across all the Event Timing entries in the frame
>>> - Processing Time is from the earliest processingStart to the latest
>>> processingEnd across all the Event Timing entries in the frame
>>> - Presentation Delay is from the latest processingEnd to the latest
>>> endTime across all the Event Timing entries in the frame
>>>
>>> This approach leads to some event handlers being classified in Input
>>> Delay for some interactions, and at other times exactly the same event
>>> handler being classified in Processing Time depending on whether the event
>>> handlers all run in the same presentation frame
>>>
>>> I'm clear why I don't want to use the Chrome / web-vitals.js definition
>>> but as it's not documented I suspect others might not even know what the
>>> approach is or how it may affect them
>>>
>>> And documenting it would save others going through the process I went
>>> through and save Barry from all the questions I asked him
>>>
>>>
>>>
>>> On Tue, Feb 4, 2025 at 6:09 PM Michal Mocny <mmocny@google.com> wrote:
>>>
>>>> It's an interesting question.  The web platform specifications define
>>>> the public web platform features: Event Timings (for INP), Layout
>>>> Instability (for CLS).  Largest Contentful Paint the web feature is more
>>>> closely matching LCP the metric (though even there, there are gaps, like
>>>> merging iframes).
>>>>
>>>> I think there is not typically a "spec" for INP/CLS other than the
>>>> specific conventions that the Core Web Vitals program uses, and those are
>>>> defined on sites like web.dev (as you linked) and reference
>>>> implemented on web-vitals.js.  Other RUM providers I think have always
>>>> deviated in small ways (i.e. loading-only vs post-load, etc) and might
>>>> always need to deviate.
>>>>
>>>> That said: we do have a note in the non-normative section
>>>> <https://wicg.github.io/layout-instability/#cumulative-layout-shift>
>>>> of layout instability spec for DCLS and CLS, with a usage example -- though
>>>> that description isn't up to date with the latest CLS CWV metric...
>>>>
>>>> I would be happy if someone wanted to add documentation to all the
>>>> non-normative sections of all the specs to define these, but I'm also
>>>> not sure about it.
>>>>
>>>> On Tue, Feb 4, 2025 at 12:13 PM Andy Davies <andy.davies@speedcurve.com>
>>>> wrote:
>>>>
>>>>> https://web.dev/articles/inp contains an abstract overview of what
>>>>> INP represents but…
>>>>>
>>>>> There's no actual specification for how INP should be calculated
>>>>> from Event Timing entries.
>>>>>
>>>>> Don't know whether this should exist within this CG or be raised in
>>>>> the WebPerf WG but as it's a 'standard' metric then I think it should be
>>>>> documented
>>>>>
>>>>> Thanks
>>>>>
>>>>> Andy
>>>>>
>>>>> --
>>>>>
>>>>> Andy Davies
>>>>> Web Performance Consultant, SpeedCurve
>>>>>
>>>>>
>>>>>
Attachments

image/png attachment: image.png
image/png attachment: 02-image.png
Received on Thursday, 6 February 2025 14:12:35 UTC