Re: WebPerfWG call - Jan 19th @ 10am PT / 1pm ET / 7pm CT from Nic Jansma on 2023-01-20 (public-web-perf@w3.org from January 2023)

From: Nic Jansma <nic@nicj.net>
Date: Fri, 20 Jan 2023 12:05:13 -0500
To: public-web-perf <public-web-perf@w3.org>
Message-ID: <0c672e33-fc8c-47b4-cf8a-e966748cf889@nicj.net>
Minutes are now available:

     Linked to from our WebPerf WG Agenda document 
<https://docs.google.com/document/d/10dz_7QM5XCNsGeI63R864lF9gFqlqQD37B4q8Q46LMM/edit#>
     Published to the web-performance Github meetings page 
<https://w3c.github.io/web-performance/meetings/>
     ... and copied below:


  Participants

Michal Mocny, Sia Karamalegos, Pat Meenan, Dan Shappir, Ian Clelland, 
Barry Pollard, Alex N. Jose, Annie Sullivan, Katie Sylor Miller, 
Michelle Vu, Nic Jansma, Yoav Weiss, Lucas Pardue, Steven Bougon, Carine 
Bournaz, Hao Liu, Lan Wei, Abin Paul, Rafael Lebre


  Minutes

  * Next meeting in 2 weeks, Feb 2nd, 8am PST/11am EST
  * Need to work through our charter

  * It expires in mid-February. Discussed at TPAC but didn’t yet
    incorporate feedback
  * Asked for a charter extension
  * Will have a draft ready for the group in the next few weeks
  * Aiming for 2 weeks from now with a proposal

  * https://github.com/w3c/resource-timing/issues/340

  * Changes will be coming to Chrome on how IFRAME data is reported


    LCP and image entropy - Ian Clelland

Recording <https://youtu.be/e4KUnJygqCo>

  * Ian: Update on experiments we’re doing in Chrome to analyze image
    entropy and potentially exclude images from LCP
  * … Long standing issue #86 where LCP is not necessarily contentful
  * … We discount full viewport images, background images that are solid
    color
  * … But there are cases that we’re not catching and we want to do
    better to align with UX
  * … The LCP is completely transparent (in green) but that second point
    should be marked as LCP as that’s the actual content
  * … The idea we have is to look at the information density of the
    image, but we wanted to go with something that’s as simple as
    possible: bpp
  * … Bits - resource transfer size
  * … Pixels - CSS pixels. Taking into account device size but not
    sensitive to DPR
  * … Distribution

  * HTTPArchive
  * Chrome

  * … Looked at a number of sites whose LCP images ended up in the
    bucket at the low … end.
  * … Found a lot of transparent SVGs that are invisible to the user
  * … Beyond that, small placeholders or single color backgrounds
  * … About 2 bits per pixel is the point where images start to look
    like content
  * … Experimenting with 3 cut points in Chrome beta
  * … We’re trying to get what would be the impact of dropping images
    below each cut point on overall LCP
  * Dan: Are you paying special attention to the alpha channel or
    opacity with CSS?
  * Ian: Not at the moment. There could be images that are transparent
    through the alpha channel. Opacity 0 is discarded.
  * Dan: Thinking of images with content but high levels of transparency
    that can imply they are not content but background.
  * Ian: Good point.
  * Dan: I’ve seen that at Wix where people used images with high
    transparency to create a background effect. It wasn’t really content
  * Ian: Something we should address in terms of redefining LCP
  * Michal: Dan, that seems like the opposite problem. The image would
    be slow to load but won’t be meaningful.
  * … If you have really high BPP with low visibility - we can consider
    it LCP, but we should warn developers that they wasted resources
  * Pat: Lots of tooling that does that already. Can already be a result
    of a ton of metadata in the image itself.
  * Ian: We created a permission policy for that purpose - to flag these
    sort of things
  * Dan: The “ignoring the full screen” heuristic is also not perfect,
    e.g. if a few pixels are missing these images are counted as LCP
  * Ian: Hoping that these cases would be caught by that
  * … The idea wasn’t just to penalize sites that do that. There are
    cases where a large background image is loaded later, and it counts
    as the LCP after the actual LCP
  * Katie: Seems like the transparent BG image is someone trying to hack
    a better LCP score. What percentage of sites are doing that? Not
    developer error but deliberate. How rampant is that?
  * Annie: It’s a small percent that does this. Estella wrote a great
    blog post on lazy loading in general. Unclear if it’s a hack but
    something people do to try and improve their user experience. Want
    to clarify this is not a moment to optimize for.
  * Barry: Ran an analysis in WebAlmanac and it was rare enough
  * Ian: We can see this in the distribution. We don’t see a spike at
    the low end, but a small blip
  * Sia: We definitely see it at Shopify. Apps sell hacks like this
    (against our terms), but it could be increasing or preying on people
    that don’t understand what they are doing. Wondering if it’s growing
  * Ian: There’s a chance that if we put a threshold, people will go
    just above it. But it would make it clear they are doing something shady
  * Annie: And a LH audit can warn against it
  * Pat: On the spec side of things, where do we draw the line between
    evolving LCP and playing whack-a-mole? How much of this will make it
    into the spec?
  * Ian: I’d love to get browser buy in around this. Making it simple
    should help. Hoping this is standardizable.
  * Carine: Wondering if the question was around difficulty of
    implementation or runtime performance?
  * Pat: We expect different browsers to implement LCP. Do we include
    all the heuristics in the spec, or should we leave it up to
    individual browsers?
  * Carine: No compat question so we can increase complexity over time
  * Pat: Browser developers may get frustrated with multiple tweaks over
    time. Would it become stable or should it be a living standard? Will
    we find a point where we’re happy? Right now it’s a bit loose and
    we’re iterating over edge cases
  * Carine: at some point we may have to fork the metric definition
  * Yoav: From side conversations I’ve had with other implementers, they
    want to properly define the heuristics
  * … Will we reach a point where we’re happy?  I hope so.  It’s what
    we’re aiming for
  * … Generally I think we’ll have a finite number of heuristics that
    we’ll apply here
  * Michal: On the high end of the filter you applied, there were some
    images that were borderline-contentful (i.e. icons)
  * … When you did that analysis were you looking at the final LCP for
    the page or all resources?
  * Ian: Final LCP candidate
  * Michal: I could see how you’ll begin to exclude icons that are
    questionably LCP, but if it doesn’t end up the overall candidate
    then it doesn’t matter so much
  * … You could be more aggressive with your filtering
  * … But if these are final LCP candidates is it from other scenarios
  * Ian: There were a couple like that, and like the speech bubble
  * Annie: I looked at dozens^3 of these, it was a placeholder.  Very
    non-CSS-aware and low content sites, background:red.png instead of
    background-color:red
  * … Rest of it was text, they ended up with text LCPs and were improved
  * … A lot of those were desktop sites being shown on mobile
  * … 10% were these things that were images, but it’s questionable if
    it’s content
  * … Lowest BPP thing was a QR-code, that should be a LCP element
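
A minimal sketch of the bits-per-pixel idea Ian describes above, assuming bits are the resource transfer size in bytes times 8 and pixels are the displayed area in CSS pixels. The names `ImageCandidate`, `bitsPerPixel` and `isLowEntropy` are hypothetical, and the threshold is left to the caller because the minutes do not list the three cut points Chrome is testing.

```typescript
// Hypothetical shape for an image that is a candidate for LCP.
interface ImageCandidate {
  transferSizeBytes: number; // resource transfer size, in bytes
  cssWidth: number;          // displayed width in CSS pixels (not device pixels)
  cssHeight: number;         // displayed height in CSS pixels
}

// Bits per pixel: transfer size in bits divided by displayed CSS-pixel area.
function bitsPerPixel(c: ImageCandidate): number {
  const pixels = c.cssWidth * c.cssHeight;
  // A zero-area image carries no visible content, so report 0 bpp.
  if (pixels <= 0) return 0;
  return (c.transferSizeBytes * 8) / pixels;
}

// True when the image falls below the caller-supplied cut point (assumption:
// the real cut points are not stated in the minutes).
function isLowEntropy(c: ImageCandidate, thresholdBpp: number): boolean {
  return bitsPerPixel(c) < thresholdBpp;
}
```

For example, a 60,000-byte image shown at 400x300 CSS pixels works out to 480,000 / 120,000 = 4 bpp, while a 1,000-byte placeholder at the same size is roughly 0.07 bpp.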


    Resource Timing and Basic Auth - Nic Jansma

  * Nic: Detected that RUM providers may be inadvertently collecting
    sensitive information as part of RT
  * … Basic auth is in your URL as `user:pwd@`
  * … Demo site that hardcodes an auth URL, RT entry includes that
    information in the URL itself
  * … That’s not just for XHR
  * … Links that have auth hardcoded into them - all of their
    subresources (as a relative URL) also include those auth
    credentials, scripts, images, the lot
  * … If you are not being super careful about that, you may end up
    scooping up this information in your RUM data
  * … Could happen to RUM providers or on personal sites, e.g. for test
    accounts
  * … We talked about this in 2015 and WONTFIXed it
  * … The logic was that this is a known anti-pattern, and it’s your own
    choice if you’re doing that
  * … However, in a world where third parties have access to this data,
    it is easy to unintentionally capture this data
  * … So wanted to ask if we wanted to strip that information from the
    URLs that the API records
  * … Or even more strictly, exclude these resources from RT
  * … In the DOM, location.href omits that data, but document.URL does
    include them
  * … So the URLs are web exposed
  * … The URL in the address bar hides that info as well
  * … I’m sure there’s some history RE the difference
  * … I couldn’t come up with examples on why we’d want to include this
    data in RT
  * Sia: Meta question - how do y’all make decisions?
  * Nic: Not asking for a decision but a discussion. We’d reopen the
    issue, form consensus there, etc
  * … Chrome and Firefox report the credentials, Safari doesn’t
  * Sia: Prefer to split out the credentials from the URL but keep it
  * Nic: Same
  * Ian: Surprised you can see that through the DOM. There’s a bigger
    problem with capability URLs in general
  * … Even if we strip those credentials, the public capability URLs
  * Nic: Some customers include sensitive info in query params and we
    give them tools to strip them
  * … We’re stripping this info regardless of browsers, but personally I
    found it surprising
  * Benjamin: What’s the current status of Fetch with this? Did it go
    away with Fetch revisions since 2015?
  * Nic: Not sure anyone on the call knows
  * … We’ve seen this with navigation and XHRs, not sure about fetch()
  * Ian: The issue says that `fetch()` would solve it, not the Fetch spec
  * Barry: I don’t think we should report it. Should we strip it out and
    pretend it was never there, or should we make it obvious that it was
    there.
  * Ian: If you’re using basic auth on your site, that pwd will be in
    every request
  * Nic: If you type out basic auth in the box, that doesn’t happen.
    Only for links that have basic auth in them
  * Barry: I’d hope it’s so infrequent that we don’t need to do anything
    special for it, e.g. masking it
  * Nic: It’s frequent enough
  * Sia: No answer, if we got rid of it, is there a valid use case where
    someone needs it? Otherwise, we should strip it.
  * Nic: can’t think of one
  * Sia: can use document.URL to get it if they need it
  * Yoav: I similarly think we should just strip it, I don’t see a
    reason why we shouldn’t
  * … I don’t think it would be very hard to convince or get consensus
  * … One point regarding what Barry said, I don’t think it’s the role
    of ResourceTiming to educate about Basic Auth.  Ideally we could
    deprecate this entirely and send deprecation reports, or otherwise
    use some other reporting that would flag this issue.
  * … We have other channels for that reporting
  * Barry: If it’s a little extra effort to mask it, that’s fine.
    Stripping is OK.
  * Yoav: Frankly this sounds like something browser security teams
    should eventually deprecate
  * Michal: Is the plan to strip, is that just for password or username too?
  * … Many sites use username as part of URL path
  * … Some deprecation notices around this, there’s clear advice to
    strip password
  * … But it doesn’t suggest stripping the username
  * … In terms of providing auth, there may be valid use-cases
  * Patrick: Fetch strips username and password for reporting purposes
  * Ian: One legitimate use case when protecting for basic auth, for API
    calls
  * Nic: I haven’t seen examples of that, but I will follow-up to see if
    we see that in our data
  * Yoav: Talking about a legitimate use case for API call, but not for
    reporting, right?
  * Ian: Yes, still strip it out (even tho we’re not exposing user’s
    username, but API name)
  * Yoav: Michal if you have examples of past discussions regarding
    this, once we re-open issue if you could paste it there
  * … Difference would be that the username here can be collected at scale
  * Nic: Issue filed: https://github.com/w3c/resource-timing/issues/368
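
As a rough illustration of the stripping discussed above, here is a sketch of how a RUM script might remove `user:pwd@` from Resource Timing URLs before beaconing them. It relies only on the standard `URL` API; the helper names `stripCredentials` and `sanitizeEntryNames` are hypothetical and this is not how any particular RUM provider or browser does it.

```typescript
// Remove the username and password from an absolute URL.
// Input that cannot be parsed as a URL is returned unchanged (assumption:
// Resource Timing entry names are absolute URLs, so this should be rare).
function stripCredentials(rawUrl: string): string {
  try {
    const url = new URL(rawUrl);
    url.username = "";
    url.password = "";
    return url.toString();
  } catch {
    return rawUrl;
  }
}

// Apply the stripping to anything with a `name` field, such as the
// entries returned by the Resource Timing API.
function sanitizeEntryNames(entries: { name: string }[]): string[] {
  return entries.map((entry) => stripCredentials(entry.name));
}
```

In a page this could be called as `sanitizeEntryNames(performance.getEntriesByType("resource"))`. It strips both username and password, which matches the stricter option raised in the discussion; whether the username should be kept was left open on the call.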


- Nic
https://nicj.net/
@NicJ

On 1/18/2023 10:22 AM, Nic Jansma wrote:
> Hi everyone!
>
> On the agenda 
> <https://docs.google.com/document/d/10dz_7QM5XCNsGeI63R864lF9gFqlqQD37B4q8Q46LMM/edit?pli=1#heading=h.osvewfb7hvdz> 
> for our next call (Jan 19th @ 10am PT / 1pm ET / 7pm CT) we will discuss:
>
>   * Charter extension
>   * LCP and image entropy
>   * RT issues (340
>     <https://github.com/w3c/resource-timing/issues/340>, 7
>     <https://github.com/w3c/resource-timing/issues/7>)
>
> *<https://github.com/w3c/resource-timing/issues/304>* If you have 
> additional items, please add them to the agenda 
> <https://docs.google.com/document/d/10dz_7QM5XCNsGeI63R864lF9gFqlqQD37B4q8Q46LMM/edit?pli=1#heading=h.osvewfb7hvdz>.
>
> Join us <https://meet.google.com/agz-fbji-spp>!
> - Nic https://nicj.net/ @NicJ
Received on Friday, 20 January 2023 17:05:30 UTC