Re: WebPerfWG call - June 8th @ 10am PT

Minutes are now available:
    Linked to from our WebPerf WG Agenda document
<https://docs.google.com/document/d/10dz_7QM5XCNsGeI63R864lF9gFqlqQD37B4q8Q46LMM/edit#>
    Published to the web-performance Github meetings page
<https://w3c.github.io/web-performance/meetings/>
    ... and copied below:

Participants

Mike Henniger, Lucas Kostka, Yoav Weiss, Nic Jansma, Omar Mohammad, Noam
Helfman, Yoav Moshe, Timo Tijhof, Michelle Vu, Yair Dovrat, Giacomo
Zecchini, Ian Clelland, Sean Feng, Alex Christensen, Sergey Moroz, Neil
Craig, Michal Mocny, Todd Gardner, Carine Bournez, Abhishek Ghosh, Philip
Tellis,
Admin

   - TPAC schedule


   - 11:30-13:00 and 2:30-4:30 on Monday and Friday
   - 2:30-6:30 on Tuesday and Thursday
   - Yoav: We've asked for 4 days, similar to last year
   - ... We're asking for 4 days of 3 hours each, and this time we got 4x4
   so we may have some time to spare on Friday, and we could make that time
   optional or have a webperf-related activity like last year


   - Next meeting: June 22 at 11am EST / 8am PST / 5pm CET

Minutes

Managed Components

Recording
<https://youtu.be/kNIQKiHvBZw>


   - Yo'av Moshe: Engineer at Cloudflare, working on Zaraz
   - ... Working on Managed Components
   - ... Aim of making 3P tools fast and private
   - ... Want to convince you that the whole internet would be better if we
   started doing this
   - ... Today's issues with 3Ps
   - ... Bad for performance, adding code
   - ... Bad for security, adding code from another server, has unlimited
   access, can hijack the page
   - ... Usually minified, website owner doesn't have an easy task
   understanding what's going on in it
   - ... Bad for privacy, 3P has IP address of your visitors, even if not
   intending to share
   - ... Zaraz started by reverse-engineering tools
   - ... Created server-side implementation of this code, e.g. Google
   Analytics.  Take what it was getting from the browser and would create a
   server-side implementation
   - ... Code was faster, secure, private, audited
   - ... On average it improved website performance (Lighthouse scores) by
   40%
   - ... System wasn't just better by design, it's just that we wrote the
   code and our customers trusted it
   - ... Couldn't scale to all of the tools, vendors, etc
   - ... Wanted to have vendors write their own integrations.  Needed a way
   to make that secure
   - ... Why not just let everyone use it?
   - ... Want them to be empowered
   - ... Went to vendors asking if they could write integration, realized
   it would be much easier if they could write things beyond Zaraz
   - ... So we designed this as Managed Components
   - ... Vendor is getting a lot of things for free
   - ... Unified events system, not FB + GA + TT events you have to listen
   to
   - ... Everything is sent on the same domain, so browser doesn't have to
   do all of those connections
   - ... Server logic can emulate things
   - ... Server-Side Render embeds, doesn't need to do Twitter Embed, would
   appear as 1P content within the page
   - ... Client-side events (mouse move) are provided by one listener
   working across different devices
   - ... Pre-page rendering actions: typically a 3P tool could only start
   acting once it's on the page.  With pre-page rendering actions, the tool
   can learn about the request once it starts, and can manipulate the response
   - ... For analytics tools, this can be more accurate
   - ... This component listens to a page view, and sends a fetch to the
   endpoint (analytics tool)
   - ... No client-side JavaScript code downloaded/evaluated/executed in
   the browser
   - ... Runtime environment (component manager) needs to dispatch events,
   e.g. mouse move
   - ... Provides server-side capabilities like proxying, routing, etc
   - ... Manages cookies and storage
   - ... Executes server-side wide code
   - ... Enforces the configuration the website desired
   - ... Cloudflare Zaraz runs as a Worker
   - ... Could run as a proxy, middleware, client-side ServiceWorker, etc
   - ... Slides up to now were from a previous call, we now have updates
   - ... Open-sourced WebCM
   - ... Can run all of the components
   - ... Took everything in Zaraz library, converted to Managed Component
   and created a repo for it
   - ... We have vendor-created components (e.g. session recording tools)
   - ... Community-created components: people wrote components for tools
   they needed to use
   - ... For Cloudflare Zaraz, using on 20k+ websites, 10k+ GA4, 5k+ GA3,
   2k+ Facebook Pixel
   - ... Coming soon: loading your own custom Managed Components, and embeds
   - ... Much faster because it comes with the HTML of the page, doesn't
   expose IP address, etc
   - ... Lastly, about where we want to go, start a community group
   - ... Goal to have a standard way to deploy fast and safe 3P tools
   - Yoav: We had invited him over to talk about all of this, aligned with
   things the WG is working on
   - ... They're starting a Community Group, seeing what connections we can
   make.  How this WG can help make this CG successful.
   - ... Seems like a worthwhile effort
   - Michal: Partially I say this in jest, but Chrome has extension
   functionalities. It seems adding scripts to pages was always a wild west
   that formed out of the way the web evolved.
   - ... I haven't looked at the specifics here, I think this is a
   worthwhile effort.  Seems like it's an extensions format that the developer
   of the webpage is choosing to enforce on the 3Ps it embeds, unrelated to
   the browser.  Do you think about it that way?
   - Yo'av: Not enforced by the browser, up to the website owner to have a
   safer way to embed 3P tech
   - ... When we designed it, we thought about how we could go about this
   right now without changing things
   - ... People who wanted to see changes were website owners
   - ... Demanded when developer security teams required no new JS files on
   the page, etc
   - ... What I like about this, the browser doesn't have to know about
   these things
   - ... Part of what makes it faster
   - ... Can think of this as a webpage extension, I wouldn't want this
   executed by the browser by default.  Some APIs to run code in browser, but
   not first place it runs in.
   - Michal: Follow-up for server-vs-client
   - Alex: Mentioned that this system takes what used to be a script tag
   and makes it so that the client's IP address and user-agent are not exposed
   to that 3P.
   - ... Is the script taken and inlined?
   - Yo'av: Not exactly. We reverse-engineered APIs and tools, and are
   executing code on the server.
   - Alex: At the end of the day, the client gets something that's like
   analytics and would contact analytics provider anyway?
   - Yo'av: For Google Analytics, the website loads and the Components
   Manager (Zaraz, WebCM) collects info on page (e.g. title), that it can't
   get from HTTP request itself.  Server-side code representing GA runs
   server-side, gets page title, resolution, etc, and sends the request to
   GA.  Browser never talked to google-analytics.com, nor ran the script,
   etc.
   - ... If you have 5 tools looking at page title, WebCM gets that once
   and spreads information to the specific tools
   - Michal: In one of the screenshots with statistics, are you saying it's
   a GA4-compatible script (and not GA4 itself)?  Different way to write
   scripts, a constrained API that will evolve over time.
   - Yo'av: Using GA through a managed component, no lines of code shared
   with the original script
   - ... Sometimes this is a source of incompatibility.  It is hard to always
   know what the tool expects, so things may not work as they should
   - Nic: Question about the Twitter embed. Loading it client-side allows
   loading it async, lazily, etc.
   - … Is there a challenge in prerendering that embed for the client? Can
   it hurt FCP, etc?
   - … Multiple API calls to Twitter can slow down the initial HTML
   - Yo’av: Can the 1P embed slow down the page more? I don’t have numbers
   for it, but it’s up to how the embed was created
   - … For Twitter there’s a caching layer that would avoid recreating the
   embed HTML over and over. I wouldn’t think there’d be a problem. But bad
   components can slow things down
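
A minimal sketch of the kind of server-side caching layer mentioned above, assuming a simple in-memory cache with a time-to-live. Every name here (`TtlCache`, `renderEmbed`, the `post:123` key, the 60-second TTL) is hypothetical and is not taken from Zaraz, WebCM, or any vendor API; the clock is injectable only to make the behavior easy to check.

```typescript
// Hypothetical sketch: none of these names come from Zaraz, WebCM, or a vendor API.
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value for `key` if it is still fresh; otherwise build it
  // with `create`, store it with a new expiry, and return it.
  getOrCreate(key: string, create: () => V): V {
    const current = this.now();
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > current) {
      return hit.value;
    }
    const value = create();
    this.entries.set(key, { value, expiresAt: current + this.ttlMs });
    return value;
  }
}

// Stand-in for the expensive step (calling the third party and building HTML).
let renderCount = 0;
function renderEmbed(postId: string): string {
  renderCount += 1;
  return `<blockquote data-post-id="${postId}">...</blockquote>`;
}

// Assumed TTL of 60 seconds; `fakeNow` replaces the real clock for illustration.
let fakeNow = 0;
const embedCache = new TtlCache<string>(60000, () => fakeNow);
const firstHtml = embedCache.getOrCreate("post:123", () => renderEmbed("123"));
const secondHtml = embedCache.getOrCreate("post:123", () => renderEmbed("123"));
```

Within the TTL the second lookup reuses the stored HTML, so the expensive render runs once per embed rather than once per page view. A real deployment would more likely cache at the edge and respect the upstream caching headers.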


   - Nic: Caching the results from 3P calls makes sense. Moving caching to
   the server.
   - Timo: Similar question - the tracking options would happen through
   separate beacon requests, collected once on the client and distributed on
   the server. How transparent would that be to the user and the browser?
   Proxying through the server makes it less obvious
   - Yo’av: It is less obvious. Zaraz has its own consent manager.
   - … The tech doesn’t require Managed Components. But it can happen
   - Timo: Does it only add endpoints? TODO
   - Yo’av: Can do both. Not everybody using Cloudflare is proxying their
   entire domain.
   - … Our implementation of the embeds can replace the embeds, or load
   them in place
   - Sergey: Using Zaraz to integrate GA. Created a managed component,
   reading cookie info and passing it to GA
   - … Was a bit hard to debug. But it generally works very well
   - Yo’av: Long way to go in documentation and dev experience. Working on
   it
   - Neil: Would this work for analytics services on a website like ours
   (BBC) which is made up of multiple origins/products which are written in
   different server-side languages on a single domain? I am thinking in terms
   of session tracking etc. Would they be tied together somehow? Does CM allow
   for non-Node server-side code?
   - Yo’av: Doesn’t depend on what your backend looks like; it sits between
   your application and the browser. So no backend requirements.
   - Michal: Follow-up to Sergey. We see examples where a 3P script hooks an
   event handler, delaying interactions. Partytown removes that delay, but
   it’s not running the same script at the same moment, so the data about the
   page may be stale.
   - … So it works, but may deliver different data. So asking folks that
   adopt these technologies if they compared both options. Sergey - did you
   notice any changes? Or was it good enough?
   - Sergey: Not that much traffic, so didn’t do comparisons. Depends on
   what you do. Some sites are more timing sensitive than others.
   - Michal: Yo’av - Have y’all looked at correctness, e.g. of page state
   snapshots?
   - Yo’av: we haven’t. Optimizely and session reporting are the harder
   tools to report. We’d have answers soon.
   - … Works well for analytics, conversion pixels, etc
   - … yet to be seen about more complex
   - Michal: Seen 3P providers interested in 100% reliability, you want
   perf and security, and the developers are stuck in the middle
   - Yo’av: Want to figure all that out together
   - Sergey: CF server tries to run some stuff out of the browser and run
   it on the server side. Up to developer and biz owner if this is good
   enough, or if information correctness is critical and worth the performance
   cost
   - Jake Casto: Zaraz user, written managed components. Seen incredibly
   consistent data. With Partytown we had severe issues - inconsistent data,
   people refreshed pages losing data, etc
   - … Felt much better about the data, seeing exactly what’s collected
   - … Consistency issues were between the component and the component
   manager
   - … No issues with data delays. Same delays we’ve seen with client code
   - … Seeing the data being collected and sent gave us a lot of peace of
   mind
   - Timo: What would a browser-side approach look like?
   - ... A way to declare on the page what information should be extracted
   and sent where. Akin to WebKit's PCM experiment. If the bulk of these can
   be standardized in a generalized way, that would allow them to perform much
   better through native means (off the main thread perhaps even, like most
   CSS work, and indeed PCM).
   - ... Browsers could then carry the responsibility to reliably deliver
   this to specified destinations, including the choice to use a
   vendor-provided proxy for privacy. Examples: Most browsers already provide
   a vendor-provided TURN server for WebRTC connections, there is DoH
   (DNS-over-HTTPS) with Mozilla, Apple proxying tracking requests through
   their edge servers, etc. Win for perf, win for ease of data collection, and
   win for privacy/transparency.
   - … proxying dedicated beacons
   - https://webkit.org/blog/11529/introducing-private-click-measurement-pcm/
   - Yo’av: Supporting client-side behaviors is the more challenging part.
   How much should we be tied to web APIs and how much to reinvent.
   - Michal: A few alternatives to the server-side part, amongst them
   Service Workers. But SW are not always interoperable. Could a managed
   component be implemented in web workers or would proxying be prohibitive?
   - … Clients that want to embed 3Ps, could they use a subset of the API?
   - Yo’av: most components are not using the server API and would be easy
   to create a client-side version of them
   - Michal: So maybe we could have 2 versions per tool - client and server
   - Yo’av: We have a permission system. Can enable client-side key/value.
   Network requests client and server side. Can tie these permissions to the
   capabilities
   - Can say that server side code can’t run without its permissions
   - Nic: How would you like to continue gathering feedback?
   - Yo’av: Would love to continue this conversation. I can follow up with
   an email for a meeting dedicated to that.
   - Yoav: Is the Community Group already set up?
   - Yo'av: Not yet, we wanted to see what the interest was
   - Michal: TPAC collaboration?
   - Yoav: Folks interested in joining that CG, we can connect and check
   for interest
   - Nic: Can broadcast
   - Philip (in chat): curious how that would work with cache-control:
   s-maxage vs maxage, proxy-revalidate, private, etc.
   - Yo’av: Cache control headers, trying to run it through proxying.
   Always leaning towards reuse of existing tech and respecting caching headers
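
As a rough illustration of the flow described in the minutes above (the component manager collects page information once and hands it to each server-side tool, which then builds its own request), here is a small sketch. All types and names (`ComponentManager`, `PageviewEvent`, the `analytics-a` and `analytics-b` components, and the example.com-style endpoints) are assumptions made for illustration; this is not the WebCM or Zaraz API. Requests are returned rather than sent so the sketch has no network dependency.

```typescript
// Hypothetical sketch: these types are illustrative, not the WebCM or Zaraz API.
interface PageviewEvent {
  url: string;
  title: string;
}

interface OutboundRequest {
  component: string;
  endpoint: string;
  body: string;
}

interface Component {
  name: string;
  // Translate a pageview into the payload this vendor's endpoint expects.
  onPageview(page: PageviewEvent): { endpoint: string; body: string };
}

class ComponentManager {
  private components: Component[] = [];

  register(component: Component): void {
    this.components.push(component);
  }

  // `collect` stands in for the single exchange with the browser that gathers
  // page data; every registered component receives the same result.
  handlePageview(collect: () => PageviewEvent): OutboundRequest[] {
    const page = collect(); // gathered once
    return this.components.map((component) => {
      const { endpoint, body } = component.onPageview(page);
      return { component: component.name, endpoint, body };
    });
  }
}

const manager = new ComponentManager();
manager.register({
  name: "analytics-a",
  onPageview: (page) => ({
    endpoint: "https://analytics-a.example/collect",
    body: JSON.stringify({ t: "pageview", title: page.title, url: page.url }),
  }),
});
manager.register({
  name: "analytics-b",
  onPageview: (page) => ({
    endpoint: "https://analytics-b.example/hit",
    body: `dt=${encodeURIComponent(page.title)}`,
  }),
});

let collectCalls = 0;
const requests = manager.handlePageview(() => {
  collectCalls += 1;
  return { url: "https://site.example/pricing", title: "Pricing" };
});
```

In this sketch the browser is asked for the page title once, and both tools receive it on the server. A real manager would then send each request from the server, which is why the browser never contacts the vendors directly.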

- Nic
http://nicj.net/
@NicJ



On Wed, Jun 7, 2023 at 4:16 PM Yoav Weiss <yoavweiss@google.com> wrote:

> Hey folks,
>
> We have an exciting WebPerf meeting for y'all tomorrow!
> On the agenda
> <https://docs.google.com/document/d/10dz_7QM5XCNsGeI63R864lF9gFqlqQD37B4q8Q46LMM/edit?pli=1#heading=h.56vqzgjka8q2> we
> have a discussion on Managed Components <https://managedcomponents.dev/>,
> and how the WG can potentially help with the effort to move more
> third-party work outside of the browser. If there's time left, we'll talk
> about timeOrigin as well as NEL and Reporting issues.
>
> See y'all there <https://meet.google.com/agz-fbji-spp?authuser=0&hs=122>!!
>
> Cheers :)
> Yoav
>

Received on Tuesday, 13 June 2023 20:23:20 UTC