Re: WebPerfWG design call - July 10th @ 10am PST

Heya,

Minutes from yesterday's call are now available
<https://docs.google.com/document/d/e/2PACX-1vQilbA9sQwFf4j2-VSh81m_8EkPuMlj_b7mEC4Y8NVqaVpja2mP8k1J3z79p4jrzI07VYgMRcVhXxDN/pub>,
along with the video <https://www.youtube.com/watch?v=qQnZT7SceIk>.

Copying the minutes here for safekeeping.

Cheers :)
Yoav

WebPerfWG design call - July 10th 2019
Participants

Alex Christensen, Maxime Villancourt, Yoav Weiss, Todd Reifsteck, Benjamin
De Kosnik, Andrew Comminos, Will Hawkins, Ryosuke Niwa, Phil Walton, Tim
Dresser, Nicolás Peña, Nic Jansma, Eric Lawrence
Admin

Next meeting: Aug 1st 11am PST
Gargantua, the Working Group dashboard project - Philippe

   - Demo: https://www.w3.org/2019/07/webperf-example.html
   - Philippe


   - We have a lot of data, but you have to know where to look
   - Gargantua is about exposing this data
   - Needs


   - Public page for the rest of the world to see what we’re working on
   - A page that’s useful for us: GH issues, spec status
   - For the chairs - with more details


   - Underlying data:
   https://www.w3.org/PM/Groups/browse.html?gid=45211&query=active-specifications
   - Also links to wpt.fyi, but the way a directory is linked to a spec is
   ad hoc. Trying to make it clearer
   - Done on the client, so risks DoSing the related servers


   - Plan to encourage folks at the Hackathon to optimize this


   - We can play with it with an API key. W3C accounts can also create
   their own API key
   - Continuing to iterate on this; please surface data needs
   - Nicolás: way to see issues?
   - Philippe: yes, with a 24h delay, because we crawl GH every night to
   avoid hitting rate limits
   - Todd: Wanted to show numbers and link to GH, rather than displaying
   everything
   - Yoav: possible to generate on the server?
   - Philippe: yeah, talked to our system folks
   - Todd: Looks really good. Only question is how fast can we have it?
   - Philippe: Code is on GH: https://github.com/w3c/gargantua


   - Created our own framework because it lazy-loads the data, in order to
   avoid overwhelming the browser


   - Next steps - Hackathon before TPAC


   - All the tool folks will be there

Upload Compression
<https://docs.google.com/presentation/d/1sJns8zk2J5mRWqYtWe5EynOhptcmHqF25TlwSWq-0aQ/edit?usp=sharing>
- Andrew

   - Andrew: Idea was informally floated when we talked about adding gzip to
   JS profiling


   - Folks suggested breaking out the compression format
   - An interface for compression algorithms: gzip, brotli, zstd
   - Use-cases for complex apps as well as analytics vendors


   - FB saw large reliability improvements from client-side compression


   - Why not a polyfill?


   - WASM was too large and not as fast as native. Requires data copying
   - Also UAs already ship this, so why not


   - Requirements


   - Async
   - Uniform interface, not codec specific
   - Avoid extra allocations and copies
   - Throttle for low CPU devices - in control of the developer


   - Ben: Way to query what compression algorithms are available?
   - Andrew: yeah should be feature detectable. getCompressorType or
   something like that makes sense.
   - Andrew: Proposed API includes entry points through Streams and
   ArrayBuffer. To avoid extra allocations, the ArrayBuffer input transfers
   outside the script’s control, so we could reuse the same input allocation,
   which is useful regardless of codec.
   - Philippe: Are you aware of past proposals on WICG?
   - Andrew: didn’t know but will look into them.
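
A rough sketch of the API shape described above. All of the names here
(Compressor, supports, compress) are hypothetical placeholders from the
discussion, not a specced interface; it only illustrates feature detection
plus the ArrayBuffer entry point with a transferred input:

    // Hypothetical sketch; 'Compressor' and its methods are placeholders.
    async function compressPayload(payload) {
      if (typeof Compressor === 'undefined' || !Compressor.supports('gzip')) {
        return null; // feature-detect and fall back to sending uncompressed
      }
      const compressor = new Compressor('gzip');
      // ArrayBuffer entry point: the input buffer is transferred (detached),
      // letting the engine reuse the allocation instead of copying it.
      const input = new TextEncoder().encode(JSON.stringify(payload)).buffer;
      return compressor.compress(input); // resolves with compressed bytes
    }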


   - Past discussions:


   - https://discourse.wicg.io/t/a-zip-api-in-the-browser/14
   - https://discourse.wicg.io/t/need-for-gzip-deflate-api/1237
   - https://discourse.wicg.io/t/compression-for-form-submission-request-bodies-from-the-browser/2838
   - https://discourse.wicg.io/t/interest-and-use-cases-for-transcoding-streams/3027


   - Philippe: also, did you think about video compression as well?
   - Andrew: We could… not familiar with WebRTC, but this could be
   leveraged for something like that
   - Todd: The dependency here is that not all APIs currently accept
   streams, but they’d need to support them to benefit from this
   - Andrew: There’s also an ArrayBuffer entry point. We might also add a
   DOMString entry point in the future
   - Todd: Want to avoid web sites doing wasteful copies. Streams help devs
   do the right thing.
   - Andrew: Right. That was the motivation behind making the ArrayBuffer
   transferable. Devs would need to deliberately copy to shoot themselves in
   the foot.
   - Todd: Like it conceptually, but concerned it will cause footguns in
   the real world.
   - Andrew: Maybe we should name it to make it explicit that the
   ArrayBuffer is transferable.
   - Ryosuke: Primary use case is to send this back to the server, why
   isn’t this an option in Fetch?
   - Andrew: Wanted to make it general purpose. There are cases where you
   want to compress localStorage, etc. Streams make it very easy to compose,
   including into Fetch.
   - Ryosuke: But now everything is single threaded, you have to go back to
   the main thread
   - Andrew: Not necessarily. You may need a sync spot on the main thread
   to spawn it off.
   - Yoav: why?
   - Andrew: Actually, readable streams should enable passing compressed
   things to fetch() without going through the main thread
   - Ryosuke: Oh, so in the example you’re waiting, but it’s not really
   needed.
   - Andrew: Yeah, we’d only need to await in the ArrayBuffer case
   - Ryosuke: still better to do this in Fetch
   - Yoav: We have use-cases to have that separately - enables flexibility
   on client side storage as well as mixing payloads that require different
   algorithms
   - Ryosuke: I see your point. But how does the decoder know how to
   decode? Headers?
   - Andrew: The server will have to interpret it and decode on the server
   side. No headers involved. So the server would need to always expect
   certain codecs, or insert headers manually. Doesn’t seem far-fetched that
   you’d want to always assume compressed data.
   - Ryosuke: That assumes you’re writing your own servers. Standard servers
   won’t help you. You’d need to manually deal with decoding.
   - Yoav: Sure, but no standard server right now decompresses the payload
   (e.g. if you use content-encoding on the upload).
   - Nic: You can convince Apache, but it’s not on by default
   - Yoav: Yeah, so it requires some negotiation between the client and the
   server. Whereas here if the developer knows the backend collecting the
   data, handling the payload is their responsibility.
   - Nic: Want to add my support for this. We use a lot of CPU and ship code
   to compress RT and UT data. The browser can do that better: either a
   stream-based approach or a Fetch option
   - Todd: What Ryosuke is saying is that servers will eventually support
   this, so there’s value in exploring it. No reason for that not to be the
   default.
   - Andrew: There’s value in exploring fetch as a separate topic. Useful
   to have both.
   - Todd: Useful to have the ability to compress separately. Maybe also
   useful to have a unique, standardized and reusable string that goes with
   the data and informs the decoder how it was compressed (regardless of
   whether the decoder is on the server, in localStorage after version
   changes, etc.)
   - Yoav: that seems useful
   - Andrew: Sounds good. Are there many use cases where one would vary the
   compression format?
   - Yoav: There are cases where you can invest more processing power in
   some parts of the stream (e.g. brotli 11), but other parts you want to
   compress on the fly. But could be tackled in separate streams that are
   combined.
   - Andrew: Yeah, making it general enough should be good enough
   - Ryosuke: some compression algorithms may be more efficient than
   others. A single algo does not cover everything. Maybe need something like
   Web Media where there’s a preferred algorithm based on compression
   efficiency, power efficiency.
   - Andrew: Sounds good. May want to provide abstractions across
   algorithms, or to have each codec take its own “bag of flags”.
   - Yoav: bag of flags is probably better, as there are many knobs
   involved beyond compression level. E.g. in order to implement SSH with gzip
   you need specific flushing parameters.
   - Yoav: Also, dictionaries. Brotli and gzip can have external
   dictionaries that can help reduce compressed data for specific formats and
   use cases. Downloading a dictionary can help upload transfers.
   - Andrew: Definitely thought of external dictionaries. Thought it would
   be tackled by “bag of flags”. Not sure how to persist them.
   - Yoav: Yeah, I thought of user land persistency.
   - Andrew: Anything beyond “bag of flags” for this use-case?
   - Yoav: Not sure. It’d be a large bag...
   - Phil: For the RUM/analytics case, making it async would be a burden
   and not usable in beforeunload. An option to Fetch would be better along
   with keep-alive, letting the browser do that work off-main-thread.
   - Yoav: There’s a real extensible web play here, we need to provide the
   primitives *and* integrate them into fetch()
   - Ben: around compression levels, we would prefer to pick the algorithm
   and its parameters.
   - Andrew: bag of flags it is!
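
To make the stream-to-fetch() composition discussed above concrete, a minimal
sketch follows. The CompressionStream name is an assumption here, standing in
for the proposed gzip transform; the /collect endpoint is made up, and the
backend is assumed to always expect gzip-compressed bodies, since no
Content-Encoding negotiation happens on uploads:

    // Sketch: compress an analytics payload on the client before upload.
    // 'CompressionStream' is assumed as the gzip transform; '/collect' is a
    // made-up endpoint whose server always expects gzip-compressed bodies.
    async function sendCompressed(payload) {
      const compressed = new Blob([JSON.stringify(payload)])
        .stream()
        .pipeThrough(new CompressionStream('gzip'));
      const body = await new Response(compressed).arrayBuffer();
      return fetch('/collect', {
        method: 'POST',
        headers: { 'Content-Type': 'application/octet-stream' },
        body,
        keepalive: true, // lets the beacon-style upload outlive the page
      });
    }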

Element Timing and text aggregation - Nicolás

   - https://docs.google.com/document/d/1WWFxpuLpbMs3Jri1b2Y6O0rNldv0FAFcMDe-IGGuQ54/edit
   - Nicolás: wanted to talk about the way we tackled text aggregation


   - High level problem: ET wants to expose “important” text content, but
   since text nodes are not elements, we need to specify which element text
   nodes belong to
   - Considered several approaches and ended up with aggregating a text
   node to the nearest containing block ancestor


   - Notion of depth: arbitrary, links increase depth, but shouldn’t have
   that impact
   - Top level elements: high maintenance
   - Phrasing content: could work, but harder to implement, so containing
   block was chosen


   - *shows examples*


   - Todd: notions are standardized, right?
   - Tim: Yeah, also devs know block-level elements way more than phrasing
   content
   - Will: If a dev wants to learn something about a specific piece of text,
   can they wrap it and annotate it? Is that the idea?
   - Nicolás: wrapping in a block-level element would change layout
   - Tim: we should think about that, that’s a great question
   - Will: want to see if devs can specify an override
   - Yoav: would a span work?
   - Tim: currently it won’t work, we tried to avoid reporting things
   multiple times, but maybe we should. We need to figure it out
   - Nicolás: I’d think that the problem is often the other way around.
   Text nodes are generally very small. Not sure if this is a real problem,
   but worth looking into it
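
For reference, a short sketch of how annotated text could surface through
Element Timing under the aggregation approach above: text nodes roll up to
the nearest containing-block ancestor, and that ancestor is what the observer
entry points at (the attribute value "article-intro" is just an example):

    <!-- Text nodes inside this paragraph aggregate to it -->
    <p elementtiming="article-intro">Important opening text…</p>

    <script>
      // Observe Element Timing entries; for text, entry.element points at
      // the containing block the text nodes were aggregated to.
      new PerformanceObserver((list) => {
        for (const entry of list.getEntries()) {
          console.log(entry.identifier, entry.renderTime, entry.element);
        }
      }).observe({ type: 'element', buffered: true });
    </script>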



On Mon, Jul 8, 2019 at 11:37 AM Yoav Weiss <yoav@yoav.ws> wrote:

> Hey all,
>
> Please join us <https://meet.google.com/agz-fbji-spp> on this week's
> design call where we'll talk about
> <https://docs.google.com/document/d/10dz_7QM5XCNsGeI63R864lF9gFqlqQD37B4q8Q46LMM/edit?pli=1#heading=h.gw8uh4vpahqn>
> :
>
>    - Gargantua, the Working Group dashboard project (plh@w3.org)
>    - Upload Compression (acomminos@fb.com)
>    - Element Timing and text aggregation (npm@google.com)
>
> As a reminder, the call will be recorded and posted online.
>
> See y'all there! :)
> Yoav
>
