Re: [whatwg/dom] Proposal: DOMChangeList (#270) from Yehuda Katz on 2016-06-20 (public-webapps-github@w3.org from June 2016)

From: Yehuda Katz <notifications@github.com>
Date: Mon, 20 Jun 2016 11:41:37 -0700
To: whatwg/dom <dom@noreply.github.com>
Cc:
Message-ID: <whatwg/dom/issues/270/227231409@github.com>
> Great to see this! I'm coming at this in comparison to Dru's proposal at https://github.com/drufball/async-append/blob/master/EXPLAINER.md

I also reviewed that in detail before I began work on this proposal.

> As far as I can tell this proposal focuses mostly on trying to minimize allocations

That is certainly an important characteristic of this proposal. It also has several other goals, related both to efficiency and to expanding capabilities. (One example is supporting the superset of the HTML syntax and the DOM API, which I talk about in the proposal; another is allowing the end-developer to directly control the staging of work across threads by using workers.)

> (but see below)

I'll reply inline.

> and on providing a proxy API that can be used in a worker

I don't think you should see ChangeList as a "proxy", but rather as a collection of instructions that can be freely transferred until it's applied.
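
As a rough illustration of the "collection of instructions" framing, here is a minimal TypeScript sketch. The type and class names (`Op`, `ChangeListSketch`) and the exact set of operations are hypothetical and are not taken from the proposal; the JSON round-trip only stands in for moving the data across a worker boundary.

```typescript
// Hypothetical sketch: a change list is plain data, not a live proxy.
// These names are illustrative assumptions, not the proposal's API.
type Op =
  | { kind: "openElement"; name: string }
  | { kind: "setAttribute"; name: string; value: string }
  | { kind: "appendText"; text: string }
  | { kind: "closeElement" };

class ChangeListSketch {
  readonly ops: Op[] = [];

  openElement(name: string): this {
    this.ops.push({ kind: "openElement", name });
    return this;
  }

  setAttribute(name: string, value: string): this {
    this.ops.push({ kind: "setAttribute", name, value });
    return this;
  }

  appendText(text: string): this {
    this.ops.push({ kind: "appendText", text });
    return this;
  }

  closeElement(): this {
    this.ops.push({ kind: "closeElement" });
    return this;
  }
}

// Recording touches no DOM; it only appends instructions to an array.
const list = new ChangeListSketch()
  .openElement("p")
  .setAttribute("class", "note")
  .appendText("hello")
  .closeElement();

// A JSON round-trip stands in for sending the instructions to another thread.
// Because the ops are plain data, nothing is lost in the copy.
const transferred: Op[] = JSON.parse(JSON.stringify(list.ops));
```

Nothing in the recorded list refers back to a document, which is what lets it be copied or handed off freely until something finally applies it.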

> instead of focusing on allowing chunking and scheduling of parsing/style/layout/paint as Dru's proposal does

Since the application of a `ChangeList` is asynchronous, the benefit to the browser's scheduling algorithm is more or less equivalent to that of Dru's proposal. `ChangeList` supports many more operations, and allows them to be performed in a batch, which should further increase the scope of the benefit to authors.

> I can't tell at first glance whether it can serve both purposes, or whether its design prevents that. 

This is something I spent a lot of time talking to @lukewagner, @smaug--- and @annevk about, and our intent was for this API to serve both purposes. As I said in the proposal: "If there is some reason that a concrete implementation of this design might not be able to accomplish these goals, it's almost certainly something we should discuss."

> There's a brief paragraph under "Is this actually faster?" number 3 that indicates it is intended to be compatible, but I'll defer to the Blink engineers.

I'm very eager to hear from them about this. I'd be very happy to have a call with anyone interested in this area as well.

> I think the proposal might have a misapprehension about how DOM nodes and wrappers are related. Remember that wrappers are not created until they are accessed.

This particular constraint was one of the most important aspects of this design, and I spent a lot of time thinking about it and discussing the details with others.

The intent of `NodeToken` is that it can be represented as a simple internal value (e.g. an integer representing the n'th object to be allocated in this transaction) but that the nature of that representation is not directly exposed to users. The only way in which these values can be reified into `Node`s is through `AppliedChanges`. I've spoken with some engineers at Mozilla about how to structure the API to allow it to work as a simple value, and would love to discuss it further with engineers who work on Blink.
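
To sketch what "a simple internal value" could look like, the TypeScript below models a token as the index of the n'th object created while recording, with nodes allocated only when the instructions are applied. Everything here is an assumption made for illustration: `ToyNode` stands in for a real DOM `Node` so the example runs without a browser, and `Recorder`, `Instruction` and `applyInstructions` are hypothetical names, not the proposal's `NodeToken` or `AppliedChanges` API.

```typescript
// Hypothetical sketch: tokens are plain integers until changes are applied.
// ToyNode is a stand-in for a DOM Node; it is not a browser type.
interface ToyNode {
  name: string;
  children: ToyNode[];
}

// "The n'th object to be allocated in this transaction."
type Token = number;

type Instruction =
  | { kind: "create"; name: string } // implicitly allocates the next token
  | { kind: "append"; parent: Token; child: Token };

class Recorder {
  readonly instructions: Instruction[] = [];
  private next = 0;

  // Returns a token; no node object is allocated at this point.
  create(name: string): Token {
    this.instructions.push({ kind: "create", name });
    return this.next++;
  }

  append(parent: Token, child: Token): void {
    this.instructions.push({ kind: "append", parent, child });
  }
}

// Applying is the only step that allocates nodes.
// The returned array maps each token (its index) to the node it became.
function applyInstructions(instructions: Instruction[]): ToyNode[] {
  const nodes: ToyNode[] = [];
  for (const ins of instructions) {
    if (ins.kind === "create") {
      nodes.push({ name: ins.name, children: [] });
    } else {
      nodes[ins.parent].children.push(nodes[ins.child]);
    }
  }
  return nodes;
}

const rec = new Recorder();
const ul = rec.create("ul");
const li = rec.create("li");
rec.append(ul, li);

// Only after applying can a token be exchanged for a node.
const applied = applyInstructions(rec.instructions);
const ulNode = applied[ul];
```

In this toy version the token is visibly a number; the point of the design described above is that a real API would keep that representation hidden from script.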

> Looking at the example code seems to bear this out; counting wrapper allocations it seems to have the same amount as normal DOM manipulation code. (And remember a wrapper is just a pointer; a wrapper to a NodeToken and a wrapper to a Node are both the same cost.) If you assume the DOM nodes will eventually exist anyway, as Yehuda discusses, Yehuda's proposal actually seems to have more allocations: NodeToken backing + NodeToken wrapper + Node backing, vs. Dru's proposal which has Node wrapper + Node backing.

> Both proposals make it clear and less error prone to apply a sequence of DOM operations at once, through similar queue-then-commit systems.

> Both support the full gamut of mutations to Elements and Nodes, with Dru's doing so by just allowing you to use those normal DOM APIs and Yehuda's by creating new versions of (most of) them that operate on analogous data structures.

It looks like [Dru's proposal](https://github.com/drufball/async-append/blob/master/EXPLAINER.md) creates several new async APIs for existing operations. It also postulates a "wrapper" that could be used with the regular APIs. I understood Dru's proposal as an early exploration with several different possible directions; can you flesh out which of those directions your comments are talking about?

> Both support the union of the trees that can be produced using the HTML parser and using the DOM API, Dru's by using the HTML parser directly (innerHTML/outerHTML) and Yehuda's presumably by e.g. allowing openElement to take more element names than createElement does (although I couldn't find where this is specified). We could of course try to expand createElement again; nobody's had the motivation so far, but maybe this will be the last straw.

From my original statement about the union of trees: "the HTML parser supports more tag names, while the DOM API supports more trees, such as custom elements nested inside a table."

DOMChangeList also supports using the HTML parser, but the HTML parser limits which nodes can be made children of certain elements. We could perhaps fix `createElement` to support the full set of element names supported by the HTML parser, and that would address this benefit of the ChangeList proposal.

> To me it feels like an empirical question whether the gains from building the parallel DOM-like tree in the worker instead of building a DOM tree in the main thread outweigh the costs of transferring it. There's also the empirical question of whether building a DOM tree in the main thread actually takes long enough to cause jank;

I have heard this thinking before and I think it might misapprehend the reasons people want to build DOM trees and calculate DOM mutations in a worker.

It's not so much about shrinking the cost of the DOM work as about all of the JavaScript work that is necessary to calculate the needed mutations. In practice, a lot of application work is done on the UI thread simply because of the close proximity of the DOM APIs.
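
As a small, hypothetical example of the kind of JavaScript work meant here, the TypeScript function below computes a list of mutations from two versions of application state. It needs no DOM access, so under the assumptions of this sketch it could run inside a worker and hand only its result to the UI thread. The `Mutation` shape and the deliberately naive index-by-index comparison are illustrative choices, not part of the proposal.

```typescript
// Hypothetical mutation record; not part of the proposal.
type Mutation =
  | { kind: "setText"; index: number; text: string }
  | { kind: "append"; text: string }
  | { kind: "removeLast" };

// Pure function: it reads two arrays and returns data, touching no DOM.
function computeMutations(prev: string[], next: string[]): Mutation[] {
  const out: Mutation[] = [];
  const shared = Math.min(prev.length, next.length);

  // Items present in both versions: update only where the text changed.
  for (let i = 0; i < shared; i++) {
    if (prev[i] !== next[i]) {
      out.push({ kind: "setText", index: i, text: next[i] });
    }
  }
  // Items only in the new version are appended in order.
  for (let i = shared; i < next.length; i++) {
    out.push({ kind: "append", text: next[i] });
  }
  // Items only in the old version are removed from the end.
  for (let i = shared; i < prev.length; i++) {
    out.push({ kind: "removeLast" });
  }
  return out;
}

const mutations = computeMutations(["a", "b", "c"], ["a", "x"]);
```

Real applications do far more work than this to decide what should change, which is the cost being discussed: the comparison itself, not the handful of DOM calls that result from it.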

> It is of course possible to implement some of this strategy as a user-space library, but not all of it. In user-space, applying a change set is inherently single-threaded, for example. The process of deserializing a transferred ChangeList would have to choose between interleaving the DOM operations or deserializing them into an intermediate data structure, both of which would introduce new costs that the native implementations would not need.
>
> And that's all assuming that a user-space library has enough primitives to do an efficient job at the limit. The basic idea behind this API is to provide a low-level primitive that libraries can use to build up different useful libraries with different tradeoffs, and start lower level than the current set of primitives.

---

There are many small paper-cuts that the ChangeList proposal addresses in one, fairly contained proposal. It attempts to create an API that, when used, has noticeably fewer common gotchas for web developers, and does not require library, framework or application developers to reverse engineer which APIs produce hazards, and keep that knowledge up to date as implementations change.

I am especially interested in feedback from implementors that there is something about this proposal that is difficult to implement, or that would make reasonable initial implementations slower than the equivalent use of DOM APIs. Any suggested changes to this proposal that would improve 

---
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/dom/issues/270#issuecomment-227231409
Received on Monday, 20 June 2016 18:42:15 UTC