Re: DOM Mutation Events Replacement: Findings from implementing "net-effect" projections from Rafael Weinstein on 2011-08-17 (public-webapps@w3.org from July to September 2011)

From: Rafael Weinstein <rafaelw@google.com>
Date: Wed, 17 Aug 2011 05:03:33 -0700
To: olli@pettay.fi
Cc: Webapps WG <public-webapps@w3.org>
Message-ID: <CABMdHiRPC70rV5us-ywDX2vEyP_yczy72jW-T8312k=yLqdaeA@mail.gmail.com>
On Wed, Aug 17, 2011 at 3:17 AM, Olli Pettay <Olli.Pettay@helsinki.fi> wrote:
> On 08/17/2011 04:54 AM, Rafael Weinstein wrote:
>>
>> TL;DR;
>>
>> 1) ObserveSubtree semantics doesn't provide a robust mechanism for
>> observing a tree/fragment, and if we don't provide something more
>> complete, libraries will likely register observers at every node in
>> the document.
>
> ModificationBatch approach should provide as much information as
> current mutation events.
>
>
>
>>
>> 2) Not providing position information (e.g childlistIndex) in
>> "ChildlistChanged" mutations means that the algorithmic complexity of
>> computing whether/where nodes have moved to doesn't improve vs. having
>> to bruce-force compute.
>
> Yeah, I think we really to provide some position information.
> Adding index is easy (although it can be slow in some implementations,
> but that is implementation specific thing)
> I'll add this to ModificationBatch.
>
>>
>> 3) Not providing "lastValue" for text changes (attribute, textNode,
>> etc...) may adversely impact some use cases.
>
> I agree.
>
>
>
>
>>
>> -----------
>> Following up on points 6&  7 from
>> http://lists.w3.org/Archives/Public/public-webapps/2011JulSep/0779.html
>> ...
>>
>> I've prototyped the main "net-effect" projections that I believe
>> various use cases will need. I've published to code to this public
>> github repository:
>>
>> https://github.com/rafaelw/MutationProjections
>>
>> I encourage anyone interested to check my work and make corrections
>> and/or suggestions, or create a branch that experiments in any
>> direction they see fit.
>>
>> -----------
>> Per point (3) in the email above, we've concluded that mutations
>> should be delivered in batches, since any number of mutations can have
>> taken place since an observer was notified.
>>
>> Also, though it varies, there's generally a hierarchy of expense for
>> types of operations done from JS:
>>
>>   reads/writes on "pure" JS data<  DOM Reads<  DOM Writes<  External I/O
>>
>> Often, an application is willing to do a some amount of work of a less
>> expensive operation to avoid doing any work of a more expensive
>> operation.
>>
>> Thus, a mutation observer is highly motivated, for both correctness
>> and performance, to answer the question:
>>
>>   What is the net-effect of some set of DOM changes that have taken
>> place since I last got to run?
>>
>> -----------
>> Depending on the use case, "what happened" likely means some
>> combination of the following:
>>
>> 1) Reachability: What nodes were added to or removed from the document?
>>
>> 2) Matching: What nodes stopped or started "matching" a given pattern?
>> This could be arbitrarily complex, but in the use cases discussed,
>> this usually means a simple selector.
>>
>> 3) Parent changes: What nodes are now children of a new parent node?
>>
>> 4) Movement: What nodes moved to a new position in the document?
>>
>> 5) Value change: What character/text or attribute nodes changed value
>> (for attributes, also, whether the attribute was added or removed)?
>>
>> Note that none of the above requires any mechanism of mutation
>> observation. All can be computed given the opportunity to inspect the
>> DOM at two points in time.
>>
>> However, without any information about what happened, the expense of
>> answering each of the above questions is roughly proportional to the
>> number of nodes in the document. All are roughly linear, except (4)
>> Movement, which is roughly quadratic.
>>
>> It seems to me that a good goal for any mutation event replacement
>> should be that the expense of answering the above projections be
>> proportional the number of mutations which actually took place.
>>
>> I've implemented the "null case" (lacking any mutation information)
>> for 1, 3&  4 here:
>>
>>
>> https://github.com/rafaelw/MutationProjections/blob/master/projections.js#L39
>>
>> And the same projections, but taking mutation records to improve
>> performance here:
>>
>>
>> https://github.com/rafaelw/MutationProjections/blob/master/projections.js#L240
>>
>> -----------
>> Observing sets of nodes
>>
>> In order to implement the above projections, I prototyped:
>>
>>   Document.observeOwnedNodes
>>
>> I'm not proposing this (yet), but I think it is conceptually the most
>> general case (btw, to give credit to Jonas, he pointed out that this
>> might be solution to the problem I'm about to describe, though I'm
>> pretty sure he meant "sometime down the road").
>
> I'm not quite sure what all the Document.observeOwnedNodes approach
> tries to observe. I assume it should be observing everything, all the
> changes to child nodes and also changes to attributes.
> (currently it handles only elements)
> Is that really needed?

It's not fully implemented in the shim I created, in that it only
reports ChildlistChanged mutations.

It's purpose is to provide a single registration that allows you to be
notified of all changes that happen to all nodes owned by this
document, regardless of whether they are current in the document.

>
>
>>
>> The problem is that observeSubtree doesn't report anything about what
>> happened to a node that was removed from the document, modified, then
>> returned to the document. Consider the following case:
>>
>> -Register a subtreeObserver at the root of the document
>> -Some "other" code removes an element, appends a new element to it,
>> modifies an attribute, then returns the node to the document.
>>
>> Now, in order to properly compute the above projections, we are nearly
>> back to the null case (matching becomes proportional the number of
>> descendants of added/removed nodes that we heard about, but everything
>> else returns to being proportional to the number of nodes in the
>> document).
>
> But is that a common enough case which the API needs to handle.
> I would think that a script library which wants to handle such case,
> can just use the API to observe "everything".
>
> reading what is written below...
>
>>
>> You may consider this to be an esoteric case, but imagine the
>> situation that library authors will be in. They will have two basic
>> options:
>>
>> a) Use observeSubtree and explain to their users that they have to be
>> careful never to modify something outside the tree.
>
> Well, why would they need to be careful? If something is modified outside
> the tree, and then added to tree, they just handle that whole
> change at once. And that is how mutation events work atm, so I assume
> scripts can handle this case.

If it's possible to properly implement getAdded() & getRemoved() (i.e.
what elements have entered or left the document) with a single
"subtree observation" at the document root, I don't know how.

This is where the tests can help:

https://github.com/rafaelw/MutationProjections/blob/master/test.html

I'm happy to be shown what I've missed. You can do so by implementing
getAdded()/getRemoved() which operate over a sequence of mutation
records that doesn't contain records for changes that occurred outside
the document.

> Also, things like execCommand() or contentEditable may create subtrees
> outside document and then insert a subtree to document. Script will need
> to handle that case (I'm pretty sure we don't want to fire mutation
> notifications when browser engine is constructing some subtree to be
> inserted later to document).
> Or one can just create elements/subtree in one document and
> insert it to another document.
>
>
>> b) Simply register an observer at all nodes in the document.
>>
>> Picking the second option means that my library works properly and I
>> don't need my users to be careful. It burns plenty of memory,
>> but
>> library authors aren't well known for being conservative in that
>> regard -- especially since its pretty hard for them to quantify the
>> memory impact of anything.
>
> b) + ModificationBatch (or other similar approach) isn't really that
> bad mem-vise, if someone really needs to observe everything.
>
>
>>
>> The danger is that if we don't offer a better mechanism, we'll create
>> a common situation that every observer in use has been registered at
>> every node in the document.
>
> Why would that be common?
>
>
>>
>> -----------
>> Complexity impact of providing/witholding various data:
>>
>> Assuming that projections can observe *at least* the set of nodes in
>> the document, even if they are temporarily removed, we can focus on
>> the complexity impact of including/excluding various bits of data in
>> mutation notifications.
>>
>> The good news is that all recent proposals *mostly* meet the goal for
>> the above projections of being roughly proportional to the number of
>> mutations. There's one big exception and then a few things to note:
>>
>> -----------
>> Movement:
>>
>> Jonas' most recent proposal pre-summarizes what has happened to a
>> given element (keeping a running set of added&  removed nodes). Doing
>> so discards position information about where nodes were added or
>> removed. Having position information makes the complexity of computing
>> movement roughly quadratic WRT the number of mutations that occurred:
>>
>>
>> https://github.com/rafaelw/MutationProjections/blob/master/projections.js#L487
>>
>> Lacking it means that it continues to be roughly quadratic WRT the
>> number of nodes in the document:
>>
>>
>> https://github.com/rafaelw/MutationProjections/blob/master/projections.js#L163
>>
>> -----------
>> Value Change&  lastValue
>>
>> Excluding lastValue impacts (2) Matching in that wasMatching() cannot
>> be directly computed. The projection must store the set of nodes which
>> have already been matched. The danger here probably isn't the amount
>> of memory this burns as it is that it creates another mechanism by
>> which pages can easily prevent garbage collection of (otherwise)
>> unreferenced nodes.
>>
>> Excluding lastValue impacts (5) Value change in that the old value
>> must have been saved in order to prevent reporting that the value
>> changed when it did not. This is probably only bad for use cases that
>> care about all or most text nodes in the document (e.g. tree
>> mirroring, possibly editing).
>>
>> -----------
>> Reachability&  wasInDocument.
>>
>> In all current proposals, computing wasReachable and isReachable for a
>> given node is O(log n) in the number of nodes in the tree. We could
>> make this constant-time by simply including a flag with
>> ChildistChanged that indicates whether the target node wasInDocument
>> at the time of the mutation.
>>
>> Doing so, however, probably only makes sense if we also expose a
>> constant-time method of asking any node whether it isInDocument(),
>
> node.isInDocument would be useful anyway, and IIRC I've asked Ms2ger to
> think about adding it to Web DOM Core.
>
>
> -Olli
>
>
>
>> as
>> both isReachable and wasReachable() are needed to compute which nodes
>> were added/removed from the document:
>>
>>
>> https://github.com/rafaelw/MutationProjections/blob/master/projections.js#L315
>>
>>
>
>
Received on Wednesday, 17 August 2011 12:04:01 UTC