Re: DOM Mutation Events Replacement: Findings from implementing "net-effect" projections from Olli Pettay on 2011-08-17 (public-webapps@w3.org from July to September 2011)

From: Olli Pettay <Olli.Pettay@helsinki.fi>
Date: Wed, 17 Aug 2011 13:17:49 +0300
To: Rafael Weinstein <rafaelw@google.com>
CC: Webapps WG <public-webapps@w3.org>
Message-ID: <4E4B954D.2000803@helsinki.fi>
On 08/17/2011 04:54 AM, Rafael Weinstein wrote:
> TL;DR;
>
> 1) ObserveSubtree semantics doesn't provide a robust mechanism for
> observing a tree/fragment, and if we don't provide something more
> complete, libraries will likely register observers at every node in
> the document.

ModificationBatch approach should provide as much information as
current mutation events.



>
> 2) Not providing position information (e.g childlistIndex) in
> "ChildlistChanged" mutations means that the algorithmic complexity of
> computing whether/where nodes have moved to doesn't improve vs. having
> to bruce-force compute.

Yeah, I think we really to provide some position information.
Adding index is easy (although it can be slow in some implementations,
but that is implementation specific thing)
I'll add this to ModificationBatch.

>
> 3) Not providing "lastValue" for text changes (attribute, textNode,
> etc...) may adversely impact some use cases.

I agree.




>
> -----------
> Following up on points 6&  7 from
> http://lists.w3.org/Archives/Public/public-webapps/2011JulSep/0779.html
> ...
>
> I've prototyped the main "net-effect" projections that I believe
> various use cases will need. I've published to code to this public
> github repository:
>
> https://github.com/rafaelw/MutationProjections
>
> I encourage anyone interested to check my work and make corrections
> and/or suggestions, or create a branch that experiments in any
> direction they see fit.
>
> -----------
> Per point (3) in the email above, we've concluded that mutations
> should be delivered in batches, since any number of mutations can have
> taken place since an observer was notified.
>
> Also, though it varies, there's generally a hierarchy of expense for
> types of operations done from JS:
>
>    reads/writes on "pure" JS data<  DOM Reads<  DOM Writes<  External I/O
>
> Often, an application is willing to do a some amount of work of a less
> expensive operation to avoid doing any work of a more expensive
> operation.
>
> Thus, a mutation observer is highly motivated, for both correctness
> and performance, to answer the question:
>
>    What is the net-effect of some set of DOM changes that have taken
> place since I last got to run?
>
> -----------
> Depending on the use case, "what happened" likely means some
> combination of the following:
>
> 1) Reachability: What nodes were added to or removed from the document?
>
> 2) Matching: What nodes stopped or started "matching" a given pattern?
> This could be arbitrarily complex, but in the use cases discussed,
> this usually means a simple selector.
>
> 3) Parent changes: What nodes are now children of a new parent node?
>
> 4) Movement: What nodes moved to a new position in the document?
>
> 5) Value change: What character/text or attribute nodes changed value
> (for attributes, also, whether the attribute was added or removed)?
>
> Note that none of the above requires any mechanism of mutation
> observation. All can be computed given the opportunity to inspect the
> DOM at two points in time.
>
> However, without any information about what happened, the expense of
> answering each of the above questions is roughly proportional to the
> number of nodes in the document. All are roughly linear, except (4)
> Movement, which is roughly quadratic.
>
> It seems to me that a good goal for any mutation event replacement
> should be that the expense of answering the above projections be
> proportional the number of mutations which actually took place.
>
> I've implemented the "null case" (lacking any mutation information)
> for 1, 3&  4 here:
>
> https://github.com/rafaelw/MutationProjections/blob/master/projections.js#L39
>
> And the same projections, but taking mutation records to improve
> performance here:
>
> https://github.com/rafaelw/MutationProjections/blob/master/projections.js#L240
>
> -----------
> Observing sets of nodes
>
> In order to implement the above projections, I prototyped:
>
>    Document.observeOwnedNodes
>
> I'm not proposing this (yet), but I think it is conceptually the most
> general case (btw, to give credit to Jonas, he pointed out that this
> might be solution to the problem I'm about to describe, though I'm
> pretty sure he meant "sometime down the road").

I'm not quite sure what all the Document.observeOwnedNodes approach
tries to observe. I assume it should be observing everything, all the 
changes to child nodes and also changes to attributes.
(currently it handles only elements)
Is that really needed?


>
> The problem is that observeSubtree doesn't report anything about what
> happened to a node that was removed from the document, modified, then
> returned to the document. Consider the following case:
>
> -Register a subtreeObserver at the root of the document
> -Some "other" code removes an element, appends a new element to it,
> modifies an attribute, then returns the node to the document.
>
> Now, in order to properly compute the above projections, we are nearly
> back to the null case (matching becomes proportional the number of
> descendants of added/removed nodes that we heard about, but everything
> else returns to being proportional to the number of nodes in the
> document).

But is that a common enough case which the API needs to handle.
I would think that a script library which wants to handle such case,
can just use the API to observe "everything".

reading what is written below...

>
> You may consider this to be an esoteric case, but imagine the
> situation that library authors will be in. They will have two basic
> options:
>
> a) Use observeSubtree and explain to their users that they have to be
> careful never to modify something outside the tree.

Well, why would they need to be careful? If something is modified 
outside the tree, and then added to tree, they just handle that whole
change at once. And that is how mutation events work atm, so I assume
scripts can handle this case.
Also, things like execCommand() or contentEditable may create subtrees
outside document and then insert a subtree to document. Script will need
to handle that case (I'm pretty sure we don't want to fire mutation 
notifications when browser engine is constructing some subtree to be
inserted later to document).
Or one can just create elements/subtree in one document and
insert it to another document.


> b) Simply register an observer at all nodes in the document.
>
> Picking the second option means that my library works properly and I
> don't need my users to be careful. It burns plenty of memory,
> but
> library authors aren't well known for being conservative in that
> regard -- especially since its pretty hard for them to quantify the
> memory impact of anything.

b) + ModificationBatch (or other similar approach) isn't really that
bad mem-vise, if someone really needs to observe everything.


>
> The danger is that if we don't offer a better mechanism, we'll create
> a common situation that every observer in use has been registered at
> every node in the document.
Why would that be common?


>
> -----------
> Complexity impact of providing/witholding various data:
>
> Assuming that projections can observe *at least* the set of nodes in
> the document, even if they are temporarily removed, we can focus on
> the complexity impact of including/excluding various bits of data in
> mutation notifications.
>
> The good news is that all recent proposals *mostly* meet the goal for
> the above projections of being roughly proportional to the number of
> mutations. There's one big exception and then a few things to note:
>
> -----------
> Movement:
>
> Jonas' most recent proposal pre-summarizes what has happened to a
> given element (keeping a running set of added&  removed nodes). Doing
> so discards position information about where nodes were added or
> removed. Having position information makes the complexity of computing
> movement roughly quadratic WRT the number of mutations that occurred:
>
> https://github.com/rafaelw/MutationProjections/blob/master/projections.js#L487
>
> Lacking it means that it continues to be roughly quadratic WRT the
> number of nodes in the document:
>
> https://github.com/rafaelw/MutationProjections/blob/master/projections.js#L163
>
> -----------
> Value Change&  lastValue
>
> Excluding lastValue impacts (2) Matching in that wasMatching() cannot
> be directly computed. The projection must store the set of nodes which
> have already been matched. The danger here probably isn't the amount
> of memory this burns as it is that it creates another mechanism by
> which pages can easily prevent garbage collection of (otherwise)
> unreferenced nodes.
>
> Excluding lastValue impacts (5) Value change in that the old value
> must have been saved in order to prevent reporting that the value
> changed when it did not. This is probably only bad for use cases that
> care about all or most text nodes in the document (e.g. tree
> mirroring, possibly editing).
>
> -----------
> Reachability&  wasInDocument.
>
> In all current proposals, computing wasReachable and isReachable for a
> given node is O(log n) in the number of nodes in the tree. We could
> make this constant-time by simply including a flag with
> ChildistChanged that indicates whether the target node wasInDocument
> at the time of the mutation.
>
> Doing so, however, probably only makes sense if we also expose a
> constant-time method of asking any node whether it isInDocument(),
node.isInDocument would be useful anyway, and IIRC I've asked Ms2ger to
think about adding it to Web DOM Core.


-Olli



> as
> both isReachable and wasReachable() are needed to compute which nodes
> were added/removed from the document:
>
> https://github.com/rafaelw/MutationProjections/blob/master/projections.js#L315
>
>
Received on Wednesday, 17 August 2011 10:18:17 UTC