- From: Aryeh Gregor <Simetrical+w3c@gmail.com>
- Date: Wed, 27 Jul 2011 16:47:43 -0400
(answering some old feedback on DOM Range that Hixie pointed me to)

On Tue, Jun 15, 2010 at 6:52 AM, Andrew Oakley <andrew at ado.is-a-geek.net> wrote:
> I've been trying to implement DOM Range but can't work out how ranges
> are supposed to work under mutation.

This should now be more or less fully defined in the DOM Range spec, with a pretty decent test suite:

http://html5.org/specs/dom-range.html#range-behavior-under-document-mutation

I wrote up the new definitions a couple of months ago. They aim to be both precise and compatible with browser behavior, which means they mostly match DOM 2 Range but differ in some respects.

> In the following examples I use *this* to indicate a range being deleted
> and slashes to indicate another range.
>
> Section 2.6 - Deleting Content with a Range gives the example of
>
> <FOO>X*Y<BAR>Z*W</BAR>Q</FOO> -> <FOO>X^<BAR>W</BAR>Q</FOO>
>
> Section 2.12.2 - Deletions says:
>
> "If a boundary-point of the original Range is within the content being
> deleted, then after the deletion it will be at the same position as the
> resulting boundary-point of the (now collapsed) Range used to delete the
> contents."

This is not what browsers do, and not what the new DOM Range spec requires. DOM 2 Range treats deletions as deletions of ranges, but browsers and DOM Range both treat deletions as node-by-node. deleteContents() specially modifies the range you call it on so that it's always collapsed, as is defined in detail:

http://html5.org/specs/dom-range.html#dom-range-deletecontents

Note how the last step is "Set the context object's start and end to (new node, new offset)", so the range you call the method on is changed differently from other ranges. If you have a range

<FOO>X[Y<BAR>Z]W</BAR>Q</FOO>

(using [] to denote the endpoints), then the algorithm works as follows:

* "If original start node is an ancestor container of original end node, set new node to original start node and new offset to original start offset."
  Original start node here is the Text node "XY", and original end node is the Text node "ZW". The former is neither equal to nor an ancestor of the latter, so this doesn't apply, and we go to the other branch.

* "Let reference node equal original start node."
  So reference node is now the Text node "XY".

* "While reference node's parent is not null and is not an ancestor container of original end node, set reference node to its parent."
  Reference node's parent is <FOO>, which is not null, but is an ancestor container of original end node. Thus we do nothing in this step.

* "Set new node to the parent of reference node, and new offset to one plus the index of reference node."
  Thus new node is <FOO>, and new offset is 1.

So the Range you delete will eventually collapse to

<FOO>X{}<BAR>W</BAR>Q</FOO>

Note that here I use curly braces instead of brackets, to indicate that the endpoint of the Range is in an Element node, not a Text node. The old DOM 2 Range standard is unclear on that point, but my spec matches what browsers do.

> We then have the example of:
>
> <P>ABCD *efgh The <EM>R*ange</EM> ijkl</P>
>              /            \
>
> Goes to
>
> <P>ABCD <EM>ange</EM> ijkl</P>
>           /    \

In the syntax I'm using, that's:

<P>ABCD [efgh T[[he <EM>R]ange]]</EM> ijkl</P>

where I use single brackets for the range being deleted and double brackets for the other, for lack of better syntax.

The new specification uses entirely different rules when the Range being deleted is different from the one being modified, as I noted. The deletion is treated as a sequence of separate mutations of individual nodes. In this case, deleteContents() will do the following:

1) Call deleteData() on the Text node "ABCD efgh The ", with offset 5 and count 9. This deletes "efgh The " and leaves only "ABCD ".
Current DOM Core defines this as replacing data with offset 5, count 9, and data "", so we look at the "When something replaces data of a CharacterData node" case at <http://html5.org/specs/dom-range.html#range-behavior-under-document-mutation>. The first boundary point of the [[ range has offset 11, and 5 < 11 <= 5 + 9, so we hit the case "For every boundary point whose node is node, and whose offset is greater than offset but less than or equal to offset plus count, set its offset to offset." Thus the boundary point's offset is set to the replacement offset, i.e., 5. This gives us:

<P>ABCD [[<EM>Range]]</EM> ijkl</P>

2) Call deleteData() on the Text node "Range", with offset 0 and count 1. This deletes "R" and leaves "ange". We're replacing data with offset 0, count 1, and data "", and the second boundary point of the ]] range has offset 5, and 5 > 0 + 1, so we hit the case "For every boundary point whose node is node, and whose offset is greater than offset plus count, add the length of data to its offset, then subtract count from it." The length of data is 0 and count is 1, so we set the new offset to 5 + 0 - 1 = 4. This gives us:

<P>ABCD [[<EM>ange]]</EM> ijkl</P>

The example in DOM 2 Range implies something more like

<P>ABCD <EM>[[ange]]</EM> ijkl</P>

I agree this is wrong according to DOM 2 Range itself. DOM 2 Range is a decent spec for its time, but we've moved to much greater levels of precision these days. One thing it often fails to do is clearly distinguish boundary points that "look" the same, in that no nodes or characters lie between them.

> I assume that the range indicated by the underline in the spec and like
> *this* here collapses to just before the <EM> tag as this document has
> the same structure as the other example I pulled out of the spec. This
> would mean that the start point of the other range should also be just
> before the <EM>, but that isn't what has happened in this example.

The example is buggy, yes.
The starting <EM> tag should be highlighted according to both specs and according to browser behavior.

> Any idea what I've got wrong? Some browsers (e.g. Safari) seem to
> behave as in the example, others (e.g. Firefox) put the end point before
> the <EM> (as I would have expected).

Here's a test case:

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1086

Firefox 7.0a2, Chrome 14 dev, and Opera 11.50 all log "ABCD", "5", "ange", "4", which matches my spec. IE10PP2 logs "ABCD", "5", "undefined", "1". The "undefined" winds up being because it puts the new endpoint in the <em> with offset 1, instead of in the Text node "ange" with offset 4. IE might or might not be able to argue that it's correct per DOM 2 Range, but it's not correct according to the new spec.

I have a reasonably comprehensive test suite for range mutation behavior, by the way:

http://aryeh.name/spec/dom-range/test/Range-mutations.html

It only tests what happens with basic DOM operations like replaceData, though; it doesn't check whether things like Range.deleteContents do any additional magic.
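
For anyone who wants to check the offset arithmetic without a browser, the two replace-data rules quoted above can be modelled in a few lines of Python. This is only a rough sketch, not the spec's algorithm in full: it covers just the two quoted cases, treats a boundary point as a bare integer offset into one Text node, and the names adjust_offset and replace_data are made up for illustration.

```python
def replace_data(text: str, offset: int, count: int, data: str) -> str:
    """Model of replacing `count` characters at `offset` with `data`."""
    return text[:offset] + data + text[offset + count:]


def adjust_offset(boundary_offset: int, offset: int, count: int, data: str = "") -> int:
    """Model of the two quoted rules for a boundary point inside the mutated node.

    Hypothetical helper: only the two cases quoted in this message are handled.
    """
    # "offset is greater than offset but less than or equal to offset plus
    # count": the boundary point snaps back to the start of the replaced run.
    if offset < boundary_offset <= offset + count:
        return offset
    # "offset is greater than offset plus count": shift by the change in length.
    if boundary_offset > offset + count:
        return boundary_offset + len(data) - count
    # Boundary points at or before the replaced run are left alone.
    return boundary_offset


# Step 1: deleteData(5, 9) on "ABCD efgh The "; the [[ boundary is at offset 11.
print(repr(replace_data("ABCD efgh The ", 5, 9, "")))  # 'ABCD '
print(adjust_offset(11, offset=5, count=9))            # 5

# Step 2: deleteData(0, 1) on "Range"; the ]] boundary is at offset 5.
print(repr(replace_data("Range", 0, 1, "")))           # 'ange'
print(adjust_offset(5, offset=0, count=1))             # 4
```

The results 5 and 4 are the same offsets the test case above logs in Firefox, Chrome, and Opera.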
Received on Wednesday, 27 July 2011 13:47:43 UTC