Re: [csswg-drafts] [cssom-view] Change or clarify Range getClientRects() behavior when collapsed in an empty element.

I would like to argue for a change in the spec with the following intent:
Range.getClientRects() should behave generally the same before and after node
normalization.

Node.normalize() maintains live ranges across normalization. (The spec,
https://dom.spec.whatwg.org/#dom-node-normalize, is not very clear on whether
normalization should maintain ranges pointing to the empty text nodes that it
removes, but Chrome and Firefox do maintain such ranges; see
http://jsbin.com/goqegikede/1/edit?html,js,output.) Normalization also
maintains the "child text content". It would be a good thing to extend that
principle to getClientRects().


== Background

Currently, as pointed out by the OP, the spec allows (somewhat implicitly)
returning an empty rect list in many cases. In particular:
- if a collapsed range selects the inside of an empty element;
- if a collapsed range selects between two nodes (elements or text nodes);
- if a range partially selects an element (or elements) but does not
  select a text node (partially or not).

Chromium (deliberately) returns empty lists in these cases.

Chromium has at least 2 issues filed that concern this aspect of the spec:
https://bugs.chromium.org/p/chromium/issues/detail?id=830044 (by shiivan above)
https://bugs.chromium.org/p/chromium/issues/detail?id=637296

As it is, for developers, the API is difficult to use. Any range built
programmatically must be carefully crafted, as many collapsed and non-collapsed
ranges will produce an empty getClientRects(). The clearest workaround that
should work with the current spec is to insert an empty text node where the
range points to, mutate the range to select the empty text node, and get the
client rects list on that. (Inserting a non-empty text node could have a
cascading effect on the layout.)

However, Chromium has what is, in my opinion, a bug
(https://bugs.chromium.org/p/chromium/issues/detail?id=839987) that prevents
using empty text nodes as a way to force a non-empty rect list: Chromium
returns an empty list for empty text nodes. The current spec does not make any
exception for empty text nodes. Firefox follows Chromium at least some of the
time (but not always!).

Therefore, in practice, if a range falls outside of any text nodes and does not
fully select an element, or falls within an empty text node, to get a usable
client rect for it, one must find a suitable nearby element or non-empty text
node to obtain some rect list, and derive from that a suitable substitute rect
for the original range. Needless to say, this is rather non-obvious in general,
especially if the styling is not locally consistent.
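
A rough sketch of that kind of fallback, assuming the caller only needs a
caret-like rect. The helper name `substituteRect` and the choice of the
nearest element's left edge are illustrative assumptions, not something the
spec or any browser prescribes:

```javascript
// Hypothetical helper: returns a plain {x, y, width, height} object, or null.
function substituteRect(range) {
  const rects = range.getClientRects();
  if (rects.length > 0) {
    const r = rects[0];
    return { x: r.x, y: r.y, width: r.width, height: r.height };
  }
  // Nothing usable: borrow geometry from the closest element.
  let node = range.startContainer;
  // nodeType 1 is Node.ELEMENT_NODE; text nodes have no getBoundingClientRect().
  while (node && node.nodeType !== 1) {
    node = node.parentNode;
  }
  if (!node) return null;
  const box = node.getBoundingClientRect();
  // Assumption: a zero-width rect at the element's left edge is good enough.
  return { x: box.x, y: box.y, width: 0, height: box.height };
}
```

Even this ignores padding, text direction, the offset within the element, and
elements that span several lines, which is why a correct substitute is hard to
derive in general.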

To illustrate what I'm on about, here is the length of the rect list returned
by range.getClientRects() in various situations:

(code at http://jsbin.com/niwuhoxire/edit?html,js,output)

```
Spec  Chrome  iOS Saf  Firefox  Range                          Bad
0     0       1        0        <div>()</div>                  iS

1     0       0        0        <div>("")</div>                Cr,iS,FF
1     0       0        0        <div>(")"</div>                Cr,iS,FF
1     0       0        1        <div>"()"</div>                Cr,iS

1     0       0        1        <div>" () "</div>              Cr,iS

1     1       1        1        <div>"a ()  b"</div>
1     0       0        1        <div>"a  () b"</div>           Cr,iS being funny [*]
1     1       1        1        <div>"a   ()b"</div>

0     0       0        0        <div>()"abc"</div>
1     1       1        0        <div>(")abc"</div>             FF
1     1       1        1        <div>"()abc"</div>

0     0       0        0        <div>"abc"()"""def"</div>
1     1       0        0        <div>"abc(")"""def"</div>      iS,FF
1     1       0        0        <div>"abc("")""def"</div>      iS,FF
1     1       0        1        <div>"abc("""")"def"</div>     iS
3     2       2        3        <div>("abc""""def")</div>      Cr,iS

0     0       1        1        <div>(<div>)"abc"</div></div>  iS,FF
1     1       1        0        <div>(<div>")abc"</div></div>  FF
2     2       2        1        <div>(<div>"abc"</div>)</div>  FF

(Spec as in the stated interpretation of the Chromium developers in Issue
637296, notwithstanding the matter of empty text nodes.)

Notations:
- Text nodes are identified as strings "".
- The start of the range is identified by '(': inside a string the
  startContainer is the text node, outside it's the parent element.
- The end of the range is identified by ')'.

[*] Nominated for most creative bug of the year:
https://bugs.chromium.org/p/chromium/issues/detail?id=764841

```
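
For reading the table: the last column appears to list the browsers whose
count differs from the Spec column. A small sketch of that derivation, with a
hypothetical helper name and the browser labels taken from the table:

```javascript
// Returns the labels of the browsers whose count differs from the spec's.
function deviatingBrowsers(specCount, observed) {
  return Object.keys(observed).filter((name) => observed[name] !== specCount);
}

// Row: 3 2 2 3 <div>("abc""""def")</div>
console.log(deviatingBrowsers(3, { Cr: 2, iS: 2, FF: 3 }).join(",")); // prints "Cr,iS"
```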

Neither Chromium nor Firefox nor Safari follows the current spec, and they
don't agree with each other.

Also, please note that many of these ranges, while they have different
boundaries, effectively select the same text. Code that must work with
arbitrary ranges may get lucky and receive one of the ranges that "works", or
it may receive an effectively identical one (stringifying to the same text)
that "doesn't work".

Anecdotally, the people who have had issues with this were building rich text
editors and needed to draw over rendered text or otherwise locate it on screen.
Such code would rather not care about the structure of the text nodes
(especially when using contenteditable), and would find it easiest to be able
to blindly write something like:

```
 const selection = window.getSelection();
 if (selection.rangeCount !== 0) {
   const range = selection.getRangeAt(0);
   drawCaret(range.getClientRects()[0]); // WRONG: list could be empty
 }
```
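
With current behavior, the defensive version has to look roughly like the
sketch below. `firstCaretRect` is a hypothetical name, and the selection is
passed in as a parameter only to keep the sketch testable outside a browser:

```javascript
// Hypothetical helper: first client rect of the selection's first range,
// or null when there is no range or the rect list is empty.
function firstCaretRect(selection) {
  if (!selection || selection.rangeCount === 0) return null;
  const rects = selection.getRangeAt(0).getClientRects();
  return rects.length > 0 ? rects[0] : null;
}

// In a page (drawCaret is the caller's own function, as in the snippet above):
//   const rect = firstCaretRect(window.getSelection());
//   if (rect !== null) drawCaret(rect);
```

The caller still has to decide what to draw when the result is null, which is
exactly the case this proposal tries to make rare.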

By the way, CaretPosition.getClientRect() will similarly return null when the
caret range is not null and range.getClientRects() returns an empty list. The
CaretPosition interface does allow returning null here, but it's just very hard
to use if the return value depends on the exact text node split and degree of
normalization.

```
 const caret = document.caretPositionFromPoint(event.clientX, event.clientY);
 const rect = caret.getClientRect();
 drawCaret(rect); // WRONG: could be null
```
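
A guarded variant might look like the following sketch. `caretRectAt` is a
hypothetical helper, and the document is passed in as `doc` so that the
function does not depend on a global:

```javascript
// Hypothetical helper: returns the caret's rect, or null when none is available.
function caretRectAt(doc, x, y) {
  const caret = doc.caretPositionFromPoint(x, y); // may itself be null
  if (caret === null) return null;
  return caret.getClientRect(); // may also be null, as discussed above
}

// In a page (drawCaret is the caller's own function):
//   const rect = caretRectAt(document, event.clientX, event.clientY);
//   if (rect !== null) drawCaret(rect);
```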

In any case, web browsers have been getting this wrong, so presumably,
clarifying the spec would at a minimum be helpful.


== Suggested approach

Just to drive the discussion forward, I offer some specific wording below.

I also address a couple of additional questions: (i) what to do when there are
no relevant layout boxes, and (ii) whether text nodes that span multiple lines
should return a rect for each line that is selected or partially selected (I
propose that they should). As far as I can tell, Chromium and Firefox already
behave this way.

Proposed updated Spec' (additions in bold):

> The getClientRects() method, when invoked, must return an empty DOMRectList
> object if the range is not in the document and otherwise a DOMRectList object
> containing a list of DOMRect objects in content order that matches the
> following constraints:
> 
> - For each element <b>with a layout box</b> selected by the range, whose
>   parent is not selected by the range, include the border areas returned by
>   invoking getClientRects() on the element.
> 
> - For each Text node <b>(including empty text nodes)</b> selected or
>   partially selected by the range (including when the boundary-points are
>   identical), <b>whose parent element has a layout box,</b> include one
>   DOMRect object <b>for each line box selected within the text node</b> (for
>   the part that is selected, not the whole line box). The bounds of these
>   DOMRect objects are computed using font metrics; thus, for horizontal
>   writing, the vertical dimension of each box is determined by the font
>   ascent and descent, and the horizontal dimension by the text advance width.
>   If the range covers a partial typographic character unit (e.g. half a
>   surrogate pair or part of a grapheme cluster), the full typographic
>   character unit must be included for the purpose of computing the bounds of
>   the relevant DOMRect.  [CSS-TEXT-3] The transforms that apply to the
>   ancestors are applied.
> 
> <b>If the list built above is empty, build and return the list that would be
> obtained by performing the following sequence of operations:
>
> <pre>
>  const empty = document.createTextNode("");
>  range.insertNode(empty);
>  const rects = range.getClientRects();
>  empty.parentNode.removeChild(empty);
>  return rects;
> </pre></b>

Notes:

1. I reuse the term "layout box" used elsewhere in the spec, even though it is
   noted in Issue 1 that it is ill defined.

2. I expect that the formulation in terms of a virtual sequence of operations
   will raise some eyebrows: but it is well defined and easy to test against.

3. This approach rests on the assumption that empty text nodes should indeed
   produce a non-empty rect list (i.e. the current spec deliberately didn't
   make an exception for empty text nodes).
   
4. By using Range.insertNode() we insert the empty text node at the start of
   the range: this is an arbitrary choice.

5. With this definition, the lengths in the table above would read:
   1, 1,1,1, 1, 1,1,1, 1,1,1, 1,1,1,1,3, 1,1,2


== Performance considerations

As noted above, Chromium does not return a rect for selected empty text nodes.
While the spec doesn't allow for that, there may be a performance advantage to
their approach. Returning the rect for an empty text node is not necessarily
cheap: while the width is 0 [*], to compute the height of the rect, you need
the font ascent and descent, which may not be readily available if the text
style is different in this part of the document. Note that Firefox sometimes
returns zero-width and zero-height rects with a non-zero (x,y) in this kind of
case, maybe to avoid getting into font metrics.

[*] Chromium sometimes returns non-zero width rects for collapsed ranges in the
middle of non-empty text nodes, so maybe this is not so obvious.

The suggested wording requires obtaining font metrics in places where there is
actually no text node in the DOM, and maybe this requirement is unacceptable.

As a user of this API, it wouldn't bother me much if the client rect for a
range inside an empty text node (or merely when we pretend to insert an empty
text node) was allowed to have zero width and zero height (but with a non-zero
x and y): it would typically be easy enough to compute a reasonable height to
default to.
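
A minimal sketch of that caller-side default, assuming the browser were
allowed to return a zero-height rect. `withDefaultHeight` is a hypothetical
helper, and where the default comes from (for instance the computed font size
of a nearby element) is left to the caller:

```javascript
// Hypothetical helper: keeps the rect's position and width, and substitutes
// defaultHeight only when the reported height is zero.
function withDefaultHeight(rect, defaultHeight) {
  return {
    x: rect.x,
    y: rect.y,
    width: rect.width,
    height: rect.height !== 0 ? rect.height : defaultHeight,
  };
}
```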


== Alternative

As an alternative, as long as empty text nodes are guaranteed to produce a
non-empty rect list, we could expect the caller to actually (and manually) run
the sequence given above. But it will make the common case slower (I presume
few callers are going to be satisfied with an empty rect list).
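
For illustration, the caller-side version could be sketched as below.
`getClientRectsOrProbe` is a hypothetical name, `doc` is assumed to be the
range's document, and the sketch only helps in browsers where an empty text
node actually yields a rect:

```javascript
// Hypothetical wrapper; `doc` is the range's document (e.g. window.document).
function getClientRectsOrProbe(range, doc) {
  const rects = range.getClientRects();
  if (rects.length > 0) return rects;
  // Same sequence as in the suggested wording, but run by the caller.
  const empty = doc.createTextNode("");
  // Inserts at the start of the range. This mutates the DOM: if the range
  // starts inside a text node, insertNode() splits that node, and the split
  // is not undone below.
  range.insertNode(empty);
  try {
    return range.getClientRects();
  } finally {
    empty.parentNode.removeChild(empty);
  }
}
```

The temporary DOM mutation on every empty result is the cost referred to
above.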

Thanks

-- 
GitHub Notification of comment by ericrannaud
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/2514#issuecomment-387264578 using your GitHub account

Received on Tuesday, 8 May 2018 02:26:06 UTC