Re: Text selector [was Re: breaking overflow]

On Jan 3, 2010, at 9:02 PM, Boris Zbarsky wrote:

>>> What if "bar" is split over two adjacent textnodes in the DOM?
>> 
>> Forgive my ignorance, but when does that happen?
> 
> Simple example #1:
> 
> <!DOCTYPE html>
> <body>
> <script>
>  document.body.appendChild(document.createTextNode("ba"));
>  document.body.appendChild(document.createTextNode("r"));
> </script>

I see. I would treat that as if it were all one node, since there would seem to be little advantage to not doing so.

> Simple example #2:
> 
> <!DOCTYPE html>
> <body>
>  ba<!--comment, so the textnodes aren't even quite
>  adjacent in the DOM-->r

Hmm. It is a bit murkier, but I think that treating the comment as though it were utter nothingness and as if "bar" was all part of the same node would be the right thing to do.

> Simple example #3:
> <!DOCTYPE html>
> <body>
>  ba<script></script>r
> 
> (again, not even adjacent in the DOM; user-perceived as one word.)

That is interesting. I think this would no longer count as a single node. Any element inserted between the two pieces of text would disqualify them from being treated as a single text node (and I'd just arbitrarily grant an exception for comments because of their nature and use). The pieces are not really adjacent, as you say. Inserting an empty span element or an empty script element would have the same effect: it would prevent the text from matching (an HTML author might even do that on purpose, for no other reason than to prevent that presentational effect).
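
To make my answers to #1 through #3 concrete, here is a rough sketch in JavaScript of how I picture the matching text being collected. It is only my reading of the above, not anything from a spec. The function name textRuns is made up for illustration, it looks only at the direct children of one parent, and it relies on nothing more than the standard nodeType, data, and childNodes properties.

```javascript
// Node type values as defined by the DOM.
const TEXT_NODE = 3;
const COMMENT_NODE = 8;

// Hypothetical helper (an assumption, not a proposed API): returns the
// strings a text selector would see among the direct children of `parent`.
function textRuns(parent) {
  const runs = [];
  let current = null; // null means no run is in progress

  for (const child of Array.from(parent.childNodes)) {
    if (child.nodeType === TEXT_NODE) {
      // Adjacent text nodes are joined into one run (example #1).
      current = (current === null ? "" : current) + child.data;
    } else if (child.nodeType === COMMENT_NODE) {
      // Comments are treated as nothingness (example #2).
      continue;
    } else {
      // Any element, even an empty one, ends the run (example #3).
      if (current !== null) runs.push(current);
      current = null;
    }
  }

  if (current !== null) runs.push(current);
  return runs;
}
```

Leaving whitespace aside, children of "ba", a comment, and "r" would come out as the single run "bar", while "ba", an empty script element, and "r" would come out as two separate runs, "ba" and "r", so "bar" would not match.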

> Simple example #4, equivalent to the above:
> 
> <!DOCTYPE html>
> <body>
>  ba<script>document.write("r")</script>

That would be right out, given my answer to #3. Even without it, this is now far from being "two adjacent textnodes" in my mind.

> Example #5, that might depend on the exact parser algorithm used and might not ever lead to multiple textnodes in an HTML5 parser but I think does in some cases in existing parsers:
> 
> <!DOCTYPE html>
> <body>bar
> 
> with an HTTP packet boundary between the 'a' and the 'r'.

I hadn't heard of that before, but I don't think the packet boundary should count as a split; the text would be treated the same as a single text node. It's good of you to point out this sort of thing though, so that when we write a specification we can be clear about what we mean.

> Example #6, which depends on exact behavior still being hammered out in the HTML5 spec:
> 
> <!DOCTYPE html>
> <body><script>
>  document.write("ba"); document.write("r");
> </script>

I would treat that as a single node, since for all practical purposes that I can think of, it is the same as doing a 'document.write("bar")'.

> Example #7: editable content (designMode/contentEditable) can probably lead to random textnode boundaries as text is inserted, then removed, then edited, wrapped in tags, unwrapped from the tags, etc.  I don't think there's anything that specifies what the resulting DOM should look like on the individual textnode level yet.

For this one, I don't know. It is certainly an interesting question. It seems to me that at least nothing should happen until the content is no longer editable. It would be very strange if there were a "find and replace" type of thing going on constantly as you typed, due to using a 'content' replacement with the '::text()' pseudo-element (but maybe leave that implementation detail undefined). In the end, if a continuous string of text is created without tags inside it, then it should probably be treated as a single text node.

> 
>> Can they be treated as one?
> 
> Probably yes.  The issue is deciding what cases to treat as one, in some ways....

I see that now. Thanks.

Received on Monday, 4 January 2010 06:47:00 UTC