- From: James Hopkins <james@idreamincode.co.uk>
- Date: Mon, 4 Jan 2010 12:15:36 +0000
- To: Boris Zbarsky <bzbarsky@MIT.EDU>
- Cc: Brad Kemper <brad.kemper@gmail.com>, www-style list <www-style@w3.org>
# I personally can't envisage a use case where crossing
# textnodes (or element boundaries, for that matter) in
# order to match a single word, would be beneficial.
Looks as though I was being far too narrow-minded when making such a
sweeping statement :)
Having said that, it's easy to see how earlier discussions on such a
text selector (if indeed there were any) would have been blighted by
treating textnode/element boundaries differently in varying scenarios,
and I'm slightly unsure as to why we're considering doing the same
here. I'm of the opinion that we should consider adopting a 'one size
fits all' logic that treats all textnodes in the same way, independent
of their context - whether this means having the ability to cross the
boundaries of textnodes or not.
My thoughts on the following scenarios:
> Simple example #1:
>
> <!DOCTYPE html>
> <body>
> <script>
> document.body.appendChild(document.createTextNode("ba"));
> document.body.appendChild(document.createTextNode("r"));
> </script>
The author's intention was to create two separate textnodes, so they
should be treated separately.
> Simple example #2:
>
> <!DOCTYPE html>
> <body>
> ba<!--comment, so the textnodes aren't even quite
> adjacent in the DOM-->r
An HTML comment should be treated similarly to an element node, in
that it acts as a delimiter which splits the surrounding text into two
textnodes.
> Simple example #3:
> <!DOCTYPE html>
> <body>
> ba<script></script>r
>
> (again, not even adjacent in the DOM; user-perceived as one word.)
See above.
> Simple example #4, equivalent to the above:
>
> <!DOCTYPE html>
> <body>
> ba<script>document.write("r")</script>
See the conclusion near the end of this email for details.
> Example #5, that might depend on the exact parser algorithm used and
> might not ever lead to multiple textnodes in an HTML5 parser but I
> think does in some cases in existing parsers:
>
> <!DOCTYPE html>
> <body>bar
>
> with an HTTP packet boundary between the 'a' and the 'r'.
I'm unsure as to what this is :)
> Example #6, which depends on exact behavior still being hammered out
> in the HTML5 spec:
>
> <!DOCTYPE html>
> <body><script>
> document.write("ba"); document.write("r");
> </script>
They're written as two separate entities, so should be treated as such.
> Example #7: editable content (designMode/contentEditable) can
> probably lead to random textnode boundaries as text is inserted,
> then removed, then edited, wrapped in tags, unwrapped from the tags,
> etc. I don't think there's anything that specifies what the
> resulting DOM should look like on the individual textnode level yet.
In conclusion, it makes sense to me that all adjacent textnodes are
treated as separate entities; that is, ::text() should not cross
adjacent textnode boundaries in order to match a single word.
In contrast, the selector _should_ be able to cross one side of an
element's boundary (start-tag or end-tag), so as to match an element's
sibling textnode together with any descendant textnodes of that element
(with no whitespace in between) which appear in the document source
(which successfully excludes example #4).
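To make the rule above a little more concrete, here is a rough sketch
of how I picture it. It is only a model, not a proposal for how an
implementation would do it: the flat token list and the names
candidateStrings and matchesWord are made up for illustration, and
nothing here is a real DOM or Selectors API.

```javascript
// Hypothetical model: the document is a flat list of tokens in source order.
// Token types (all assumptions for this sketch):
//   { type: "text", data: "..." }  a textnode
//   { type: "open" }               an element's start-tag
//   { type: "close" }              an element's end-tag
//   { type: "comment" }            an HTML comment

// Collect every string the selector would be allowed to look at:
// each textnode on its own, plus each pair of textnodes separated by
// exactly one element boundary (a start-tag or an end-tag).
function candidateStrings(tokens) {
  const out = [];
  for (let i = 0; i < tokens.length; i++) {
    if (tokens[i].type !== "text") continue;
    out.push(tokens[i].data);
    const boundary = tokens[i + 1];
    const next = tokens[i + 2];
    const isTag =
      boundary && (boundary.type === "open" || boundary.type === "close");
    if (isTag && next && next.type === "text") {
      out.push(tokens[i].data + next.data);
    }
  }
  return out;
}

// A word matches if it appears as a whitespace-delimited word in any
// candidate string, so whitespace on either side of the boundary
// prevents a cross-boundary match.
function matchesWord(tokens, word) {
  return candidateStrings(tokens).some((s) => s.split(/\s+/).includes(word));
}

// Example #1: two adjacent textnodes are never joined.
const ex1 = [{ type: "text", data: "ba" }, { type: "text", data: "r" }];
// Example #2: a comment acts as a delimiter.
const ex2 = [{ type: "text", data: "ba" }, { type: "comment" }, { type: "text", data: "r" }];
// Example #3: an empty element means two boundaries, so no match.
const ex3 = [{ type: "text", data: "ba" }, { type: "open" }, { type: "close" }, { type: "text", data: "r" }];
// Something like ba<b>r</b>: only one boundary is crossed.
const oneSide = [{ type: "text", data: "ba" }, { type: "open" }, { type: "text", data: "r" }, { type: "close" }];

console.log(matchesWord(ex1, "bar"), matchesWord(ex2, "bar"),
            matchesWord(ex3, "bar"), matchesWord(oneSide, "bar"));
```

The sketch doesn't attempt to model text that arrives via
document.write(), so example #4 is left out; it only covers text that
is present in the source.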
As an aside, I believe it would also be beneficial to have the ability
to include multiple strings in the same selector, something
like ::text("foo bar", "test text").
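For what it's worth, this is roughly the behaviour I have in mind for
the multiple-string form, assuming the strings are alternatives (in the
same way a comma works in a selector list) and that each one is matched
within a single textnode. The function name findMatches and the shape
of the result are made up for illustration only.

```javascript
// Hypothetical helper: find every occurrence of any of the given
// strings inside one textnode's data. Returns the matches as
// { string, start, end } ranges, ordered by start offset.
function findMatches(data, strings) {
  const ranges = [];
  for (const s of strings) {
    let from = 0;
    // Skip empty strings, which would otherwise match everywhere.
    while (s.length > 0) {
      const at = data.indexOf(s, from);
      if (at === -1) break;
      ranges.push({ string: s, start: at, end: at + s.length });
      from = at + s.length;
    }
  }
  return ranges.sort((a, b) => a.start - b.start);
}

// Models ::text("foo bar", "test text") applied to one textnode.
const found = findMatches("some test text and foo bar", ["foo bar", "test text"]);
console.log(found);
```

Whether overlapping matches from different strings should both apply
is a separate question that this sketch doesn't try to answer.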
Received on Monday, 4 January 2010 12:16:08 UTC