- From: James Hopkins <james@idreamincode.co.uk>
- Date: Mon, 4 Jan 2010 12:15:36 +0000
- To: Boris Zbarsky <bzbarsky@MIT.EDU>
- Cc: Brad Kemper <brad.kemper@gmail.com>, www-style list <www-style@w3.org>
# I personally can't envisage a use case where crossing
# textnodes (or element boundaries, for that matter) in
# order to match a single word, would be beneficial.

Looks as though I was being far too narrow-minded when making such a sweeping statement :)

Having said that, it's easy to see how earlier discussions on such a text selector (if indeed there were any) would have been blighted by treating textnode/element boundaries differently in varying scenarios, and I'm slightly unsure as to why we're considering doing the same here. I'm of the opinion that we should adopt a 'one size fits all' logic that treats all textnodes in the same way, independent of their context, whether or not that means being able to cross textnode boundaries.

My thoughts on the following scenarios:

> Simple example #1:
>
> <!DOCTYPE html>
> <body>
> <script>
> document.body.appendChild(document.createTextNode("ba"));
> document.body.appendChild(document.createTextNode("r"));
> </script>

The author's intention was to create two separate textnodes, thus they should be treated separately.

> Simple example #2:
>
> <!DOCTYPE html>
> <body>
> ba<!--comment, so the textnodes aren't even quite
> adjacent in the DOM-->r

An HTML comment should be treated similarly to an element node, in that it acts as a delimiter which splits the surrounding text into two textnodes.

> Simple example #3:
> <!DOCTYPE html>
> <body>
> ba<script></script>r
>
> (again, not even adjacent in the DOM; user-perceived as one word.

See above.

> Simple example #4, equivalent to the above:
>
> <!DOCTYPE html>
> <body>
> ba<script>document.write("r")</script>

See the conclusion near the end of this email for details.

> Example #5, that might depend on the exact parser algorithm used and
> might not ever lead to multiple textnodes in an HTML5 parser but I
> think does in some cases in existing parsers:
>
> <!DOCTYPE html>
> <body>bar
>
> with an HTTP packet boundary between the 'a' and the 'r'.

I'm unsure as to what this is :)

> Example #6, which depends on exact behavior still being hammered out
> in the HTML5 spec:
>
> <!DOCTYPE html>
> <body><script>
> document.write("ba"); document.write("r");
> </script>

They're written as two separate entities, so should be treated as such.

> Example #7: editable content (designMode/contentEditable) can
> probably lead to random textnode boundaries as text is inserted,
> then removed, then edited, wrapped in tags, unwrapped from the tags,
> etc. I don't think there's anything that specifies what the
> resulting DOM should look like on the individual textnode level yet.

In conclusion, it makes sense to me that all adjacent textnodes are treated as separate entities; that is, ::text() should not cross adjacent textnode boundaries in order to match a single word. In contrast, the selector _should_ be able to cross one side of an element's boundary (start-tag or end-tag), so as to match an element's sibling textnode together with any descendant textnodes of that element (with no whitespace in between) which appear in the document source (which successfully excludes example #4).

As an aside, I believe it would also be beneficial to be able to include multiple strings in the same selector, something like ::text("foo bar", "test text").
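The rule in the conclusion can be sketched in plain JavaScript. This is only one reading of it, not a specification: the tree is modelled with plain objects rather than real DOM nodes, the helper names (text, element, comment, flatten, candidates, matchesWord) are all hypothetical, and "crossing one side of an element's boundary" is assumed to mean that a match may join at most two textnodes separated by exactly one start-tag or end-tag.

```javascript
// Hypothetical model of a document fragment: plain objects, not real DOM nodes.
const text = (value) => ({ kind: "text", value });
const element = (...children) => ({ kind: "element", children });
const comment = () => ({ kind: "comment" });

// Flatten a child list into a stream of text tokens and separator tokens.
function flatten(children, out = []) {
  let previousKind = null;
  for (const node of children) {
    if (node.kind === "text") {
      // Two sibling textnodes in a row: a boundary that is never crossed.
      if (previousKind === "text") out.push({ sep: "textnode" });
      out.push({ text: node.value });
    } else if (node.kind === "comment") {
      // Assumption: a comment is a delimiter that is never crossed.
      out.push({ sep: "comment" });
    } else {
      out.push({ sep: "tag" }); // start-tag
      flatten(node.children, out);
      out.push({ sep: "tag" }); // end-tag
    }
    previousKind = node.kind;
  }
  return out;
}

// Every string the selector may look at: each textnode on its own, plus
// any two textnodes separated by exactly one tag boundary.
function candidates(children) {
  const tokens = flatten(children);
  const result = [];
  tokens.forEach((token, i) => {
    if (token.text === undefined) return;
    result.push(token.text);
    const sep = tokens[i + 1];
    const next = tokens[i + 2];
    if (sep && sep.sep === "tag" && next && next.text !== undefined) {
      result.push(token.text + next.text);
    }
  });
  return result;
}

// True if `word` appears as a whitespace-delimited word in any candidate.
function matchesWord(children, word) {
  return candidates(children).some((s) => s.split(/\s+/).includes(word));
}

// Example #1: two adjacent textnodes are kept separate.
console.log(matchesWord([text("ba"), text("r")], "bar")); // false
// Example #3: an empty element means two boundaries, so no match.
console.log(matchesWord([text("ba"), element(), text("r")], "bar")); // false
// ba<b>r</b>: only the start-tag is crossed, so this matches.
console.log(matchesWord([text("ba"), element(text("r"))], "bar")); // true
```

Under this reading, examples #1 to #3 do not match "bar", while markup such as ba<b>r</b> does. The "appears in the document source" condition that excludes example #4 is not modelled here.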
Received on Monday, 4 January 2010 12:16:08 UTC