Re: Text selector [was Re: breaking overflow] from James Hopkins on 2010-01-04 (www-style@w3.org from January 2010)

From: James Hopkins <james@idreamincode.co.uk>
Date: Mon, 4 Jan 2010 18:28:44 +0000
To: Tab Atkins Jr. <jackalmage@gmail.com>
Cc: Boris Zbarsky <bzbarsky@mit.edu>, Brad Kemper <brad.kemper@gmail.com>, www-style list <www-style@w3.org>
Message-Id: <F819EDAA-BE34-459C-BA38-E7F970849CC1@idreamincode.co.uk>
> On Mon, Jan 4, 2010 at 6:15 AM, James Hopkins <james@idreamincode.co.uk> wrote:
>> In conclusion, it makes sense to me that all adjacent textnodes are treated
>> as separate entities; that is, ::text() is excluded from crossing adjacent
>> textnode boundaries in order to match a single word. In contrast, the
>> selector _should_ be able to cross one side of an element's boundary
>> (start-tag or end-tag), so as to match an element's sibling textnode with
>> any descendant textnodes (with no whitespace in between) of that element
>> which appear in the document source (which successfully excludes example
>> #4).
>
> I really don't see how these two statements are consistent.  How is it
> okay to treat adjacent text nodes as separate, but a text node
> followed by an element containing a text node as together?

I recognize I've been a bit confusing through this entire discussion,  
and after re-reading my last email, I'm not entirely sure why I made  
such a fuss about element boundary matching. One use case I envisaged
was something like ::text("foobar") matching "foo<span>bar</span>", but
it's an unlikely case at best, and I don't think it would be worth
supporting if it's going to cause the complications you describe.

> I also don't like basing anything on document source.  It's much
> better to pay attention to the DOM.  I don't think CSS even has access
> to the document source - it's a separate layer in applications, given
> to the parser which then produces a DOM for other things to access.

I recognize this was an error on my part, and concur that accessing  
the document source is not a good solution (if even feasible).

> As well, the example #5 makes *no* sense to treat as separate.  It's
> together in the source, it's together in the display, it's only the
> DOM that might have them separate, due to the vagaries of packets.
> Making exceptions for these situations is no good.

As I said in my last email, I am unclear as to what an HTTP packet
boundary is. If you could point me to a link with more information,
I'd appreciate it for future reference.

> Overall, we need to be *consistent*.  Having rules based sometimes on
> source, sometimes on the DOM, sometimes treating adjacent nodes as
> separate, and sometimes treating them as together, will just be
> ridiculously confusing for authors, not to mention likely bug-ridden
> in implementations.  The simplest rule that covers reasonable cases is
> that it matches on the DOM, and adjacent text nodes are treated as
> being together for the purpose of matching.

To clarify, you're suggesting that ::text("foobar") would match the  
string "foo bar"?
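
For reference, one possible reading of the quoted rule (match on the DOM,
treat adjacent text nodes as joined, never cross an element boundary) can
be sketched as below. This is only an illustration under stated
assumptions: the Element class, text_runs() and text_matches() are
hypothetical names, not part of any proposal, and the match is a literal
substring test, which is itself an assumption about whitespace handling.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element.

    Children are either text nodes (plain str) or nested Elements.
    """
    tag: str
    children: List[Union[str, "Element"]] = field(default_factory=list)


def text_runs(element: Element) -> List[str]:
    """Join adjacent text-node children; a child element ends the current run."""
    runs: List[str] = []
    current: List[str] = []
    for child in element.children:
        if isinstance(child, str):
            current.append(child)
        else:
            # Element boundary: close the run, do not look inside the child.
            if current:
                runs.append("".join(current))
            current = []
    if current:
        runs.append("".join(current))
    return runs


def text_matches(element: Element, needle: str) -> bool:
    """True if `needle` occurs literally inside a single joined run."""
    return any(needle in run for run in text_runs(element))


# Two adjacent text nodes "foo" + "bar" are joined before matching.
split_nodes = Element("p", ["foo", "bar"])
# A single text node that contains a space.
spaced = Element("p", ["foo bar"])
# The "foo<span>bar</span>" case: the span interrupts the run.
crossing = Element("p", ["foo", Element("span", ["bar"])])

print(text_matches(split_nodes, "foobar"))
print(text_matches(spaced, "foobar"))
print(text_matches(crossing, "foobar"))
```

Under these assumptions only the first case matches. Whether whitespace
should be ignored or collapsed is exactly the open question above; the
sketch does not settle it.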

> I still don't support
> matching across element boundaries, as it will work in a non-intuitive
> manner.  For example, the following code:
>
> <p>foo <i>bar</i> baz</p>
> p::text("foo b") { display: block; }
>
> Would result in the following display:
>
> foo
> b
> ar baz

The pseudo element would be restricted to applying a limited range of  
properties, similar to ::first-line or ::first-letter.
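
To make the quoted example concrete, here is a rough sketch of how a match
that crosses an element boundary would break into one fragment per text
node. The flattened (parent_tag, text) representation and the
split_match() helper are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# (tag of the parent element, text content), listed in document order.
TextNode = Tuple[str, str]


def split_match(nodes: List[TextNode], needle: str) -> List[TextNode]:
    """Return the fragment of each text node covered by the first match.

    The needle is searched in the concatenation of all text nodes; the
    result lists, per node, the slice of that node inside the match.
    """
    full = "".join(text for _, text in nodes)
    start = full.find(needle)
    if start < 0:
        return []
    end = start + len(needle)

    pieces: List[TextNode] = []
    offset = 0  # position of the current node within `full`
    for parent, text in nodes:
        lo = max(start, offset)
        hi = min(end, offset + len(text))
        if lo < hi:
            pieces.append((parent, text[lo - offset:hi - offset]))
        offset += len(text)
    return pieces


# <p>foo <i>bar</i> baz</p> flattened into its three text nodes.
paragraph = [("p", "foo "), ("i", "bar"), ("p", " baz")]
fragments = split_match(paragraph, "foo b")
print(fragments)
```

Here the match splits into "foo " inside the <p> and "b" inside the <i>.
If each fragment were given display: block, that would account for the
three-line rendering quoted above.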

> Rather than deal with this sort of confusion, I think it's better to
> just say that you can't cross element boundaries at all.  This will
> still result in confusion sometimes (particularly when the text has an
> otherwise-invisible <span> in it or the like), but it will be lesser,
> and easier to understand.  (I think explaining the way pseudoelements
> get broken up into multiple pieces and how that affects some
> properties is much more difficult than explaining "all the text has to
> be together in the element to match".)
>
> This means that for Boris's examples, 1, 5, and 6 would match, 3 and 4
> would not.  I'm not sure what I feel is best for 2 - I could go either
> way, whichever seems more consistent.  7 is difficult, but it's going
> to be difficult no matter what.  I guess I'm fine with just
> recommending not applying ::text to a contentEditable piece, or
> accepting that the results will be non-intuitive.  The main thing I'm
> afraid of is text being wrapped in tags, then all the text being
> deleted, but the tags still sticking around with no textual content.
> That'd break a ::text selector without the user knowing what's wrong.
> On the other hand, deleting that segment of text and retyping it would
> fix it.  Shrug.
>
>> As an aside, I believe it would also be beneficial to have the ability to
>> include multiple strings in the same selector, something like
>> ::text("foo bar", "test text").
>
> Yeah, I can see the use for that.  I'd consider it necessary, in fact,
> now that you mention it.

I think I'm going to bow out of this discussion for a bit - at least  
until I get some clearer thoughts in my head, before putting 'pen to  
paper'.
Received on Monday, 4 January 2010 18:29:17 UTC