- From: Tab Atkins Jr. <jackalmage@gmail.com>
- Date: Thu, 7 Jan 2010 08:00:23 -0600
- To: Brad Kemper <brad.kemper@gmail.com>
- Cc: "robert@ocallahan.org O'Callahan" <robert@ocallahan.org>, Boris Zbarsky <bzbarsky@mit.edu>, www-style list <www-style@w3.org>
On Thu, Jan 7, 2010 at 1:06 AM, Brad Kemper <brad.kemper@gmail.com> wrote:
> On Jan 6, 2010, at 9:49 PM, Tab Atkins Jr. wrote:
>> Right, so one of the major problems is the misnested boxes that can
>> occur. We want to allow nested ::text matches, but misnested matches
>> are a problem. Dealing with them naively results in the unintuitive
>> and undesirable behavior I pointed out before. How does my suggestion
>> for most-powerful-matched-first sound for fixing this, and making
>> previously matched ::text pseudos count as element boundaries just
>> like a real element when matching later/less powerful ::text pseudos?
>
> I have to read it again in the morning with fresher eyes. My initial reaction to the first part was good, but then I started getting lost, mostly due to my attention span at this time of day. But, how about this for a simple way of saying what I think we both intuit to be right in the example:
>
> Follow normal cascading rules for each matched character, as though you were creating individual pseudo-boxes for each character, but adjacent character boxes that have the same pseudo-boxes because of the same rule get merged together into one box after all the text of the element has been otherwise resolved.
>
> I'll re-read your details again in the morning to see if this made more sense of it or less, or if I am still missing something, but I wanted to throw it out there this way while I was still awake.

I can't tell if that's the same or not, but it confuses me anyway.

Stated hopefully more clearly, my attempted resolution is this: apply ::text rules in cascading order (strongest first). Previously-applied ::text rules act like normal elements to later ::text rules, preventing matching across their boundaries.

So in the old example of

<p>ABCDEF</p>
::text("ABCD") { color: red; }
::text("CDEF") { color: blue; }

The second rule would apply first, as it's stronger according to the cascade.
This would produce the pseudostructure <p>AB<text>CDEF</text></p>. Then the first rule would try to apply, but since that would require matching across an element boundary (the text pseudoelement), it fails.

As well, matching is defined to happen greedily and in the logical direction of text. This means that in the following:

<p>ABABABAB</p>
::text("ABAB") { border: 1px solid black; }

You wind up with exactly two things wrapped in a border, producing the pseudostructure <p><text>ABAB</text><text>ABAB</text></p>. The middle ABAB (characters 3-6) in the text attempted to match, but was blocked because the first ABAB (characters 1-4) had already matched, creating an element boundary. But then the last ABAB (characters 5-8) was free to match normally.

This allows nested matches. For example:

<p>ABCDEF</p>
::text("ABCD") { color: red; }
::text("CD") { font-weight: bold; }

Would result in ABCD being red, and CD being bold, with this pseudostructure: <p><text>AB<text>CD</text></text>EF</p>.

Note, though, that this last one might require a slight change in what we were saying about "element boundaries". Right now, the following:

<p>AB<i>CD</i>EF</p>
::text("ABCD") { font-weight: bold; }

will fail to match. But that's precisely the situation that the previous example (about nested ::text()s) creates, as the CD matches first, followed by the ABCD. We either have to allow things to match across element boundaries as long as they don't *misnest*, or have to accept that order is important in some silly ways. Since I think misnesting was the whole problem with crossing element boundaries in the first place, I assume this is okay? Or does it run into the same performance ratholes that ::contains did?

Now, this rule still produces some possible implementation problems. For example, take this:

<p>ABCDEF</p>
::text("ABC") { color: red; }
::text("CDF") { color: blue; }

As the CSS engine eats the text, it notes an A and figures the "ABC" might be matching. It then sees a B and C.
This would finish the "ABC" match, but it would also start the "CDF" match, which is more powerful and should win. The parser has to wait until it consumes a D (still matches) and then an E (no match) to realize that "CDF" won't match here and "ABC" should be able to apply, at which point it can run back and apply the pseudoelement and then emit the D and E.

I'm assuming a particular kind of processing here, which may not apply in actual browser engines. Let me know if I'm way off-base. If I'm on the right track, though, is this a relatively large problem? Is there another way to arrange matching to make it easier while retaining the ability to nest but not misnest? Perhaps actual first-come-first-serve matching, rather than matching the most powerful first?

If this isn't really a significant issue, or at least not one that will be significantly changed with a different matching algorithm, then we don't need to change anything.

~TJ
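
The matching order described in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any browser engine works: all rules are assumed to have equal specificity (so a later rule is stronger), a candidate match is accepted whenever it nests cleanly with earlier matches and rejected only when it would misnest (the first of the two options raised above), and real element boundaries such as <i> are not modeled. The names misnests, apply_text_rules and render are hypothetical.

```python
def misnests(a, b):
    """True if half-open spans a and b overlap without one containing the other."""
    (s1, e1), (s2, e2) = a, b
    overlap = s1 < e2 and s2 < e1
    nested = (s1 <= s2 and e2 <= e1) or (s2 <= s1 and e1 <= e2)
    return overlap and not nested


def apply_text_rules(text, patterns):
    """Return accepted (start, end) spans for hypothetical ::text("...") rules.

    `patterns` is in source order; ASSUMPTION: equal specificity, so the
    last pattern is the strongest and is matched first.
    """
    spans = []
    for pat in reversed(patterns):          # strongest rule first
        if not pat:
            continue                        # ignore empty patterns
        # Scan in text order; earlier accepted spans block misnested ones.
        for i in range(len(text) - len(pat) + 1):
            cand = (i, i + len(pat))
            if text.startswith(pat, i) and not any(misnests(cand, s) for s in spans):
                spans.append(cand)
    return spans


def render(text, spans):
    """Serialize the spans as the <p>...<text>...</text>...</p> pseudostructure."""
    out = ["<p>"]
    for i in range(len(text) + 1):
        closes = sum(1 for _, e in spans if e == i)
        opens = sum(1 for s, _ in spans if s == i)
        out.append("</text>" * closes + "<text>" * opens)
        if i < len(text):
            out.append(text[i])
    out.append("</p>")
    return "".join(out)


print(render("ABCDEF", apply_text_rules("ABCDEF", ["ABCD", "CDEF"])))
# <p>AB<text>CDEF</text></p>
print(render("ABABABAB", apply_text_rules("ABABABAB", ["ABAB"])))
# <p><text>ABAB</text><text>ABAB</text></p>
print(render("ABCDEF", apply_text_rules("ABCDEF", ["ABCD", "CD"])))
# <p><text>AB<text>CD</text></text>EF</p>
```

Because each rule gets its own full pass over the text, this sketch sidesteps the look-ahead question raised for "ABC" versus "CDF"; an engine that consumes the text in a single pass would still have to buffer until the stronger candidate either completes or fails.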
Received on Thursday, 7 January 2010 14:00:55 UTC