Re: Select elements based on text content

On 6 April 2010 08:18, Tab Atkins Jr. <jackalmage@gmail.com> wrote:

> […] I like it, and think we probably have enough to begin writing it up.  It
> won't go into the current Selectors draft, though, as that's in PR which is
> definitely feature-frozen.  But it's a good candidate for the next level of
> Selectors, I think.


> Some questions:
>
> 1. I presume that we don't want things other than actual child elements to
> prevent an element from matching a text selector?  That is, comments in the
> HTML source shouldn't prevent it from matching, right?
>
> 2. Does whitespace matter?  What if it's collapsed?  What if it's not?  I
> suspect we'd want to match against the whitespace-collapsed text always; I
> don't like the idea of matching or not based on the value of the white-space
> property.  We should probably also be forgiving of preceding and trailing
> whitespace.  We can nail this down more precisely when we write it up, but
> it's good to get feedback on what's reasonable for implementations.


That's great to hear, Tab! Thanks for your reply.

As I understand it your first point relates to situations such as this:

    <p>Paragraph text.<!--HTML comment--></p>

I agree that HTML comments should be ignored and that, as far as this
selector is concerned, the paragraph above contains nothing but text.
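
A rough sketch of that behaviour, in Python purely for illustration: the
names TextCollector and text_content are made up for this example and are
not proposed syntax or any implementation's API. The parser's comment hook
is left at its default, which does nothing, so only text is collected.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects character data; comments are dropped by the default handler."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags, never for comment bodies.
        self.parts.append(data)


def text_content(markup):
    """Return the text of a fragment with comments ignored (hypothetical helper)."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.parts)


print(text_content("<p>Paragraph text.<!--HTML comment--></p>"))
# prints: Paragraph text.
```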

I agree, also, with your inclination to perform matches against the
whitespace-collapsed text, although we must first decide whether the
selector will match each element whose text content *exactly* matches the
specified string or each element whose text content *contains* the specified
string.
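
By "whitespace-collapsed" I have in mind something like the following
sketch. It assumes that runs of spaces, tabs, line feeds, carriage returns
and form feeds collapse to a single space and that leading and trailing
whitespace is dropped; the exact rules would need to be settled in the
write-up, and the function name collapse is only illustrative.

```python
import re


def collapse(text):
    """Collapse whitespace runs to one space and trim both ends (assumed rules)."""
    return re.sub(r"[ \t\n\r\f]+", " ", text).strip(" ")


print(collapse("  No \n information\tavailable "))
# prints: No information available
```

The result does not depend on the white-space property, which is the
behaviour Tab suggested.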

Consider a simple example: a table primarily containing yeses and noes.
Perhaps we wish to set the color of the "yes" cells to green, and the color
of the "no" cells to red. What would happen in this case?

    <td>Yes</td>
    <td>Yes</td>
    <td>No</td>
    <td>Yes</td>
    <td>No information available</td>

We'd almost certainly want to style the third and fifth cells differently,
so we'd need one of the following to be true:

   - the selector is concerned with exact matches only
   - the selector accepts regular expressions – ^No$
   - the selector matching "No" is followed in the style sheet by a selector
   matching "No information available"
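
To make the difference concrete, here is a small sketch (Python, for
illustration only) that runs the five cells above through an exact
comparison, the regular expression ^No$, and a plain substring test. The
variable names are made up for this example.

```python
import re

cells = ["Yes", "Yes", "No", "Yes", "No information available"]

# Cell positions are 1-based to match the prose ("third and fifth cells").
exact = [i for i, text in enumerate(cells, 1) if text == "No"]
regex = [i for i, text in enumerate(cells, 1) if re.search(r"^No$", text)]
substring = [i for i, text in enumerate(cells, 1) if "No" in text]

print(exact)      # prints: [3]
print(regex)      # prints: [3]
print(substring)  # prints: [3, 5]
```

Only the substring test picks up the fifth cell, which is what would force
the extra overriding rule in the style sheet.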

The third option is particularly unattractive, as it blurs the line between
content and presentation. If another cell were added to the table with the
text content "No documentation exists", another exception would need to be
added to the style sheet. It is important to bear this in mind, as it
suggests to me that the selector should only be used in cases where the text
content is unlikely to change:

    <th>Firefox</th>
    <th>Google Chrome</th>
    <th>Internet Explorer</th>
    <th>Opera</th>
    <th>Safari</th>

The selector could be used to "replace" (through a combination of
background-image, line-height, and overflow) each of these headings with the
appropriate application icon.

I think that incorporating regular expressions would be a mistake, so my
preferred solution to the "No" / "No information available" problem is to
have the selector perform exact matches. Looking at the previous example,
though, it's easy to see the limitations of such a selector:

    <th>Firefox 3.6</th>
    <th>Google Chrome 5.0</th>
    <th>Internet Explorer 8.0</th>
    <th>Opera 10.5</th>
    <th>Safari 4.0</th>

In this case it would be really useful to be able to match all th elements
containing "Firefox". This suggests to me that two selectors are required:
one for exact matches and one for partial matches. Something along the lines
of…

   - td:equals("No")
   - th:contains("Firefox")

:contains() was used earlier in this thread by Boris Zbarsky, and :equals()
reads well and is indicative of its function.
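
As a sketch of the intended semantics (Python, illustration only; the
function names and the whitespace rules are assumptions, not a proposed
implementation), the two pseudo-classes could behave like this on the
versioned headings:

```python
import re


def collapse(text):
    """Collapse whitespace runs to one space and trim both ends (assumed rules)."""
    return re.sub(r"[ \t\n\r\f]+", " ", text).strip(" ")


def matches(text, pseudo, arg):
    """Test an element's text against a hypothetical :equals() or :contains()."""
    text = collapse(text)
    if pseudo == "equals":
        return text == arg
    if pseudo == "contains":
        return arg in text
    raise ValueError("unknown pseudo-class: " + pseudo)


headings = [
    "Firefox 3.6",
    "Google Chrome 5.0",
    "Internet Explorer 8.0",
    "Opera 10.5",
    "Safari 4.0",
]

firefox_exact = [h for h in headings if matches(h, "equals", "Firefox")]
firefox_partial = [h for h in headings if matches(h, "contains", "Firefox")]

print(firefox_exact)    # prints: []
print(firefox_partial)  # prints: ['Firefox 3.6']
```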

I look forward to hearing others' thoughts.

David

Received on Tuesday, 6 April 2010 02:43:20 UTC