Re: Text selector [was Re: breaking overflow] from Tab Atkins Jr. on 2010-01-07 (www-style@w3.org from January 2010)

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Wed, 6 Jan 2010 18:01:14 -0600
To: Brad Kemper <brad.kemper@gmail.com>
Cc: "robert@ocallahan.org O'Callahan" <robert@ocallahan.org>, Boris Zbarsky <bzbarsky@mit.edu>, www-style list <www-style@w3.org>
Message-ID: <dd0fbad1001061601s71f52eb7jfbd97091ef634e68@mail.gmail.com>
On Wed, Jan 6, 2010 at 12:42 PM, Brad Kemper <brad.kemper@gmail.com> wrote:
> On Jan 6, 2010, at 8:39 AM, Tab Atkins Jr. wrote:
>> On Tue, Jan 5, 2010 at 5:40 PM, Brad Kemper <brad.kemper@gmail.com> wrote:
>>>> or
>>>> p::text('ABCD') { color:red; }
>>>> p::text('CDEF') { color:blue; }
>>>> <p>ABCDEF</p>
>>>
>>> Tab and I resolved this to our own general satisfaction, I thought, in the
>>> earlier part of the thread. "AB" would be red and "CDEF" would be blue.
>>
>> No, after further thought this isn't resolved properly.  What's the
>> behavior if both ::text() selectors applied display:block; to the
>> content?  Colors override each other nicely; display changes do not.
>> You'd end up with precisely the problem I was decrying earlier about
>> allowing it to match across element boundaries - it would display as:
>>
>> AB
>> CD
>> EF
>>
>> This would be very confusing to authors.
>
> I don't think it is any more confusing than most other CSS specs, where a clear understanding of the spec helps one understand what happens with edge cases.

Very few people have a clear understanding of the spec, and the
details of how implied elements nest and break each other are
non-trivial to understand.

> Or, if technically feasible, we could just further refine our definition, so that properties that are identical between two adjacent ::text() pseudo-elements (the first two of the three you created, for example) are treated as though they belonged to the same element for the purposes of that property. Then, in your example you'd get the (perhaps) more intuitive version of this:
>
> AB
> CDEF

While I don't think your exact suggestion is feasible, there's a good
gem of an idea in there.  We already say that ::text can't match
across an element boundary.  The problems we're running into now are
because ::text nodes are matching across *other* ::text boundaries
(possibly from the same rule, with the ABABABAB and ::text("ABAB")
example).  If we can come up with a simple, intuitive way to fix this
we should be okay, and what you're suggesting (/what we agreed on
earlier) points us to just such a solution.

Basically, no text can be part of multiple ::text pseudoelements.
Whenever multiple rules would match something, the later/more powerful
::text gets what it wants, reducing the amount that other rules are
allowed to match.  They still *do* match, they just don't wrap as much
(possibly none).

So, in the example we're using:

<p>ABCDEF</p>
::text("ABCD") { display: block; }
::text("CDEF") { display: block; }

You'd first make the ABCD match, producing this pseudostructure:

<p><::text match=ABCD>ABCD</::text>EF</p>

And then match the CDEF one, which wins because it comes later in the
document and thus beats it according to normal cascading rules,
producing this pseudostructure:

<p><::text match=ABCD>AB</::text><::text match=CDEF>CDEF</::text></p>

Which would then display as:

AB
CDEF

I think this resolves the issues we've been having, and is furthermore
pretty simple to understand (implementation-wise I dunno).  Only one
::text matches a character at a time, and the relevant selector is
determined by normal cascading preference.
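To make the "only one ::text per character, later rule wins" idea concrete, here is a rough Python sketch. It is only a model of the behavior described above, not anything from a spec or an implementation: the names find_all and steal_match are made up for illustration, the selectors are treated as literal strings rather than patterns, and I'm assuming that repeated matches of a single rule are taken left to right without overlapping.

```python
def find_all(text, needle):
    """Return (start, end) spans of non-overlapping occurrences, left to right."""
    if not needle:
        return []
    spans, start = [], text.find(needle)
    while start != -1:
        spans.append((start, start + len(needle)))
        start = text.find(needle, start + len(needle))
    return spans


def steal_match(text, rules):
    """Split text into (rule, substring) runs; rule is None for unmatched text.

    `rules` is in stylesheet order, so a later rule overwrites an earlier one.
    """
    owner = [None] * len(text)
    for needle in rules:
        for start, end in find_all(text, needle):
            for i in range(start, end):
                owner[i] = (needle, start)  # remember which occurrence owns i

    runs = []
    prev = None
    for ch, own in zip(text, owner):
        if runs and own == prev:
            runs[-1][1] += ch  # same occurrence as the previous character
        else:
            runs.append([own[0] if own else None, ch])
        prev = own
    return [tuple(run) for run in runs]


print(steal_match("ABCDEF", ["ABCD", "CDEF"]))
# [('ABCD', 'AB'), ('CDEF', 'CDEF')]
```

The two runs correspond to the two pseudo-elements in the pseudostructure above: the ABCD rule still matches, but it only gets to wrap "AB".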

This does have some unintended downsides - for one, it's impossible to
have nested ::text.  Frex, this wouldn't work as you'd expect:

<p>ABCDEF</p>
::text("CD") { font-weight: bold; }
::text("ABCDEF") { font-style: italic; }

The first rule would match, and then the second would fully override
it, making all the text italic and none of it bold.  If you reversed
the order of the rules, I think you'd end up with AB and EF italic,
and CD just bold, as the CD rule steals the matches in the middle and
splits the original ABCDEF match into two.  It's possible we could
special-case pure nesting, but then we might run into problems.  What
if we're trying to match CD, CDEF, and ABC?  The order in which they
exist would change things around.
1) If ABC came last, then it would match ABC, while the CD rule
matched just D and the CDEF rule matched just DEF.  You'd have
<p><::text match=ABC>ABC</::text><::text match=CDEF><::text
match=CD>D</::text>EF</::text></p>.
2) If ABC came first, then it would match AB, while CD would match CD
and CDEF would match CDEF.
3) If the order was CDEF,ABC,CD, you'd first match CDEF with CDEF,
then ABC would steal the C, so that when CD tried to match it would
find it wasn't nested in anything and would steal both of its letters.
 You'd end up with <p><::text match=ABC>AB</::text><::text
match=CD>CD</::text><::text match=CDEF>EF</::text></p>.

This is probably too confusing.
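The order-dependence is easy to see with a small Python sketch of the plain "later rule steals characters" model. This is a simplification under stated assumptions: char_owners is a made-up helper, the selectors are literal strings, and it does *not* special-case pure nesting, so it only covers the CD/ABCDEF example and case 3 above, where nesting doesn't change the outcome.

```python
def char_owners(text, rules):
    """For each character, return the last rule (in stylesheet order) covering it."""
    owner = [None] * len(text)
    for needle in rules:
        if not needle:
            continue
        start = text.find(needle)
        while start != -1:
            for i in range(start, start + len(needle)):
                owner[i] = needle  # a later rule steals the character
            start = text.find(needle, start + len(needle))
    return owner


# CD first, ABCDEF last: the second rule overrides the first completely.
print(char_owners("ABCDEF", ["CD", "ABCDEF"]))
# ['ABCDEF', 'ABCDEF', 'ABCDEF', 'ABCDEF', 'ABCDEF', 'ABCDEF']

# Reversed: CD steals the middle and splits the ABCDEF match in two.
print(char_owners("ABCDEF", ["ABCDEF", "CD"]))
# ['ABCDEF', 'ABCDEF', 'CD', 'CD', 'ABCDEF', 'ABCDEF']

# Case 3 (CDEF, ABC, CD): ends up as AB | CD | EF.
print(char_owners("ABCDEF", ["CDEF", "ABC", "CD"]))
# ['ABC', 'ABC', 'CD', 'CD', 'CDEF', 'CDEF']
```

Same three selectors, same text, and the result depends entirely on the order the rules appear in.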

Instead we could go with first-come-first-serve (or rather,
most-powerful-first-serve, to allow proper cascading control), and
then just treat more powerful ::text pseudoelements the same as normal
elements, preventing matching across them.  This would still allow
nesting, but it would prevent partial intersections.  If you tried to
match both ABCD and CDEF, whichever one was stronger in the cascade
would match, and would subsequently prevent the other one from
matching at all.  This is likely the best - it has fairly simple
behaviors and allows the primary nesting need.
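Here is a rough Python sketch of that second model, as one possible reading of it rather than a definitive algorithm. The names (find_all, compatible, match_strongest_first) are invented for illustration, selectors are literal strings, and "stronger" is assumed to simply mean "later in the list". A weaker match is accepted only if it lies entirely inside or entirely outside every stronger match, the same way ::text can't cross a real element boundary.

```python
def find_all(text, needle):
    """Return (start, end) spans of non-overlapping occurrences, left to right."""
    if not needle:
        return []
    spans, start = [], text.find(needle)
    while start != -1:
        spans.append((start, start + len(needle)))
        start = text.find(needle, start + len(needle))
    return spans


def compatible(weak, strong):
    """True if `weak` does not cross either boundary of `strong`."""
    ws, we = weak
    ss, se = strong
    disjoint = we <= ss or se <= ws
    inside = ss <= ws and we <= se
    return disjoint or inside


def match_strongest_first(text, rules):
    """Return accepted (start, end, rule) spans; later rules are stronger."""
    accepted = []
    for needle in reversed(rules):  # strongest rule gets first pick
        for span in find_all(text, needle):
            if all(compatible(span, (s, e)) for s, e, _ in accepted):
                accepted.append((span[0], span[1], needle))
    return sorted(accepted)


# Partial intersection: the stronger CDEF blocks ABCD entirely.
print(match_strongest_first("ABCDEF", ["ABCD", "CDEF"]))
# [(2, 6, 'CDEF')]

# Pure nesting: CD sits wholly inside the stronger ABCDEF, so both match.
print(match_strongest_first("ABCDEF", ["CD", "ABCDEF"]))
# [(0, 6, 'ABCDEF'), (2, 4, 'CD')]
```

Note that under this reading nesting only works when the outer match is the stronger one; if CD were stronger, ABCDEF would have to cross CD's boundaries and so wouldn't match at all.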

> Or how about if I want to create a user style sheet that highlights my name whenever it appears on any page? That's a perfectly reasonable (albeit egotistical, arguably) styling desire. But it is not reasonable for me to ask every site author on the Web that might mention my name to also put it in bold.

Of course not.  But it takes only a pretty trivial bit of JavaScript to
make it happen, installable as a userscript or a Greasemonkey script or
whatever method the browser has to allow user-defined JavaScript to
run on a page.

>>>>> Here is a way to use a mono-spaced font for numbers:
>>>>>
>>>>> p::text("\d+") { font-family: "Courier New", Courier, monospace; }
>>
>> This one is indeed a pure styling issue.  It could perhaps be
>> addressed another way, though, by having something targeting
>> number-styling directly, either as a specialized selector or as a
>> number-style property or similar.
>
> So every time I want to arbitrarily style something that doesn't already have a tag in HTML, I need to try to convince the HTML working group to implement a new type of tag that somehow matches some meaningful semantics for my choice, and then wait and hope they do it? Or else, hope I have some control over the markup, and insert a completely non-semantic SPAN, as we all do now? With ::text() we have an opportunity to go far beyond that limitation, to much more powerfully style the pages that are given to us (which, in the case of user style sheets are all the pages of the entire Web).

Well, no, you'd have to convince the CSSWG.  ^_^  The point is that
some things are best addressed directly, rather than by a general
solution.

> They are if they are in the markup I'm styling. This proposal isn't an argument about how to best mark things up. If that example really bothers you, then pick another one. Suppose I want to replace every copyright symbol with an exactly sized raster image that my legal department insists on, but still have to have the regular one for people with style sheets off or whatever. Or maybe I want to replace all my periods at the end of sentences with little peace signs or hearts. It may be silly styling, but as a CSS author, I should not have to seek approval from a WG in order to do it. The WG should imbue me with the power to style the markup I am using, whether they like the markup choices I made or not, if it is technically feasible to do so.

I don't believe CSS should be a general page-transformation language.
Some forms of transformation should be within our remit, such as
gaining source-independence in layouts, but not everything.  To
achieve full power would require CSS to become a full programming
language, which I think destroys what simplicity we have left.

(For the copyright issue, that's trivial: <img src=copyright.image
alt=©>.  Why so many people instinctively avoid <img>s I'll never
understand.)

>>>>> Here is a way to quickly strike out your old phone number throughout the
>>>>> site, until you've had a chance to update it, or in case you suspect that it
>>>>> is still on some of the hundreds of pages you don't control but provide CSS
>>>>> files for:
>>>>>
>>>>> ::text("\(555\) 555-5555") { text-decoration: line-through; }
>>
>> This is best accomplished with a search-and-replace across your
>> website.  If you don't have access to some pages, you have a problem
>> that needs to be solved *there*, rather than being hacked around with
>> a CSS backdoor.
>
> It's not hacked. I know I will need to search and replace, and work with my vendors, etc. But just as I might sometimes want to display a "this site is temporarily down" message, or "the page you are looking for is no longer available" message, I might want to use styling as a quick and temporary way to indicate that this bit of info is not valid.

And when you decide that a search-and-replace is harder than just
using ::text("(555) 555-5555") { content: "123-456-7890"; }, and/or
have a boss that decides the 5 seconds it would take you to do so is
better than the maybe-hours you'd have to spend getting access to all
the appropriate servers to do the replacement?  We of course can't
protect authors from themselves, but once you start down the slope of
communicating semantic information (like the fact that a phone number
is invalid) in styling, you're in trouble.

(Of course, we could just put in a restriction that ::text rules don't
allow the content property.  That would head off the worst abuses, at
the cost of being somewhat arbitrary.)

>> Now that it's been pointed out that my main use-case for ::text
>> (fancy, non-semantic styling of page headers) can be addressed equally
>> well by using generated content, and without any of the inherent
>> ambiguities of a full-powered ::text,
>
> Ambiguities can be resolved. The kludginess of that solution is worse, IMO.

Sure, that's very possible.

>> I'm seeing a lot less use for
>> it.
>
> I made the proposal not just to solve that one problem on your one site, but to be a powerful aid to styling. I just made up these examples on short notice, but I am frequently wishing for it in my work, and truly believe it would be very, very useful in a very broad way. Saying it's not useful is like saying that styling a SPAN is not useful, so we shouldn't allow CSS to affect SPAN elements.

Of course you suggested it for more than that one case, but a solution
is defined by its use-cases.  If some of the use-cases are also solved
elsewhere, then the urgency of implementing that particular solution
is reduced.

~TJ
Received on Thursday, 7 January 2010 00:01:45 UTC