Re: Proposal: parent selectors

On Thu, Jan 21, 2010 at 8:04 PM, Boris Zbarsky <bzbarsky@mit.edu> wrote:
> On 1/21/10 1:37 PM, Eduard Pascual wrote:
>>
>> On the extreme cases of mutations every 10ms
>
> It's not the extreme case... it's the common "animate stuff with js" case.
Oops, sorry, I misread 10ms as 10ns, which would indeed be an
insanely extreme case.
For the common case of js-based animations, I'd expect <canvas> to
take over within the next few years (at least for new content; old
content will still be around). A good thing about <canvas> is that
drawing each new frame doesn't trigger any change in the DOM.
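To illustrate, a minimal animation loop could look like this (the
element id and timing are made up for the example): each frame
repaints the canvas bitmap, so no DOM mutation, and hence no selector
re-matching, ever happens.

    var canvas = document.getElementById("scene"); // assumes <canvas id="scene">
    var ctx = canvas.getContext("2d");
    var x = 0;
    setInterval(function () {
        // Repainting only touches pixels; the DOM tree stays
        // untouched, so no style re-matching is triggered.
        ctx.clearRect(0, 0, canvas.width, canvas.height);
        ctx.fillRect(x, 20, 10, 10);
        x = (x + 2) % canvas.width;
    }, 16); // roughly one frame every 16ms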

Of course, pages that use both :has() and "old-style" animations would
suffer a serious performance impact; but I think it's a safe bet to
assume that <canvas> will become widely deployed well before :has()
does. And it shouldn't take long before the web is populated
with tutorials and how-tos for <canvas> animations (Google already
yields millions of results for "html canvas animation"). So any author
who tests his pages would notice the slow-down with
:has()+"old-style" animations; and it wouldn't take much research
to update the page to use <canvas>.
In any case, any claim about the future is pure speculation;
so maybe we should wait a bit before evaluating this specific case.

>> I would expect the authors to know what they are doing
>
> They typically don't, actually.  Unfortunate, but true.
Most users who use javascript without knowing what they are doing rely
on copy-paste; so most probably they are googling for the stuff to
paste into their pages. They can also google for help when they see
that things don't go smoothly.

>> It's a matter of common sense to test an
>> application during development. If a developer finds that an
>> application is going painfully slow, the usual thing to do is to
>> remove or polish the parts of it that cause the lag.
>
> Sort of.  The usual thing to do is to complain about browsers being slow.
If we speak about complex applications, I'd assume there are sane
developers behind them: not everything is within the reach of a newbie
and a soup of <font> tags spat out by {insert random obsolete
authoring tool here}.
Very few noobs write their own JS: they copy it from other sources,
and the actual authors of that code are likely to know better what
it does.

>  Note, in particular, that this gives a leg up to browsers that happen to
> not implement certain features, or not handle dynamic changes properly
> because doing so would be "too slow".
Of course, it is a bad thing when implementing a feature puts the
implementer at a competitive disadvantage. That's why I'm trying to
figure out mechanisms to minimize that disadvantage; most prominently,
I think the impact of implementing this feature should be near zero
for pages that don't use it. For pages that do use the feature, it's
a matter of balancing the cost against the benefits to determine which
agent would be perceived as "best" (for example, if a user compares
two browsers, s/he might be happy to wait a little longer if the page
looks or works enough better... it's a matter of how much longer s/he
waits, and how much better the page looks or works).
For any *experimental* implementation, I'd suggest taking one of these
approaches:
- Add an option to the browser, off by default, to define whether this
(and possibly other experimental features) is enabled: this way
developers could begin trying the feature and testing pages that use
it, but most users wouldn't see a difference either on the sites or in
the browser until the feature becomes standardized (a sketch follows
this list).
- In the case of open-source UAs like Firefox/Gecko, implement the
experimental feature on a separate branch, and don't merge it into
trunk until it becomes a standard (AFAIK, that's how experimental
stuff is normally tried on "community" projects, isn't it?)
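For the first approach, something as simple as a Gecko-style pref
would do; the pref name here is purely hypothetical:

    // user.js -- hypothetical preference, off by default, that a
    // developer could flip to test the experimental selector support:
    user_pref("layout.css.experimental.has-selector", true);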

>> In addition, on the context of dynamic applications, a parent selector
>> wouldn't be as needed as on scriptless context: it may always be
>> emulated by putting a class or event handler on the "parent" element,
>> and doing some tinkering through scripts.
>
> While true, in practice I fully expect it would be used.
Honestly, I can't reply to this. I really wish badly-written pages
sucked as much as they deserve, so those of us who take web authoring
seriously would have an edge in the field; but I don't think there's
much we can do about that :-(

>> For the case of thousands (or more) of updates at once, that's IMO a
>> matter of UA- and application- specific optimizations.
>
> In other words, "not my problem, your problem".  Thanks!  :)
That's *not* what I meant. I was just trying to break the problem
down into two cases so they can be dealt with separately.

>> On simple cases, UAs can be smart enough to notice a sequence of updates
>> so the
>> rendering is only refreshed once.
>
> UAs already do this.  The slow part with :has is not refreshing, in general,
> but figuring out which parts to refresh.
All right. It's good to know where to focus the efforts to increase
the chances of this feature becoming a reality.

>> On more complex cases, a good idea could be to add some sort of API to let
>> the application warn the UA
>> about what's going to happen. I've been doing a lot of .Net coding
>> lately, so something as WinForms' SuspendLayout() and ResumeLayout()
>
> This has been proposed.  Unfortunately, the typical web app would have a
> strong tendency to not call ResumeLayout in various cases; often
> UA-dependent (based on my experiences debugging such things in the past).
>  Add to that that in the near term they'd be in try/catch blocks or
> feature-tested, and UAs that don't implement them actually have a
> competitive advantage (in the "stuff doesn't break in this browser" sense).
>  Therefore there's little incentive for UAs to implement them.
Missing calls to ResumeLayout would be a problem, but a solvable one.
Off the top of my head, the UA could "resume" rendering after some
delay with no scripts running. The details would still need to be
sorted out, and maybe there are better solutions; but one example of
a solution is enough to prove that a solution exists.
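To sketch what I mean (neither method is a real API; the names are
made up by analogy with WinForms, and the helper is hypothetical):

    document.suspendLayout();
    try {
        performManyDomUpdates(); // made-up helper for the example
    } finally {
        document.resumeLayout(); // ideally always reached...
    }
    // UA-side safety net (conceptual): if layout is still suspended
    // after, say, 50ms with no script running, the UA resumes anyway,
    // so a missing resumeLayout() call can't freeze rendering forever.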

>
>> I think of two forms of flagging; and I'll admit that both are based
>> on an assumption: this feature would normally be used on a small
>> subset of the rulesets and elements of a page.
>
> The former is probably a good assumption.  The latter may not be depending
> on what "used" means here....
I'm not sure I know what I meant by "used" there; but I know what
you mean. I think I meant "the author expects it to be matched under
some circumstances"; but I now realize that the assumption is then
useless (that is, it can't be exploited to optimize the matching and
updating process): I overlooked the difference between what an author
knows about the document and what the UA knows.
>
>> Also, I think it's reasonable for the cost of a feature to be proportional
>> or dependent
>> on how much it is used (if you disagree on this, I'd really like to
>> hear (or read) your opinion).
>
> I'm not sure what the statement means...  Are you saying that it's OK for a
> feature to cause all performance on the page to go down the tubes as long as
> not many pages use it?  Or something else?
Something else: it's OK for a feature to have a greater impact on a
page that abuses it, as long as pages that don't use it, or use it
sparingly, are much less affected.
For example, it's OK for pages with dozens of :has() selectors to be
significantly slowed down, as long as pages that have at most two or
three selectors of this kind only suffer a slight impact. And
ideally, the impact on pages that don't use the feature at all should
be negligible.

>> Form 1: An array of pointers to the "deferred" rulesets (any ruleset
>> that uses this feature or, in the future, any other feature with
>> similar performance issues), and an array of pointers to the
>> "critical" elements for these rulesets. In the case of a "parent" or
>> "ancestor" selector, this would be the any element that matches the
>> part of the selector before ":has(".
>
> This latter list is quite likely to include a significant fraction of the
> document, as I said earlier in this thread.
>
>> With either form, changes within a "critical" element
>
> Determining whether a change is within a "critical" element is not
> necessarily particularly cheap, right?  Or requires a bunch of
> state-propagation in the DOM.  Or both.
You would know that better than I do, since I don't know much about
how the tree is represented internally within the engine.
It can be described in terms of event propagation: if the objects that
represent an element have a "changed" event and a "parent" property,
then on the changed event the engine would test whether the element is
critical, or directly test it against the "deferred" selectors, and
then fire the parent's "changed" event.
Having read all the replies since my last mail (particularly yours),
it looks like the best form of flagging would be "Form 1" (an array of
pointers/references/whatever) for the "deferred" rules, and "Form 2"
(a boolean flag on the object itself) for the "critical" elements.
With that, under the event model described above, the time cost of
determining whether a change to element "e" is within a critical
element would be O(depth(e)) in the general case. Given the nature of
the feature, I'm afraid that's probably the best that can be achieved.
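In code, that walk might look roughly like this (a toy model, not
engine internals; "isCritical" and "retestDeferredRules" are assumed
names for the bookkeeping described above):

    // Form 2: each element object carries a boolean "critical" flag.
    function onElementChanged(e) {
        // Fire the "changed" event up the ancestor chain: O(depth(e)).
        for (var node = e; node; node = node.parentNode) {
            if (node.isCritical)
                retestDeferredRules(node); // re-test this subtree
        }
    }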

>> would require recomputing the "deferred" styles
>
> Meaning what?  Some set of nodes would need to have their styles recomputed.
>  What set?
Sorry, after re-reading my own words I realized that a blindfolded
monkey typing with its tongue could have expressed my idea better than
I did. Forget the phrase "changes within a "critical" element would
require recomputing the "deferred" styles, but other changes wouldn't
be affected". A change on a "critical" element would require
re-testing the "deferred" rulesets on that element and its
descendants: if a ruleset that didn't previously match now matches, or
vice-versa, the relevant changes to the element's style should be
applied (I'm not sure exactly what this would imply, but it should be
comparable to what changing the element's @class would imply). I'm
already thinking of further optimizations to reduce the number of
descendants that need to be tested; but it'll take me a while to turn
the ideas into something concise, so I'll post them later or tomorrow.
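A sketch of that re-test (again a toy model; "deferredRules",
"lastMatched" and "applyStyleDiff" are made-up names, and
querySelectorAll stands in for whatever matching machinery the engine
really uses):

    function retestDeferredRules(criticalEl) {
        // Form 1: "deferredRules" is the array of rulesets that use
        // the feature; nothing else needs re-testing.
        for (var i = 0; i < deferredRules.length; i++) {
            var rule = deferredRules[i];
            var nowMatched = criticalEl.querySelectorAll(rule.selector);
            // Any element that started or stopped matching gets its
            // style recomputed, much as an @class change would.
            applyStyleDiff(rule, rule.lastMatched, nowMatched);
            rule.lastMatched = nowMatched;
        }
    }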

>> In the abstract, it seems that these (or any other) form of
>> flagging would allow determining what needs to be updated much faster
>> than by matching again each selector against the document tree on
>> every DOM update.
>
> What _really_ needs to happen is to:
>
> 1)  Determine what elements might have changes in the set of rules
>    applying to them.
> 2)  For those elements, determine the new set of rules.
I hope the clarification above answers this:
1) Upon a change to an element, its "critical" ancestors and their
descendants "might" have changes. (As I said before, I already have in
mind some ideas that would allow reducing this set significantly.)
2) For those elements, the new set of rules would be determined by
checking them again against the "deferred" selectors.

> To avoid full selector matching step 2 would need to just test the rules
> that might have changed or not changed and remove them from the lists or add
> them in the right places in the lists.   It would also need to invalidate
> any information that was cached in the rule lists (in Gecko's case, for
> example, a fair amount) and so forth.  It may be faster to do this than to
> just redo matching.  It might not be.  Hard to tell up front.  It would
> certainly not be cheap.
I don't know anything about the specifics of caching, so I can't speak
to that. In any case, only the selectors from the "deferred" rulesets
would need to be re-tested.

> As for the subset of the document affected (step 1 above), if we're talking
> about doing this for all descendants of "critical" elements then this will
> often be a big chunk of the overall DOM.
I have to say again that I'll post back soon with further mechanisms
to reduce the set of descendants to check.

>> (I'm not asking if this would be enough to
>> solve the issues, just if it would be enough to solve or mitigate part
>> of them.)
>
> Sure.  If this were implemented something like that would absolutely have to
> be done.  It would slow down normal selector matching and style computation,
> I suspect, and it would not necessarily help enough in common-enough
> :has/:matches cases.  But the only way to tell for sure is to spend some
> time implementing it and see, of course.
>
>> Again, please let me know: do you think my suggestion/idea about
>> flagging would help mitigating that cost?
>
> In some cases, yes.  In others, no.  In yet others it actually makes it more
> expensive than just restyling everything.
>
> The question is the relative frequency of those cases...
>
>>> Pretty much anything that involves spending more than 10ms on a set of
>>> DOM
>>> mutation is a hit, right (since it's directly detectable from script as a
>>> framerate fall)?
>>
>> How much of a hit would it be compared to a script that:
>> 1) Tweaks the loaded CSS replacing each instance of the :has()
>> selector with a new class name, and keeps a copy of the original in a
>> variable.
>> 2) Upon page load and upon each change of the document:
>>   2.1) Checks for each element to see if a :has() rule applies to it.
>>   2.2) If the rule would match, adds the relevant class name to the
>> @class attribute of the matched element, so it matches the tweaked CSS
>> rule.
>>   2.3) If the rule doesn't match AND the element contains the relevant
>> class, remove the class so it doesn't match the tweaked CSS rule
>> anymore.
>
> Much less, of course.  ;)
>
>> I'm asking this because, for as long as UAs don't provide this
>> feature, this is the only solution for the general case.
>
> Which means that no one uses the general case, especially not by accident,
> right?
Right. And that also means that those authors who need this kind of
checking go the "site-specific" way, which is extremely hard to keep
in sync with changes to the page. This kind of insane code is what
crowds the web with so many glitches and cross-browser
inconsistencies. In addition, so many authors reinventing (part of)
the wheel for each site is overkill, and added up it translates into
many hours of work that could otherwise have been invested in
providing more and better features for the end user.
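To make the quoted steps concrete, a rough sketch (the rule list is
hand-maintained here because a real UA drops rules whose selectors it
can't parse, so step 1 would in practice have to fetch and rewrite the
stylesheet source; all names are illustrative):

    // Each entry stands for a rewritten rule: the stylesheet contains
    // ".js-has-0 { ... }" where the author wanted "div:has(img)".
    var emulatedRules = [
        { scope: "div", test: "img", className: "js-has-0" }
    ];

    function refreshEmulatedRules() { // steps 2.1-2.3 from the quote
        for (var i = 0; i < emulatedRules.length; i++) {
            var rule = emulatedRules[i];
            var candidates = document.querySelectorAll(rule.scope);
            for (var j = 0; j < candidates.length; j++) {
                var el = candidates[j];
                var re = new RegExp("\\s*\\b" + rule.className + "\\b", "g");
                el.className = el.className.replace(re, "");       // 2.3
                if (el.querySelector(rule.test))                   // 2.1
                    el.className += " " + rule.className;          // 2.2
            }
        }
    }
    // Must run on load and after *every* DOM change -- which is
    // exactly the cost being discussed in this thread.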

The fact is that there is a rising demand for a parent selector,
exemplified by the number of times this request has been brought up
for discussion on this list (and a need, proven by the use cases
brought forward in those discussions). Sooner or later, this will get
into CSS. It is true that it has a significant impact on performance,
and mitigating that is an implementation challenge (as well as a task
for those of us who want this sooner rather than later: to help
implementers as much as we can to overcome that challenge).

I'm aware some of my mails may seem "aggressive" or hostile; but my
only goal is to help improve the Web. And, in the discussion about
the parent selector, that involves trying my best to find a solution
for the implementation issues this feature would raise. I'd like to
apologize for the tone of my first mail in this thread: I got too
heated the last time this was discussed, and bringing back the topic
has also brought back the frustration from then. Honestly, I think we
are making more progress than the last time I participated in a
discussion on this topic; which is IMO a good thing.

Oh, and I also think it's a good time to thank you, Boris, for your
participation: I'm convinced the insight you have provided into the
specifics of CSS processing (at least in Gecko) is really helpful for
getting closer to a solution. I bet I'm not the only one on these
lists with experience in programming in general but not in CSS
rendering engines; we can contribute abstract ideas, but we rely on
guidance like yours so we don't go too "blind" into the field.

Regards,
Eduard Pascual
