Re: QSA, the problem with ":scope", and naming

On Wed, Oct 19, 2011 at 7:22 PM, Ojan Vafai <ojan@chromium.org> wrote:
> On Wed, Oct 19, 2011 at 7:07 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>> On Tue, Oct 18, 2011 at 9:42 AM, Alex Russell <slightlyoff@google.com>
>> wrote:
>> > Lachlan and I have been having an...um...*spirited* twitter discussion
>> > regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
>> > and ":scope". He asked me to continue here, so I'll try to keep it
>> > short:
>> >
>> > The rooted forms of "querySelector" and "querySelectorAll" are
>> > mis-designed.
>> >
>> > Discussions about a Scoped variant or ":scope" pseudo tacitly
>> > acknowledge this, and the JS libraries are proof in their own right:
>> > no major JS library exposes the QSA semantic, instead choosing to
>> > implement a rooted search.
>> >
>> > Related and equally important, that querySelector and querySelectorAll
>> > are often referred to by the abbreviation "QSA" suggests that their names
>> > are bloated and improved versions should have shorter names. APIs gain
>> > use both through naming and through use. On today's internet -- the
>> > one where 50% of all websites include jQuery -- you could even go with
>> > element.$("selector") and everyone would know what you mean: it's
>> > clearly a search rooted at the element on the left-hand side of the
>> > dot.
>> >
>> > Ceteris paribus, shorter is better. When there's a tie that needs to
>> > be broken, the more frequently used the API, the shorter the name it
>> > deserves -- i.e., the larger the component of its meaning it will gain
>> > through use and repetition and not naming and documentation.
>> >
>> > I know some on this list might disagree, but all of the above is
>> > incredibly non-controversial today. Even if there may have been
>> > debates about scoping or naming when QSA was originally designed,
>> > history has settled them. And QSA lost on both counts.
>> >
>> > I therefore believe that this group's current design for scoped
>> > selection could be improved significantly. If I understand the latest
>> > draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
>> > correctly, a scoped search for multiple elements would be written as:
>> >
>> >   element.querySelectorAll(":scope > div > .thinger");
>> >
>> > Both the name and the need to specify ":scope" are punitive to
>> > readers and writers of this code. The selector is *obviously*
>> > happening in relationship to "element" somehow. The only sane
>> > relationship (from a modern JS hacker's perspective) is that it's
>> > where our selector starts from. I'd like to instead propose that we
>> > shorten all of this up and kill both stones by introducing a new API
>> > pair, "find" and "findAll", that are rooted as JS devs expect. The
>> > above becomes:
>> >
>> >   element.findAll("> div > .thinger");
>> >
>> > Out come the knives! You can't start a selector with a combinator!
>> >
>> > Ah, but we don't need to care what CSS thinks of our DOM-only API. We
>> > can live and let live by building on ":scope" and specifying find* as
>> > syntactic sugar, defined as:
>> >
>> >   HTMLDocument.prototype.find =
>> >   HTMLElement.prototype.find = function(rootedSelector) {
>> >     return this.querySelector(":scope " + rootedSelector);
>> >   }
>> >
>> >   HTMLDocument.prototype.findAll =
>> >   HTMLElement.prototype.findAll = function(rootedSelector) {
>> >     return this.querySelectorAll(":scope " + rootedSelector);
>> >   }
>> >
>> > Of course, ":scope" in this case is just a special case of the ID
>> > rooting hack, but if we're going to have it, we can kill both birds
>> > with it.
>> >
>> > Obvious follow up questions:
>> >
>> > Q.) Why do we need this at all? Don't the toolkits already just do
>> > this internally?
>> > A.) Are you saying everyone, everywhere, all the time should need to
>> > use a toolkit to get sane behavior from the DOM? If so, what are we
>> > doing here, exactly?
>> >
>> > Q.) Shorter names? Those are for weaklings!
>> > A.) And humans. Who still constitute most of our developers. Won't
>> > someone please think of the humans?
>> >
>> > Q.) You're just duplicating things!
>> > A.) If you ignore all of the things that are different, then that's
>> > true. If not, well, then no. This is a change. And a good one for the
>> > reasons listed above.
>> >
>> > Thoughts?
>>
>> I like the general idea here. And since we're changing behavior, I
>> think it's a good opportunity to come up with shorter names. Naming is
>> really hard. The shorter names we use, the more likely it is that
>> we're going to break webpages which are messing around with the
>> prototype chain and it increases the risk that we'll regret it later
>> when we come up with even better functions which should use those
>> names. Say that we come up with an even better query language than
>> selectors, at that point .find will simply not be available to us.
>>
>> However, it does seem like selectors are here to stay. And as much as
>> they have shortcomings, people seem to really like them for querying.
>>
>> So with that out of the way, I agree that the CSS working group
>> shouldn't be what is holding us back. However we do need a precise
>> definition of what the new function does. Is prepending ":scope " and
>> then parsing as a normal selector always going to give the behavior we
>> want? This is actually what I think we got stuck on when the original
>> querySelector was designed.
>>
>> So let's get into specifics about how things should work. According to
>> your proposal of simply prepending a conceptual ":scope" to each
>> selector group, for the following DOM:
>>
>> <body id="3">
>>  <div id="context" foo=bar>
>>    <div id=1></div>
>>    <div class="class" id=2></div>
>>    <div class="withChildren" id=3><div class=child id=4></div></div>
>>  </div>
>> </body>
>>
>> you'd get the following behavior:
>>
>> .findAll("div")  // returns ids 1,2,3,4
>> .findAll("")      // returns the context node itself. This was
>>                   // indicated as undesirable
>> .findAll("body > :scope > div")  // returns nothing
>
> Wouldn't this return ids 1,2,3 if we're not prepending :scope as you say
> below?
>
>>
>> .findAll("#3")  // returns id 3, but not the body node
>> .findAll("> div") // returns ids 1,2,3
>> .findAll("[foo=bar]") // returns nothing
>> .findAll("[id=1]") // returns id 1
>> .findAll(":first-child") // returns id 1
>>
>> Is this desired behavior in all cases except the empty string? If so
>> this seems very doable to me. We can easily make an exception for the
>> case when the passed in string contains no selectors and make that an
>> error or some such.
>>
>> I do however like the idea that if :scope appears in the selector,
>> then this removes the prepending of ":scope " to that selector group.
>> Is there a reason not to do that?
>>
>> Additionally it seems to me that we could allow the same syntax for
>> <style scoped>. But maybe others disagree?
>
> Sounds good to me. A sticky case you left out is parent, sibling and
> reference combinators.
> .findAll("+ div")
> Assuming the context node has siblings, should that return them?

Indeed, this is a very sticky case. Depending on how you interpret the
selector, this would either never return anything, or it would return
all following siblings of the context node with localName "div".

I think that most authors who use jQuery today would expect all
siblings to be returned. But it's arguably either significantly harder
to implement, or significantly slower to execute.

I.e. implementations would have to completely change their
implementation strategy from the current "test all nodes that could
possibly match and see if they match against the selector" to
"evaluate each step of the selector as an expression which returns a
set of nodes, and use that set of nodes as input into the next step of
the selector".
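
To make the difference concrete, here is a rough sketch of the two
strategies. It is not how any engine actually implements this. It
assumes a tiny plain-object tree instead of the real DOM, a selector
that is already parsed into { combinator, tag } steps, and "+" treated
as the adjacent-sibling combinator. All names (makeNode,
findByFiltering, findBySteps) are made up for illustration.

```javascript
// Plain-object stand-in for a DOM tree (hypothetical, not the real DOM API).
function makeNode(tag, id, children = []) {
  const n = { tag, id, parent: null, children };
  for (const child of children) child.parent = n;
  return n;
}

// <body><div id=context><div id=1/><span id=2/></div><div id=sib/></body>
const context = makeNode("div", "context", [
  makeNode("div", "1"),
  makeNode("span", "2"),
]);
const sib = makeNode("div", "sib");
makeNode("body", "root", [context, sib]);

// All nodes below n, in document order.
function descendants(n) {
  return n.children.flatMap((c) => [c, ...descendants(c)]);
}

// Sibling of n at the given offset (+1 = next, -1 = previous), or null.
function siblingAt(n, offset) {
  if (!n.parent) return null;
  const sibs = n.parent.children;
  return sibs[sibs.indexOf(n) + offset] || null;
}

// Strategy 1: test every node that could match (the descendants of the
// scope) against a single-step selector.
function findByFiltering(scope, step) {
  return descendants(scope).filter((candidate) => {
    if (candidate.tag !== step.tag) return false;
    if (step.combinator === ">") return candidate.parent === scope;
    if (step.combinator === "+") return siblingAt(candidate, -1) === scope;
    return true; // descendant combinator: every candidate qualifies
  });
}

// Strategy 2: each step maps a set of nodes to a new set of nodes.
function evaluateStep(inputSet, step) {
  const output = new Set();
  for (const n of inputSet) {
    let related;
    if (step.combinator === ">") related = n.children;
    else if (step.combinator === "+") related = [siblingAt(n, 1)];
    else related = descendants(n);
    for (const r of related) {
      if (r && r.tag === step.tag) output.add(r);
    }
  }
  return [...output];
}

function findBySteps(scope, steps) {
  return steps.reduce(evaluateStep, [scope]);
}

const plusDiv = { combinator: "+", tag: "div" };
const filtered = findByFiltering(context, plusDiv); // no matches
const stepped = findBySteps(context, [plusDiv]);    // the "sib" node
```

With the filtering strategy the sibling is never even considered,
since only descendants of the context node are candidates. The
step-wise strategy reaches it, at the cost of building an intermediate
node set for every step.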

It's definitely implementable, but will require significantly more
work for implementations. But given how commonly selector-querying is
done, it just might be worth doing.

> If so,
> should it match siblings when using <style scoped>?
> IMO, it shouldn't match anything in either case. We should assert that
> only descendants of the scope element will ever be returned. This would also
> make it naturally match <style scoped> where only descendants of the scope
> element are ever affected.

Indeed, in scoped stylesheets it seems very awkward to match siblings
of the stylesheet scope. I think I agree that in that context
selectors like "+ div" (or ":scope + div") shouldn't match anything.
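
As a rough illustration of that descendants-only rule, one could think
of it as a filter applied after matching. This is only a sketch on
plain objects, not the real DOM, and isDescendantOf and
restrictToScope are hypothetical names.

```javascript
// Plain-object stand-in for DOM nodes (hypothetical, not the real DOM API).
function makeNode(id, children = []) {
  const n = { id, parent: null, children };
  for (const child of children) child.parent = n;
  return n;
}

const child = makeNode("child");
const scope = makeNode("scope", [child]);
const sibling = makeNode("sibling");
makeNode("root", [scope, sibling]);

// True when n is strictly below scope in the tree.
function isDescendantOf(n, scope) {
  for (let p = n.parent; p; p = p.parent) {
    if (p === scope) return true;
  }
  return false;
}

// Whatever the selector matched, keep only descendants of the scope.
function restrictToScope(matches, scope) {
  return matches.filter((n) => isDescendantOf(n, scope));
}

// A "+ div"-style match on the sibling is dropped; the child is kept.
const kept = restrictToScope([child, sibling, scope], scope);
```

Under this rule neither the sibling nor the scope element itself
survives the filter.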

I'm less convinced that that is a good idea for .find/.findAll.

Would love input from Alex and Yehuda here.

/ Jonas

Received on Thursday, 20 October 2011 05:41:19 UTC