XPath and Selectors are identical, and shouldn't be co-developed

Disclaimer: I'm a CSSWG member, was previously a web developer by
trade (now am a WebKit engineer/spec author), and was only vaguely
aware of XPath a week ago.

In the recent discussions about XPath, I keep hearing a particular
theme that I believe is untrue, namely that XPath and Selectors
address different use-cases.  For example, Liam recently sent an email
containing the following:

"XPath selectors give a different way of looking at finding things than
CSS selectors and probably appeal in differing amounts to different
people."

"XPath has different goals from CSS selectors, and there's not actually a
battle between them."

Neither of these is true.  The second could have been defended as
true several years ago, but not today.  I will defend my statement,
and then make the further argument that, due to the two being
identical, it is a bad idea to develop both of them.

Both Selectors and XPath take some arbitrary notion of "nodes" with
certain aspects and a tree structure and, starting from the set of all
relevant nodes, repeatedly filter and transform the set until arriving
at a result.  They both do this in effectively identical ways; this
is a concrete, feature-by-feature correspondence, not an appeal to
something like "Turing equivalence", which can easily be meaningless.

Selectors has two base concepts: simple selectors (which can be
combined) for filtering, and combinators for transforming.  XPath has
two base concepts: node tests and predicates (which can be combined)
for filtering, and axes for transforming.
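
A minimal sketch of that shared model, assuming Python's standard
xml.etree.ElementTree as the tree.  The helper names (all_nodes,
by_name, children, descendants, ids) are hypothetical and come from
neither spec; the point is only that a child or descendant step reads
as the same filter/transform/filter pipeline in both languages:

```python
import xml.etree.ElementTree as ET

DOC = ET.fromstring(
    "<root>"
    "<a><b id='b1'/><c><b id='b2'/></c></a>"
    "<b id='b3'/>"
    "</root>"
)

def all_nodes(root):
    """Starting set: every element in the tree, in document order."""
    return list(root.iter())

def by_name(nodes, name):
    """Filter: keep only the nodes with the given element name."""
    return [n for n in nodes if n.tag == name]

def children(nodes):
    """Transform: replace each node with its child elements."""
    return [c for n in nodes for c in n]

def descendants(nodes):
    """Transform: replace each node with all of its descendants."""
    seen, out = set(), []
    for n in nodes:
        for d in n.iter():
            if d is not n and id(d) not in seen:
                seen.add(id(d))
                out.append(d)
    return out

def ids(nodes):
    """Helper for display: the id attribute of each node."""
    return [n.get("id") for n in nodes]

# Selectors "a > b" / XPath "//a/b": filter, child transform, filter.
child_result = by_name(children(by_name(all_nodes(DOC), "a")), "b")

# Selectors "a b" / XPath "//a//b": filter, descendant transform, filter.
descendant_result = by_name(descendants(by_name(all_nodes(DOC), "a")), "b")

print(ids(child_result))       # ['b1']
print(ids(descendant_result))  # ['b1', 'b2']

# ElementTree's limited XPath subset gives the same answers.
print(ids(DOC.findall(".//a/b")))   # ['b1']
print(ids(DOC.findall(".//a//b")))  # ['b1', 'b2']
```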

Transformations
---------------

Selectors can transform a set to each node's children, descendants,
next sibling, or following siblings.  In the planned future as
expressed in Selectors 4, it will also gain the ability to follow an
IDREF attribute (the reference combinator) and gain a limited ability
to reverse any of the existing combinators (the subject indicator).
In the planned-but-not-yet-written future, the combinator-reversing
ability will be completed to allow arbitrary reversals.

XPath can also transform to a node's children, descendants, or
following siblings.  It doesn't have a shortcut for next sibling, but
that can be easily done with a position predicate ("A + B" is the same
as "A/following-sibling::*[1][self::B]").  It has shortcuts for two
transformations which can be expressed in Selectors with some
duplication ("A/following::B" is roughly "A ~ B, A ~ * B", though the
full axis also reaches nodes that follow A's ancestors, and
"A/descendant-or-self::B" is the same as "AB, A B"; both can be
written with less duplication when :matches() is expanded in the
future).  It has full reversal of all of these transformations.  It
has two transformations (attribute and namespace) which appear to be
"transformations" just for syntax reasons - they're simple selectors in
Selectors.  Finally it has the "self" transformation which exists
solely for syntax reasons.  There doesn't appear to be a way to
duplicate Selectors' reference combinator in XPath (I'm not sure if
"/B[id(/A/@for)]" or something like that is valid?).

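The sibling transformations can be sketched the same way.  This is an
illustration only, again assuming Python's xml.etree.ElementTree; the
PARENT map and the helpers following_siblings and next_sibling are
hypothetical names, needed because ElementTree keeps no parent
pointers and has no sibling axis of its own.  Note that when
emulating "+", the order of the position test and the name test
matters:

```python
import xml.etree.ElementTree as ET

DOC = ET.fromstring(
    "<root><a/><c id='c1'/><b id='b1'/><b id='b2'/></root>"
)

# ElementTree elements do not know their parent, so build a lookup.
PARENT = {child: parent for parent in DOC.iter() for child in parent}

def following_siblings(node):
    """Transform for Selectors '~' / XPath 'following-sibling::*'."""
    parent = PARENT.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]

def next_sibling(node):
    """Transform for Selectors '+': the first following sibling, if any."""
    return following_siblings(node)[:1]

def ids(nodes):
    """Helper for display: the id attribute of each node."""
    return [n.get("id") for n in nodes]

a = DOC.find("a")

# "a ~ b": following-sibling transform, then name filter.
general = [n for n in following_siblings(a) if n.tag == "b"]

# "a + b": take the first following sibling, then apply the name filter.
adjacent = [n for n in next_sibling(a) if n.tag == "b"]

# Name filter first, then position: the first following sibling named b.
first_b = general[:1]

print(ids(general))   # ['b1', 'b2']
print(ids(adjacent))  # [] because the next sibling of <a> is <c>
print(ids(first_b))   # ['b1']
```
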
So, on the transformation side, the two are effectively identical.
Selectors has a transformation that can't be expressed in XPath, while
XPath has several reverse-document-order transformations that can only
partially be expressed in current planned Selectors, and will need to
wait for planned future extensions to be fully expressible.  XPath
also has a few convenience transformations that require a bit more
work to express in Selectors.

Filtering
---------

There's a lot more to look at in the filtering side, so I won't go
into detail.  In short, XPath can select elements by name, id, or
language, can do arbitrary simple math expressions over the size of
the current node set or the position of a node within it, and can do
arbitrary simple string manipulation and testing over the name or
namespace of a node or the name or value of an attribute.  Selectors
can select elements by name, namespace, id, class, language, attribute
name or value, can do a limited form of math over the position of a
node within its siblings, can do some types of string manipulation and
testing over attribute values, and then a bunch of other selectors,
mostly specialized for HTML.
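
A few of these filters, sketched as plain Python predicates over an
ElementTree document.  The sample markup and the helper names
(nth_child, texts) are made up for illustration, and the selector and
XPath expressions in the comments are the rough counterparts of each
filter rather than anything the code evaluates:

```python
import xml.etree.ElementTree as ET

DOC = ET.fromstring(
    "<ul>"
    "<li lang='en'>one</li>"
    "<li class='x'>two</li>"
    "<li class='x y'>three</li>"
    "<li data-k='prefix-4'>four</li>"
    "</ul>"
)

def nth_child(parent, a, b):
    """Filter like ':nth-child(an+b)': 1-based position among siblings."""
    out = []
    for pos, child in enumerate(parent, start=1):
        if a == 0:
            matches = pos == b
        else:
            # pos must equal a*n + b for some integer n >= 0.
            matches = (pos - b) % a == 0 and (pos - b) // a >= 0
        if matches:
            out.append(child)
    return out

def texts(nodes):
    """Helper for display: the text content of each node."""
    return [n.text for n in nodes]

# Position math.  Selectors: li:nth-child(2n+1).
# Every child here is an li, so XPath: li[position() mod 2 = 1].
odd = nth_child(DOC, 2, 1)

# Class test.  Selectors: li.x.  XPath needs string functions, e.g.
# li[contains(concat(' ', @class, ' '), ' x ')].
with_class = [li for li in DOC if "x" in li.get("class", "").split()]

# String test on an attribute value.  Selectors: li[data-k^="prefix"].
# XPath: li[starts-with(@data-k, 'prefix')].
with_prefix = [li for li in DOC if li.get("data-k", "").startswith("prefix")]

print(texts(odd))          # ['one', 'three']
print(texts(with_class))   # ['two', 'three']
print(texts(with_prefix))  # ['four']

# ElementTree's XPath subset supports exact attribute matches directly.
print(texts(DOC.findall("li[@class='x']")))  # ['two']
```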

Again, the two are almost the same.  XPath is slightly more powerful
with math and string testing, while Selectors has a wider set of
built-ins that are better specialized for HTML.  Within the context of
Selectors used in JS, we are planning to make it very easy to use the
two together, which allows for all the functionality of XPath and much
more.

The Goals of the Two
--------------------

Historically, Selectors was designed for CSS while XPath was designed
for XSLT.  This meant that Selectors inherited a bunch of performance
requirements that XPath didn't, since CSS had to apply continuously to
a dynamic document while XSLT was a one-time transformation.  However,
Selectors is now also used in JavaScript, where it is also a one-time
affair.

Thus, within the context of JS, Selectors and XPath solve identical
problems with identical performance concerns.  There is no difference
in their goals.

Moreover, as I've shown above, they don't provide "different
[ways] of looking at finding things".  They have only minor
differences in functionality, and a slightly different syntax.  They
are otherwise identical.

Since XPath and Selectors are 95% overlapping in functionality and
100% overlapping in goals and overall structure, I believe it is a bad
idea to try to develop both for the web platform.  Instead, we should
continue to develop Selectors, as it has clear mindshare on the web,
and possibly mine XPath for future ideas for Selectors.  There's not
much there left to mine, luckily, since Selectors has already absorbed
the majority of it.  If we explicitly develop a "batch processors"
profile for Selectors, we can more freely grab the last bits of
XPath's axes.

~TJ

Received on Tuesday, 29 November 2011 19:08:29 UTC