Selectors API Method Names

Hi,
   The naming of the methods in the Selectors API specification has been 
heavily debated.  I understand the reason for this debate.  There is a 
long history of bad naming (XMLHttpRequest, AJAX, etc.) and there is 
clearly a desire to avoid that again.  However, it's virtually 
impossible to come up with the perfect name, since everyone has a 
different opinion of what that may be.

My rationale in resolving this issue has not depended upon the majority 
opinion.  Arguments based on nothing more than personal preference have 
not been given much weight, simply because everyone has different 
opinions and it's impossible to rank one above another.  All arguments 
based on logic, reasoning or technical issues have been taken into 
account and judged on their own merit, not the credentials of the people 
making the arguments.

With that in mind, and given that the arguments that have been put 
forward are sometimes mutually exclusive, please understand that 
whatever the final decision, it is simply not possible to please 
everyone, but I hope we can all accept it.


*Rationale for Naming Principles*

A short name is important for several reasons.  Since these methods are 
designed, and behave, as a superset of the existing getElementsBy* 
methods, it is likely that their use will significantly replace the use 
of previous methods.

Based on evidence from many widely used JS libraries, and other 
feedback, we know that authors generally prefer shorter names over 
longer names.  This is particularly true for methods that will be 
frequently used.

Not only does a shorter name reduce the amount of typing, it can help to 
improve the readability of the source code, which in turn makes code 
easier to maintain.  Given these reasons, I have concluded that, within 
reason, short names are a fundamental requirement that must be adhered to.

Ideally, the chosen method names would be relatively clear and 
unambiguous with regard to their purpose, usage and return value. 
However, this must be put into perspective; the names do not need to 
describe these aspects perfectly.

One of the few names that meets this specific requirement perfectly is 
getElementsByGroupOfSelectors, but that name is clearly too long.  An 
appropriate balance needs to be found between clarity and convenience.

With a somewhat intuitive name, it is expected that frequent use by 
authors will alleviate any remaining concerns about the constant need to 
refer to documentation arising from any ambiguity in the name.  For 
example, despite the name being completely non-descriptive, many authors 
frequently use $() as an alias for document.getElementById() without the 
need to constantly look up what it means.
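
As a minimal sketch of the kind of alias referred to above (not any 
particular library's implementation), the following defines $() as a 
thin wrapper around getElementById().  The `doc` stand-in object is a 
hypothetical stub, an assumption added only so the sketch also runs 
outside a browser; in a page it falls through to the real document.

```javascript
// Use the real document in a browser; otherwise fall back to a
// hypothetical stub that returns a plain object carrying the id.
const doc = typeof document !== "undefined"
  ? document
  : { getElementById: (id) => ({ id }) };

// The alias itself: short and non-descriptive, yet easy to remember
// once it has been used a few times.
const $ = (id) => doc.getElementById(id);

const header = $("header");
```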

The name must not clash with any existing DOM API, nor be too similar to 
an API that could cause confusion amongst authors.  It is also somewhat 
important to choose a name that is less likely to clash with a future 
API, though this is difficult because it is impossible to predict the 
future.  It can, however, be helped by choosing a relatively unambiguous 
name.  For example, choosing selectAll() as one of the names would cause 
significant confusion because the name is commonly associated with, and 
very similar to, text selection APIs.

For the two methods, it is desirable that the chosen names are somewhat 
related to each other.  However, to keep code readable, the two names 
cannot be too similar to each other, because that would reduce the 
maintainability of code.

It has been argued that the chosen names should be in line with the 
conventions of existing DOM APIs, including:

* getElementById
* getElementsByName
* getElementsByTagName
* getElementsByTagNameNS
* getElementsByClassName (proposed in HTML5)

The convention is to use getElement* for methods that return a single 
node and getElements* for methods that return multiple nodes.  However, 
given that we need two separate methods for these APIs, this is not 
desirable as it conflicts with the need for readability.

For example, choosing getElementBySelector() and getElementsBySelector() 
would not be good choices because they only differ by a single 
character.  This would make it easier to make a mistake by typing the 
wrong method, and also make it more difficult to recognise the error 
when debugging code.
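
To put a rough number on "differ by a single character", the sketch 
below computes the Levenshtein edit distance between two method names. 
The editDistance() helper is a name introduced here purely for 
illustration, and edit distance is only one possible proxy for how 
easily two names are confused.

```javascript
// Levenshtein distance: the minimum number of single-character
// insertions, deletions or substitutions needed to turn a into b.
function editDistance(a, b) {
  // prev[j] is the distance between the first i-1 characters of a
  // and the first j characters of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete a character from a
        curr[j - 1] + 1,    // insert a character into a
        prev[j - 1] + cost  // substitute, or keep if equal
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

const similarPair = editDistance("getElementBySelector", "getElementsBySelector");
const distinctPair = editDistance("selectElement", "selectAllElements");
```

Here similarPair is 1 and distinctPair is 4, which matches the 
intuition that a one-letter typo silently turns the first method into 
the second, while the latter pair is harder to mistype.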

There is precedent for not following this convention for similar 
methods.  DOM Level 3 XPath defines the evaluate() method for evaluating 
XPath expressions. (Although, that method is slightly different because 
it has a variable return type.)

In addition, Microsoft's .NET uses selectSingleNode() and selectNodes() 
for their proprietary XPath implementation.  Unfortunately, this also 
means that those and similar names cannot be used for these APIs.


*Summary of Naming Principles*

* Short
* Somewhat descriptive of the functionality
* Clear, concise and relatively unambiguous
* Avoid clashes with other APIs due to ambiguous naming
* Easy to type (can't rely on autocomplete)
* Easy to read


*Rejections*

The following names have been rejected for the reasons detailed below.

* match()                             matchAll()
* matchSelector()                     matchAllSelectors()
* matchSelectors()                    matchAllSelectors()

match() is already used for regular expressions matching on Strings in 
ECMAScript and it is considered better to use a less ambiguous name.  It 
is also not clear whether match() would return a single or multiple 
elements, without having to assume based on the other being matchAll().
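
For reference, this is how the existing String match() method behaves 
in ECMAScript.  The example is only an illustration of the existing 
method, not of anything in the Selectors API; note that whether it 
returns one result or several depends on the regular expression's g 
flag, which is a similar kind of ambiguity.

```javascript
const name = "selectAllElements";

// Without the g flag: a match result for the first capital letter only.
const first = name.match(/[A-Z]/);

// With the g flag: an array of every capital letter in the string.
const every = name.match(/[A-Z]/g);

console.log(first[0], first.index); // A 6
console.log(every);                 // [ 'A', 'E' ]
```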

The matchSelector/matchAllSelectors variants have been rejected because 
the names seem to be misleading.  They create the impression that the 
former matches against a single selector and the latter against multiple 
selectors, instead of returning single and multiple results, respectively.

* select()                            selectAll()
* selectOne()                         selectAll()
* selectFirst()                       selectAll()
* selectSingle()                      selectAll()
* selectSingleNode()                  selectNodes()
* selectNode()                        selectNodeList()

These select*() variations are either in direct conflict with, or very 
similar to, existing APIs, and their use would result in confusion 
amongst authors.

* get()                               getAll()
* getOne()                            getAll()

These were rejected because they are not descriptive at all, and they 
are too ambiguous.  They were well received by almost no one.

* getElementBySelector()              getElementsBySelector()
* getElementBySelectors()             getElementsBySelectors()
* getElementByCSSSelector()           getElementsByCSSSelector()
* getElementBySelector()              getElementListBySelector()
* getElementBySelectors()             getElementListBySelectors()
* getElementByGroupOfSelectors()      getElementsByGroupOfSelectors()
* getElementByGroupOfSelectors()      getElementListByGroupOfSelectors()

These were all rejected because they are far too long.  While they are 
very clear and mostly follow the established convention, they are not 
concise and do not satisfy the length requirement.

* nodeBySelector()                    nodeListBySelector()
* getNode()                           getNodes()
* getNode()                           getAllNodes()
* getNode()                           getNodeList()
* getNodeBySelector()                 getNodeListBySelector()
* getNodeByExpr()                     getNodeListByExpr()
* getBySelector()                     getBySelectorAll()

These variants using node instead of element were rejected for a few 
reasons.  Selectors only select elements, not all types of nodes, and it 
doesn't seem likely that selectors would be extended to select 
non-element nodes in the future.  These also break the established 
convention of using getElement, and there is no reasonable justification 
for doing so in these cases.

* css()                               cssAll()
* cssQuery()                          cssQueryAll()
* matchCSS()                          matchCSSAll()

Selectors aren't just for CSS, as this API clearly demonstrates. 
Although there is a common association between selectors and CSS, there 
is no reason to encourage this misconception.  The names create the 
impression that they deal with CSS styles, rather than selecting elements.

Although there is an earlier JavaScript implementation named cssQuery, 
this is not considered sufficient justification for using the ambiguous 
name.


*Candidates*

After careful consideration, I've narrowed down the remaining options to 
these seven pairs.

* matchSingle()                       matchAll()
* matchOne()                          matchAll()
* getElement()                        getElementList()
* getElement()                        getElements()
* selectElement()                     selectElementList()
* selectElement()                     selectAllElements()
* chooseOne()                         chooseAll()

These are summarised with their pros and cons below:

* matchSingle()/matchOne()/matchAll()

These names are short, easy to type and easy to read.  The choice between 
matchOne and matchSingle would effectively come down to personal 
preference.  Although matchOne is shorter, it's not significantly better 
than matchSingle.

The advantage of using matchSingle and matchAll is that there is an 
existing JavaScript implementation of these methods in Dean Edwards' 
Base2 library.  While the implementation could be considered evidence in 
support of these names, it must be noted that these names were 
implemented simply because they were the names in the spec at the time.

The names aren't completely clear and unambiguous and it must be noted 
that these names did not receive wide acceptance when they were put in 
the draft, and so choosing these names probably wouldn't be the most 
productive choice.

* getElement()/getElementList()/getElements()

The advantage of these methods is that they somewhat follow the 
established convention, although not completely because they don't 
specify BySelector (or equivalent).  Given that these APIs are 
effectively a superset of existing getElement* methods, it makes some 
sense to use names that recognise that.

The problem with choosing getElement and getElements is that they are 
too similar to each other, which reduces code readability, and so using 
getElementList would be a workaround for that issue.

* selectElement()/selectElementList()/selectAllElements()

Overall, these names are relatively good.  While they are not the 
shortest alternative, they are not too long; they are relatively easy to 
type and are easy to read; and they are clear and concise.  By using the 
word "select", which is easily associated with Selector, they are 
somewhat more descriptive than the getElement variations.

One problem is the use of "select*" is similar to the .NET XPath methods 
(selectSingleNode/selectNodes), though the use of Element instead of 
Node reduces the confusion slightly.  Those are also proprietary methods 
that aren't used outside of .NET (the DOM3XPath standard uses evaluate() 
instead).

Several people expressed a preference for select() and selectAll(), 
though they inevitably had to be rejected due to clashes and ambiguity 
with the name.  Using selectElement and selectAllElements instead seems 
like a good compromise that solves the problem.

* chooseOne()/chooseAll()

The word "choose" is an alternative verb to the word "select"; however 
it's a slightly more ambiguous term.  While these are shorter, the 
advantage of length isn't quite enough to sacrifice the clarity of the name.


*Conclusion*

After carefully considering all of these reasons, I have updated the spec 
to use selectElement() and selectAllElements(), based on the arguments 
given above.

http://dev.w3.org/cvsweb/~checkout~/2006/webapi/selectors-api/Overview.html?content-type=text/html;%20charset=UTF-8

-- 
Lachlan Hunt
http://lachy.id.au/

Received on Saturday, 23 June 2007 05:23:21 UTC