
Re: Native DOM way to get nodes of arbitrary type/name

From: Marat Tanalin <mtanalin@yandex.ru>
Date: Sat, 05 Oct 2013 00:58:25 +0400
To: Glenn Maynard <glenn@zewt.org>
Cc: "www-dom@w3.org" <www-dom@w3.org>
Message-Id: <217651380920305@web8j.yandex.ru>
04.10.2013, 23:58, "Glenn Maynard" <glenn@zewt.org>:
> On Fri, Oct 4, 2013 at 2:27 PM, Marat Tanalin <mtanalin@yandex.ru> wrote:

>>     * Another usecase is processing text nodes via JavaScript
>>       in browser.
> FYI, a use case is something you want to accomplish, such as 
> "modify all text on a web site to be in alternating caps".
> (That's not a use case for this, of course--that'd be CSS's job.)
> Given that, can you give a concrete use case?

There are multiple possible use cases for text processing (and text nodes are just one of the node types other than element nodes), for example:

    * applying typography tricks like hanging punctuation [1];

    * automatic (re)formatting of texts in web-based WYSIWYG editors
      (e.g. replacing `--` with `—`, or inserting nonbreaking spaces,
      or removing processing-instruction nodes);

    * removing whitespace-only text nodes between child elements
      of an element (in particular, to work around browser bugs:
      for example, Safari 5 and older have a well-known bug where
      the width of such whitespace is not zero even when the font size is zero);

    * online client-side (functioning without sending anything to server)
      HTML-processing tools based on browser's DOM;

    * humorous transformations of text (e.g. shuffling letters in words
      on All Fools' Day).

I have encountered the task of retrieving nodes of arbitrary type often enough to finally write and send this proposal.

Also, it simply looks inconsistent that we have native ways to retrieve element nodes but no native ways to retrieve nodes of other types. There are multiple node types (element nodes, text nodes, comment nodes, etc.), yet we can _natively_ search for _element_ nodes only, which looks like a sort of discrimination against nodes of other types. ;-)

An important advantage of the methods I've proposed is that they are _universal/general_ enough to get nodes of _any_ type, without polluting the DOM standard with dedicated methods for each node type (e.g. `getCommentNodes()`, `getTextNodes()`, `getProcessingInstructionNodes()` [plus their `getChild-` variants], similar to the existing `getElementsByTagName()`).

Also, as I've already mentioned in the original message, `getChildNodesByType()` would make it possible to get direct-child elements with a specific tag name (e.g. get the `TH` cells, but not the `TD` cells, that are direct children of a `TR` element). This is not possible at all at the moment and will potentially be slower anyway with the upcoming `findAll()`.
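
To illustrate the `TR` example, here is a rough script-level sketch of what such a call could return. The helper name `childNodesByType` and its optional `name` parameter are hypothetical and are not taken from any specification; the sketch relies only on the standard `childNodes`, `nodeType` and `nodeName` properties.

```javascript
// Hypothetical helper: approximates in script what the proposed
// getChildNodesByType() might return. Only direct children are inspected.
function childNodesByType(parent, nodeType, name) {
  var result = [];
  var children = parent.childNodes;
  for (var i = 0; i < children.length; i++) {
    var node = children[i];
    // Skip nodes of any other type (1 = element, 3 = text, 8 = comment).
    if (node.nodeType !== nodeType) continue;
    // Optional name filter, compared case-insensitively.
    if (name !== undefined &&
        node.nodeName.toUpperCase() !== name.toUpperCase()) continue;
    result.push(node);
  }
  return result;
}

// Usage in a browser, assuming `row` is a TR element:
//   var headerCells = childNodesByType(row, 1, 'TH'); // 1 === Node.ELEMENT_NODE
```

Nested cells deeper in the subtree are deliberately ignored, which is the difference from `getElementsByTagName()`.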

> here's how you can do this without manually recursing yourself.

Thanks, the `createTreeWalker()` functionality is interesting and somewhat neater, but in essence, retrieving nodes of arbitrary type with it is still pure script: it is not very convenient compared with a native method, and is most likely noticeably slower than a potential native implementation.
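
For reference, the two script-level approaches being compared look roughly like this. Both function names are hypothetical; the first uses the standard `createTreeWalker()` and `nextNode()` calls and needs a browser DOM, while the second is plain manual recursion over `childNodes`. Neither is meant as a definitive implementation.

```javascript
// Browser-only sketch: collect descendants with the standard TreeWalker API.
// whatToShow is a NodeFilter bitmask, e.g. NodeFilter.SHOW_TEXT
// or NodeFilter.SHOW_COMMENT.
function collectWithTreeWalker(root, whatToShow) {
  var doc = root.ownerDocument || root; // root may itself be a document
  var walker = doc.createTreeWalker(root, whatToShow);
  var result = [];
  var node;
  // nextNode() returns null once the subtree is exhausted.
  while ((node = walker.nextNode())) {
    result.push(node);
  }
  return result;
}

// Manual recursion: collect descendants of a given nodeType in document
// order (3 = text, 8 = comment). The root itself is not included.
function collectByRecursion(root, nodeType) {
  var result = [];
  (function visit(node) {
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
      if (children[i].nodeType === nodeType) result.push(children[i]);
      visit(children[i]);
    }
  })(root);
  return result;
}

// Usage in a browser:
//   var texts = collectWithTreeWalker(document.body, NodeFilter.SHOW_TEXT);
//   var comments = collectByRecursion(document.body, 8);
```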

> Note that you may be surprised by the results of "all text nodes".  For example, it'll include inline scripts.

I'm aware of that. Inline scripts in general are considered bad form, so that's not a problem for me (nor, probably, for any web developer who follows good practices).


[1] http://www.artlebedev.com/mandership/120/
Received on Friday, 4 October 2013 20:59:05 UTC
