Re: Native DOM way to get nodes of arbitrary type/name from Glenn Maynard on 2013-10-04 (www-dom@w3.org from October to December 2013)

From: Glenn Maynard <glenn@zewt.org>
Date: Fri, 4 Oct 2013 17:02:35 -0500
To: Marat Tanalin <mtanalin@yandex.ru>
Cc: "www-dom@w3.org" <www-dom@w3.org>
Message-ID: <CABirCh89WbBXBcmPzQsrGUd+zfUxYs-18zQ_5C2Ot-pECr7WVw@mail.gmail.com>

On Fri, Oct 4, 2013 at 3:58 PM, Marat Tanalin <mtanalin@yandex.ru> wrote:

>     * applying typography tricks like hanging punctuation [1];
>

This would belong in CSS.

>     * automatic (re)formatting of texts in web-based WYSIWYG editors
>       (e.g. replacing `--` with `—`, or inserting nonbreaking spaces,
>       or removing processing-instruction nodes);
>

Maybe.

>     * removing whitespace-only text nodes between child elements
>       of an element (to work around browser bugs in particular --
>       for example Safari 5 and older has well-known bug related
>       to that whitespace width is not zero even when font size is zero);
>

Adding features to work around browser bugs doesn't make sense.  The
features won't exist until a future version of the browser anyway, so the
vendor should just fix the bug.

>     * online client-side (functioning without sending anything to server)
>       HTML-processing tools based on browser's DOM;
>

This seems like a description of the API, rather than a use case.

>     * joky transformations of texts (e.g. shuffling letters in words
>       during All Fools' Day).
>

(Sorry if this seems a bit contrived.  :)

> Important good thing about the methods I've proposed is that they are
> _universal/general_ enough and provide ability to get nodes of _any_ type
> -- without need for somewhat polluting DOM standard with dedicated methods
> separately for each node type (e.g. `getCommentNodes()`, `getTextNodes()`,
> `getProcessingInctructionNodes()` [+ their `getChild-` variants] like
> existing `getElementsByTagName()`).
>
> Also, as I've already mentioned in the original message,
> `getChildNodesByType()` would provide ability to get direct-child elements
> of specific tag-name (e.g. get `TH` cells, but not `TD` cells that are
> direct child elements of a `TR` element) which is currently impossible at
> all and will be potentially slower anyway using upcoming `findAll()`.
>

I don't really understand what you mean (what does "TH cells but not TD
cells" mean?), but you can already use querySelectorAll() to match using
CSS selectors, e.g. element.querySelectorAll("TH").  ("Potentially slower"
isn't very interesting--you first need to show a real performance problem.
CSS selectors are very fast.)
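
For reference, a minimal sketch of the "direct children by tag name" case
using only existing DOM properties.  The helper name
`childElementsByTagName` is hypothetical (it is not part of any
specification), and the sketch assumes the parent exposes a `children`
collection whose items have a `tagName`:

```javascript
// Hypothetical helper, not a DOM API: returns the direct child elements
// of `parent` whose tag name matches `tagName` (case-insensitive).
function childElementsByTagName(parent, tagName) {
  const wanted = tagName.toUpperCase();
  // `children` is array-like in browsers, so borrow Array's filter.
  return Array.prototype.filter.call(parent.children, function (child) {
    return child.tagName.toUpperCase() === wanted;
  });
}
```

With a `TR` element in `row`, `childElementsByTagName(row, "th")` would
return only the `TH` cells that are direct children.  Note that a plain
`row.querySelectorAll("TH")` also matches `TH` descendants at any depth;
where the `:scope` pseudo-class is supported,
`row.querySelectorAll(":scope > th")` restricts the match to direct
children.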

> > here's how you can do this without manually recursing yourself.
>
> Thanks, `createTreeWalker()` functionality is interesting and somewhat
> more neat, but, in essence, retrieving nodes of arbitrary type with it is
> still pure-script, not much usable compared with a native method, and most
> likely noticeably slower than a potential native implementation.
>

If you want to make a performance argument, you'll want to show 1: that the
non-native implementation is slow enough to cause real-world issues, and
2: that a native implementation is significantly faster.  I doubt that
this is materially slower, and I suspect many browsers wouldn't implement
this natively at all.
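
To make the script-side approach concrete, here is a rough sketch of
collecting nodes of an arbitrary type with `createTreeWalker()`.  The
helper name `collectNodes` and the local constants are assumptions made
for illustration; the constant values mirror the `NodeFilter.SHOW_*`
bitmasks defined by DOM:

```javascript
// Bitmask values as defined on the DOM NodeFilter interface
// (NodeFilter.SHOW_TEXT, SHOW_PROCESSING_INSTRUCTION, SHOW_COMMENT).
const SHOW_TEXT = 0x4;
const SHOW_PROCESSING_INSTRUCTION = 0x40;
const SHOW_COMMENT = 0x80;

// Hypothetical helper, not a DOM API: returns every descendant of `root`
// whose node type is selected by the `whatToShow` bitmask, in document order.
function collectNodes(root, whatToShow) {
  // A Document has no ownerDocument, so fall back to the root itself.
  const doc = root.ownerDocument || root;
  const walker = doc.createTreeWalker(root, whatToShow);
  const nodes = [];
  let node;
  // nextNode() returns null once the traversal is exhausted.
  while ((node = walker.nextNode())) {
    nodes.push(node);
  }
  return nodes;
}
```

For example, `collectNodes(document.body, SHOW_TEXT | SHOW_COMMENT)`
would gather all text and comment nodes under the body.  Timing a call
like this on a realistically sized document is one way to check whether
there is a real performance problem to solve.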

> > Note that you may be surprised by the results of "all text nodes".  For
> > example, it'll include inline scripts.
>
> I'm aware of that. Inline scripts in general are considered bad form, so
> that's not a problem for me (as well as for probably any web developer
> following good practices).
>

(I disagree.  I certainly don't try to move every single piece of script
into external scripts.)
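
For pages that do contain inline scripts, a script that wants "all text
nodes" minus script and style content could filter on the parent element.
This is only a sketch; the predicate name `isNonScriptTextNode` is
hypothetical, and it assumes nodes expose the standard `nodeType`,
`parentNode` and `nodeName` properties:

```javascript
// Hypothetical predicate, not a DOM API: true for text nodes whose parent
// is not a SCRIPT or STYLE element.
function isNonScriptTextNode(node) {
  const TEXT_NODE = 3; // same value as Node.TEXT_NODE
  if (node.nodeType !== TEXT_NODE) {
    return false;
  }
  const parent = node.parentNode;
  const parentName = parent ? parent.nodeName.toUpperCase() : "";
  return parentName !== "SCRIPT" && parentName !== "STYLE";
}
```

Applied as a filter over a list of text nodes, this drops the contents of
inline `script` and `style` elements and keeps the rest.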

-- 
Glenn Maynard

Received on Friday, 4 October 2013 22:03:02 UTC