[whatwg] Use cases for Node.getElementById from Calogero Alex Baldacchino on 2008-12-07 (public-whatwg-archive@w3.org from December 2008)

From: Calogero Alex Baldacchino <alex.baldacchino@email.it>
Date: Sun, 07 Dec 2008 20:07:24 +0100
Message-ID: <493C1EEC.6070805@email.it>
João Eiras wrote:
>
> IMO, anyone suggesting a Node.getElementById clearly does not know 
> very well how getElementById is supposed to work.
> There are already ways to traverse a DOM tree: DOM 
> properties and methods, XPath, the Selectors API and such.
> Considering ids are required to be unique in the context of a single 
> document, implementations can, and do, implement id lookup using 
> optimized data structures like a hash table, which is much more 
> performant than doing a traversal.
> So if there is a special node in a document, adding an id to it and 
> getting its reference will be performant (ideally O(1)).

Such a hash table cannot remove the need to traverse the DOM tree if 
.getElementById is to be implemented _correctly_. A DOM tree is a live 
structure, so the hash table must be checked and updated each time a 
node is removed AND each time a node is inserted, and such an update may 
require some kind of tree traversal (e.g. to compare the relative 
positions of nodes).

getElementById is currently being defined as returning the _first_ 
element with a matching ID. This is a graceful degradation in case of 
duplicate IDs, and it gives a single, standard definition of the 
expected behavior, which is better than what DOM 3 Core states (it 
leaves the behavior undefined, and so possibly implementation- or 
document-specific). This means that, upon insertion, a new element might 
become the new 'first' element with a certain id, so its position must 
be checked and the hash table updated accordingly.

When an element is removed, independently of the previous scenario, if 
it was in the hash table it could simply be removed from the table as 
well, but that would not be enough: there might be a descendant, or some 
other following element, with the same id. After the removal, such an 
element would pass from the 'illegal' state of being a duplicate-ID 
element to the 'legal' state of being the element that getElementById 
must return, so the existence of such an element must be checked and the 
hash table updated accordingly.

If there are far more insertions and/or removals of elements with the id 
attribute set than calls to getElementById, the advantage of a live hash 
table over traversing as needed can be largely lost. In any case, a 
traversal can be quite fast, especially if the DOM structure is 
implemented as a balanced binary tree (and I hope you don't wish to 
implement any kind of non-binary tree as the base tree structure).
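To make the bookkeeping described above concrete, here is a minimal 
sketch in Python. It is not how any browser implements id lookup; the 
names Node, Document, insert, remove and _rescan are all hypothetical, 
and the tree is a toy model rather than a real DOM. It only shows that a 
"first element with this id" cache needs a tree walk whenever an 
insertion or removal involves a duplicated id.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Set


@dataclass(eq=False)  # identity-based equality, like real DOM nodes
class Node:
    """Hypothetical stand-in for a DOM element."""
    id: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None


def walk(node: Node) -> Iterator[Node]:
    """Yield node and its descendants in tree (document) order."""
    yield node
    for child in node.children:
        yield from walk(child)


class Document:
    """Toy document caching the first element in tree order for each id."""

    def __init__(self, root: Node) -> None:
        self.root = root
        self.first_by_id: Dict[str, Node] = {}

    def get_element_by_id(self, wanted: str) -> Optional[Node]:
        return self.first_by_id.get(wanted)  # plain hash lookup

    def _rescan(self, ids: Set[str]) -> None:
        # Walk the whole tree to find the first element for each given id.
        for stale in ids:
            self.first_by_id.pop(stale, None)
        for n in walk(self.root):
            if n.id in ids:
                self.first_by_id.setdefault(n.id, n)

    def insert(self, parent: Node, subtree: Node,
               index: Optional[int] = None) -> None:
        subtree.parent = parent
        if index is None:
            parent.children.append(subtree)
        else:
            parent.children.insert(index, subtree)
        clashes: Set[str] = set()
        for n in walk(subtree):
            if n.id is None:
                continue
            if n.id in self.first_by_id:
                clashes.add(n.id)  # may now precede the cached element
            else:
                self.first_by_id[n.id] = n
        if clashes:
            self._rescan(clashes)

    def remove(self, subtree: Node) -> None:
        assert subtree.parent is not None
        subtree.parent.children.remove(subtree)
        subtree.parent = None
        # Only ids whose cached element just left the tree need a new search.
        stale = {n.id for n in walk(subtree)
                 if n.id is not None and self.first_by_id.get(n.id) is n}
        if stale:
            self._rescan(stale)


root = Node()
doc = Document(root)
late, early = Node("x"), Node("x")
doc.insert(root, late)            # "x" is new: no traversal of the document
doc.insert(root, early, index=0)  # duplicate "x": order must be re-checked
doc.remove(early)                 # cached "x" removed: look for a successor
```

In this sketch, insertions and removals that touch only unique ids stay 
cheap; the full walk in _rescan happens only in the duplicate-ID cases 
discussed above.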

>
> If the uniqueness requirement is removed, then getElementById loses 
> its whole meaning and should actually be removed from the 
> specification entirely; otherwise we would need more bloat like 
> getElementById or getElementListById and whatever.

Do you think that getElementsByTagName and getElementsByClassName are 
bloated and useless too? However, my point was, and is, a different one 
(I'm not in favor of Node.getElementById, nor am I strongly against it).

>
> If you really need to get the element with an id in a subtree, connected 
> or disconnected from the main tree, you can use the Selectors API, DOM 
> traversal, XPath, etc.

Currently, id uniqueness is defined as constraining not only a whole 
document, but also a disconnected subtree. Which API, then, is that 
constraint relevant for? If none, is it worth declaring the constraint 
for disconnected subtrees? Or is there a need for an API that directly 
handles IDs in disconnected subtrees?
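For reference, the traversal-based lookup suggested in the quoted text 
for a subtree (connected or not) can be sketched as follows in Python. 
Element and find_by_id are hypothetical names on a toy tree, not part of 
any DOM API; the sketch only illustrates that such a lookup returns the 
first match in tree order and does not itself depend on ids being unique.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Element:
    """Hypothetical stand-in for an HTML element in a detached subtree."""
    tag: str
    id: Optional[str] = None
    children: List["Element"] = field(default_factory=list)


def find_by_id(subtree_root: Element, wanted: str) -> Optional[Element]:
    """Return the first element, in tree order, whose id matches."""
    stack = [subtree_root]
    while stack:
        el = stack.pop()
        if el.id == wanted:
            return el
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(el.children))
    return None


# A subtree that belongs to no document and contains a duplicate id.
fragment = Element("div", children=[
    Element("span", id="a"),
    Element("p", children=[Element("em", id="a")]),
])
match = find_by_id(fragment, "a")
```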

In other words, what is being constrained by id uniqueness in a 
disconnected subtree? A disconnected subtree may be a subtree of another 
document, different from the one currently handled by a script; in this 
case, id uniqueness is relevant for the document that actually contains 
the subtree (while any other document shouldn't be affected by 
cross-document ID clashes). Otherwise, it may be a subtree external to 
any document, and in that case it might perhaps be out of scope for the 
HTML 5 specification of documents.

I'm starting to think that, for disconnected subtrees which are outside 
any actual HTML document but consist of HTML elements, the most that 
could be said is this: any API dealing with unique identifiers in such a 
subtree must treat the value of each element's id attribute as that 
element's default ID. (The uniqueness of the id value would then follow 
from its nature as an ID property and from the nature of an API method 
targeting an element's ID property, rather than being imposed by the 
specification, since currently there is no such method in the scope of 
the HTML 5 DOM.) As a consequence, id value uniqueness might be in scope 
for a DOM Core specification that explicitly sets out to handle ID 
properties in a disconnected (and 'document-less') subtree of Elements, 
simply because the id value represents (at least) the first attribute of 
an HTML element to be evaluated when looking for an ID property.

Regards, Alex.
 
 
Received on Sunday, 7 December 2008 11:07:24 UTC