- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Wed, 01 Apr 2009 19:21:38 -0400
- To: Maciej Stachowiak <mjs@apple.com>
- CC: HTML WG <public-html@w3.org>, www-svg <www-svg@w3.org>
Maciej Stachowiak wrote:
> If its a performance concern, then I do believe this can be implemented
> efficiently.

For what it's worth, I just went and read the specific spec section Henri is concerned with. The only concern is behavior of getElementsByTagName (and CSS selector matching), not the internal storage in general. The spec calls for case-insensitive compares for HTML elements in HTML documents, but case-sensitive compares elsewhere, right? That's significantly simpler than what I thought it called for.

> 1) ASCII-lowercase the argument to getElementsByTagName and atomized.
> (In WebKit we have a combined operation to ASCII-lowercase and atomize
> at the same time, and which is smart enough to make no changes if the
> string is already ASCII-lowercase and atomized).
>
> 2) ASCII-lowercase the tag name to be compared while atomizing (the
> optimization makes this essentially free in the common case that the tag
> name is already lowercase).
>
> 3) Pointer compare.

This is exactly what Gecko currently does for getElementsByTagName, in fact. It fails to do case-sensitive compares as required above.

> I note that the current spec says "These methods (but not their
> namespaced counterparts) must compare the given argument in an ASCII
> case-insensitive manner when looking at HTML elements, and in a
> case-sensitive manner otherwise." That could be achieved even more
> easily by comparing to the lowercased string for HTML elements and to
> the original string passed in for other elements.

Yeah. This would be a bit of a hassle because it'd require that some existing getElementsByTagName optimizations be dropped, but not that bad, really. The CSS issue could be addressed similarly; we have existing issues with that in Gecko if SVG is used in an HTML document via createElementNS() that we need to fix anyway.

> But it might be more
> convenient to make it always ASCII case-insensitive for all elements in
> an HTML document.

This would, of course, be simpler to implement, in the "no changes required" sense... ;) I'm not sure that someone asking for getElementsByTagName("textarea") would really want all the <html:textarea> and <svg:textArea>, though.

-Boris
Received on Wednesday, 1 April 2009 23:22:28 UTC
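
Illustrative note (not part of the original message): the matching rule quoted from the spec, implemented the way the message suggests (compare against the lowercased argument for HTML elements and against the original argument for everything else), might be sketched roughly as below. The `Element` class, the `ascii_lower` helper and the flat element list are hypothetical stand-ins, not Gecko or WebKit code. The sketch also assumes that HTML elements store their local names in lowercase, and it ignores prefixes and atomization entirely.

```python
import string
from dataclasses import dataclass

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Translation table that lowercases only A-Z, leaving non-ASCII characters
# untouched (unlike str.lower(), which is Unicode-aware).
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(s: str) -> str:
    """ASCII-lowercase a string (hypothetical helper)."""
    return s.translate(_ASCII_LOWER)


@dataclass(frozen=True)
class Element:
    """Minimal stand-in for a DOM element: namespace plus local name."""
    namespace: str
    local_name: str


def get_elements_by_tag_name(elements, name):
    """Match HTML elements against the lowercased argument and all other
    elements against the argument exactly as passed in.

    Assumes HTML elements already have lowercase local names.
    """
    lowered = ascii_lower(name)  # computed once per call
    return [
        el for el in elements
        if el.local_name == (lowered if el.namespace == HTML_NS else name)
    ]


# A document holding both <html:textarea> and <svg:textArea>.
doc = [Element(HTML_NS, "textarea"), Element(SVG_NS, "textArea")]

lower_hits = get_elements_by_tag_name(doc, "textarea")
camel_hits = get_elements_by_tag_name(doc, "textArea")
upper_hits = get_elements_by_tag_name(doc, "TEXTAREA")
```

Under this rule, "textarea" and "TEXTAREA" each find only the HTML element, while "textArea" finds both. Making the comparison case-insensitive for every element in the document would instead return both elements for all three spellings, which is the result the message questions.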