Re: Selectors, getElementsByTagName() and camelCase SVG

Maciej Stachowiak wrote:
> If it's a performance concern, then I do believe this can be implemented 
> efficiently.

For what it's worth, I just went and read the specific spec section 
Henri is concerned with.  The only concern is behavior of 
getElementsByTagName (and CSS selector matching), not the internal 
storage in general.  The spec calls for case-insensitive compares for 
HTML elements in HTML documents, but case-sensitive compares elsewhere, 
right?

That's significantly simpler than what I thought it called for.

> 1) ASCII-lowercase the argument to getElementsByTagName and atomize it. 
> (In WebKit we have a combined operation to ASCII-lowercase and atomize 
> at the same time, and which is smart enough to make no changes if the 
> string is already ASCII-lowercase and atomized).
> 
> 2) ASCII-lowercase the tag name to be compared while atomizing (the 
> optimization makes this essentially free in the common case that the tag 
> name is already lowercase).
> 
> 3) Pointer compare.

This is exactly what Gecko currently does for getElementsByTagName, in 
fact.  As a result, it fails to do the case-sensitive compares that the 
spec requires everywhere else.
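
For illustration only, the three quoted steps can be sketched roughly as 
below.  This is Python, not Gecko or WebKit code: atomize_lower() and 
matches() are made-up names, and sys.intern() merely stands in for an 
engine's atom table so that the identity check plays the role of the 
pointer compare.

```python
import sys


def atomize_lower(name: str) -> str:
    """Hypothetical combined ASCII-lowercase-and-atomize helper."""
    # Fast path: nothing to fold, so only the interning lookup happens.
    if not any("A" <= ch <= "Z" for ch in name):
        return sys.intern(name)
    # Fold only A-Z; non-ASCII characters are left untouched.
    folded = "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch for ch in name
    )
    return sys.intern(folded)


def matches(element_name: str, query_atom: str) -> bool:
    # Step 2: lowercase and atomize the element's name.
    # Step 3: identity ("pointer") compare against the query atom.
    return atomize_lower(element_name) is query_atom


query = atomize_lower("textarea")      # step 1
html_hit = matches("textarea", query)  # an HTML element's name
svg_hit = matches("textArea", query)   # an SVG element's camelCase name
```

Both html_hit and svg_hit come out true here, because every name is 
folded before the compare; that is the missing case-sensitivity.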

> I note that the current spec says "These methods (but not their 
> namespaced counterparts) must compare the given argument in an ASCII 
> case-insensitive manner when looking at HTML elements, and in a 
> case-sensitive manner otherwise." That could be achieved even more 
> easily by comparing to the lowercased string for HTML elements and to 
> the original string passed in for other elements.

Yeah.  This would be a bit of a hassle because it'd require that some 
existing getElementsByTagName optimizations be dropped, but not that 
bad, really.
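
A minimal sketch of that alternative, again in Python and purely 
illustrative: Element, ascii_lower() and get_elements_by_tag_name() are 
hypothetical names, the flat list stands in for a document tree, and it 
assumes HTML elements already store their names in lowercase.

```python
from typing import List, NamedTuple

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"


class Element(NamedTuple):
    namespace: str
    local_name: str


def ascii_lower(s: str) -> str:
    # Fold only A-Z; non-ASCII characters are left untouched.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def get_elements_by_tag_name(elements: List[Element], name: str) -> List[Element]:
    lowered = ascii_lower(name)  # computed once per call
    result = []
    for element in elements:
        # HTML elements: compare against the lowercased argument
        # (assumes their stored names are already lowercase).
        # Everything else: compare against the argument as passed in.
        wanted = lowered if element.namespace == HTML_NS else name
        if element.local_name == wanted:
            result.append(element)
    return result


doc = [Element(HTML_NS, "textarea"), Element(SVG_NS, "textArea")]
only_html = get_elements_by_tag_name(doc, "textarea")
both = get_elements_by_tag_name(doc, "textArea")
```

With this rule "textarea" finds only the HTML element, while "textArea" 
finds both: the HTML one via the lowercased string and the SVG one via 
the original string.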

The CSS issue could be addressed similarly; we have existing issues with 
that in Gecko if SVG is used in an HTML document via createElementNS() 
that we need to fix anyway.

> But it might be more 
> convenient to make it always ASCII case-insensitive for all elements in 
> an HTML document.

This would, of course, be simpler to implement, in the "no changes 
required" sense... ;)  I'm not sure that someone asking for 
getElementsByTagName("textarea") would really want to get back both the 
<html:textarea> and the <svg:textArea> elements, though.

-Boris

Received on Wednesday, 1 April 2009 23:22:28 UTC