- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Wed, 01 Apr 2009 19:21:38 -0400
- To: Maciej Stachowiak <mjs@apple.com>
- CC: HTML WG <public-html@w3.org>, www-svg <www-svg@w3.org>
Maciej Stachowiak wrote:
> If it's a performance concern, then I do believe this can be implemented
> efficiently.
For what it's worth, I just went and read the specific spec section
Henri is concerned with. The only concern is behavior of
getElementsByTagName (and CSS selector matching), not the internal
storage in general. The spec calls for case-insensitive compares for
HTML elements in HTML documents, but case-sensitive compares elsewhere,
right?
That's significantly simpler than what I thought it called for.
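To make that reading concrete, here is a rough Python sketch of the comparison rule as I understand it. The Element class, tag_matches() and the in_html_document flag are all made up for illustration, and it compares local names only, which is a simplification and not spec text.

```python
from dataclasses import dataclass

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"


def ascii_lower(s: str) -> str:
    # Lowercase only A-Z; str.lower() would also fold non-ASCII letters.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


@dataclass
class Element:
    # Hypothetical stand-in for a DOM element, not a real DOM API.
    namespace: str
    local_name: str


def tag_matches(el: Element, query: str, in_html_document: bool) -> bool:
    # HTML element in an HTML document: ASCII case-insensitive compare.
    if in_html_document and el.namespace == HTML_NS:
        return ascii_lower(el.local_name) == ascii_lower(query)
    # Everything else: exact, case-sensitive compare.
    return el.local_name == query
```

Under this sketch, a query for "TEXTAREA" matches an HTML textarea in an HTML document, while a query for "textarea" does not match an SVG textArea.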
> 1) ASCII-lowercase the argument to getElementsByTagName and atomize it.
> (In WebKit we have a combined operation to ASCII-lowercase and atomize
> at the same time, and which is smart enough to make no changes if the
> string is already ASCII-lowercase and atomized).
>
> 2) ASCII-lowercase the tag name to be compared while atomizing (the
> optimization makes this essentially free in the common case that the tag
> name is already lowercase).
>
> 3) Pointer compare.
This is exactly what Gecko currently does for getElementsByTagName, in
fact. As a result, it fails to do the case-sensitive compares that the
spec requires for non-HTML elements.
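The three steps quoted above can be sketched roughly as follows in Python. This is only an illustration of the lowercase-and-atomize idea, not Gecko or WebKit code; the atom table and the function names are invented, and Python's "is" stands in for the pointer compare.

```python
# Hypothetical atom table: one shared string object per lowercased name.
_atoms = {}


def ascii_lower(s):
    # Lowercase only A-Z, leaving non-ASCII characters untouched.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def atomize_lower(s):
    # Steps 1 and 2: ASCII-lowercase the name and return its unique atom.
    key = ascii_lower(s)
    return _atoms.setdefault(key, key)


def same_tag(element_name, query):
    # Step 3: identity ("pointer") compare of the two atoms.
    return atomize_lower(element_name) is atomize_lower(query)
```

Because both sides are lowercased before the compare, same_tag("textArea", "textarea") is true in this sketch, which is exactly the over-matching of non-HTML elements described above.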
> I note that the current spec says "These methods (but not their
> namespaced counterparts) must compare the given argument in an ASCII
> case-insensitive manner when looking at HTML elements, and in a
> case-sensitive manner otherwise." That could be achieved even more
> easily by comparing to the lowercased string for HTML elements and to
> the original string passed in for other elements.
Yeah. This would be a bit of a hassle because it'd require that some
existing getElementsByTagName optimizations be dropped, but not that
bad, really.
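A minimal sketch of that variant, in Python for brevity: lowercase the argument once, then pick which form to compare against per element. It assumes elements are plain (namespace, local name) pairs from an HTML document and that HTML element names are already stored lowercased; get_elements_by_tag_name() here is a hypothetical helper, not any engine's implementation.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"


def ascii_lower(s):
    # Lowercase only A-Z, leaving non-ASCII characters untouched.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def get_elements_by_tag_name(elements, query):
    # elements: (namespace, local_name) pairs in document order.
    lowered = ascii_lower(query)  # computed once per call, not per element
    return [
        (ns, name)
        for ns, name in elements
        # HTML elements compare against the lowercased query,
        # everything else against the query exactly as passed in.
        if name == (lowered if ns == HTML_NS else query)
    ]


doc = [(HTML_NS, "textarea"), (SVG_NS, "textArea"), (HTML_NS, "div")]
upper_hits = get_elements_by_tag_name(doc, "TEXTAREA")
camel_hits = get_elements_by_tag_name(doc, "textArea")
```

Here upper_hits contains only the HTML textarea, while camel_hits contains both the HTML textarea and the SVG textArea, since the camel-case query matches the SVG element exactly.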
The CSS issue could be addressed similarly; Gecko already has problems
there when SVG is put into an HTML document via createElementNS(), and
we need to fix those anyway.
> But it might be more
> convenient to make it always ASCII case-insensitive for all elements in
> an HTML document.
This would, of course, be simpler to implement, in the "no changes
required" sense... ;) I'm not sure that someone asking for
getElementsByTagName("textarea") would really want both the
<html:textarea> and the <svg:textArea> elements, though.
-Boris
Received on Wednesday, 1 April 2009 23:22:33 UTC