Re: Selectors, getElementsByTagName() and camelCase SVG

On Apr 2, 2009, at 02:21, Boris Zbarsky wrote:

> Maciej Stachowiak wrote:
>> If it's a performance concern, then I do believe this can be  
>> implemented efficiently.
>
> For what it's worth, I just went and read the specific spec section  
> Henri is concerned with.  The only concern is behavior of  
> getElementsByTagName (and CSS selector matching), not the internal  
> storage in general.  The spec calls for case-insensitive compares  
> for HTML elements in HTML documents, but case-sensitive compares  
> elsewhere, right?

It seems that getElementsByTagName() invoked on a non-HTML node (e.g.
an SVG node) is specced to compare case-sensitively even into
<foreignObject> subtrees. If we go down the route of adding
pre-lowercased atoms for all nodes, the spec should be careful not to
(effectively) require the pointer to that atom to be initialized
differently for HTML and XHTML nodes, since that would re-introduce
the need to initialize nodes differently for HTML and XHTML.

>> I note that the current spec says "These methods (but not their  
>> namespaced counterparts) must compare the given argument in an  
>> ASCII case-insensitive manner when looking at HTML elements, and in  
>> a case-sensitive manner otherwise." That could be achieved even  
>> more easily by comparing to the lowercased string for HTML elements  
>> and to the original string passed in for other elements.
>
> Yeah.  This would be a bit of a hassle because it'd require that  
> some existing getElementsByTagName optimizations be dropped, but not  
> that bad, really.
>
> The CSS issue could be addressed similarly; we have existing issues  
> with that in Gecko if SVG is used in an HTML document via  
> createElementNS() that we need to fix anyway.
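
To make the quoted rule concrete, here is a rough Python model of the
shortcut suggested above (compare against the lowercased argument for
HTML elements and against the original argument otherwise). It is only
a sketch, not engine code: the Element class, the flat list standing in
for a subtree, and the function names are made up for illustration, and
the "*" wildcard and the namespaced methods are left out.

```python
import string
from dataclasses import dataclass

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Translation table that folds only A-Z; non-ASCII characters are untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(s: str) -> str:
    return s.translate(_ASCII_LOWER)


@dataclass
class Element:
    # Hypothetical stand-in for a DOM element in an HTML document.
    namespace: str
    local_name: str  # stored exactly as the parser or script created it


def get_elements_by_tag_name(nodes, query):
    lowered = ascii_lower(query)  # computed once per call
    return [
        el
        for el in nodes
        # HTML elements see the lowercased argument, everything else the original.
        if el.local_name == (lowered if el.namespace == HTML_NS else query)
    ]
```

With a list holding an HTML textarea, an SVG textArea and an SVG
foreignObject, asking for "TEXTAREA" finds only the HTML element, while
asking for "textArea" finds both the HTML and the SVG element. Note
that the shortcut equals a real case-insensitive compare only as long
as HTML element names are stored lowercased.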

The same list-based hack that would make parser-inserted
SVG-in-text/html respond to selectors and getElementsByTagName() would
work for createElementNS()-inserted SVG, too.

An approach that is correct per the current spec would involve adding
a pointer to each Gecko nsNodeInfoInner or WebKit QualifiedName and
reviewing/revising all code that accesses the current localName
pointer from those, yet it would not gain anything for conforming
content (except textArea if it is considered conforming). This
'correct' approach would only help with the kind of non-conforming
cases that only Opera now handles correctly per spec (ASCII upper case
introduced to the tree via createElementNS()). Those cases can arise
but aren't supported in Gecko and WebKit, and they can't arise in IE,
because IE doesn't support createElementNS(). Is supporting the
non-conforming cases worth the trouble? (How do Renesis and the IE
text/html parser deal with camelCase names, BTW?)

I guess the solution depends on whether the platform is viewed as
truly ASCII-case-insensitive for matching (true for Opera) or as
early-lowercasing and matching on canonical name (true for Gecko and
WebKit) with SVG camelCase being an unfortunate but grandfatherable
quirk. (In fairness, there's also one MathML attribute that needs
special treatment: definitionURL, so it's not all SVG. :-)
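
The difference between the two views only shows up for HTML-namespace
elements whose stored name contains ASCII upper case, e.g. one created
with createElementNS(). A small Python sketch of the two matching
policies follows; the function names are invented here and this is a
model of the idea, not of any engine's actual code.

```python
import string

# Translation table that folds only A-Z.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(s: str) -> str:
    return s.translate(_ASCII_LOWER)


def matches_case_insensitive(stored_name: str, query: str) -> bool:
    # "Truly ASCII-case-insensitive" view: fold both sides at match time.
    return ascii_lower(stored_name) == ascii_lower(query)


def matches_early_lowercasing(stored_name: str, query: str) -> bool:
    # "Early-lowercasing" view: fold only the query and compare it
    # against the stored (assumed canonical) name as-is.
    return stored_name == ascii_lower(query)
```

For a parser-created element stored as "div", both functions accept
the query "DIV". For an element stored as "DIV", only
matches_case_insensitive accepts the query "div"; the early-lowercasing
policy never matches it, whatever the case of the query.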

>> But it might be more convenient to make it always ASCII case- 
>> insensitive for all elements in an HTML document.
>
> This would, of course, be simpler to implement, in the "no changes  
> required" sense... ;)  I'm not sure that someone asking for  
> getElementsByTagName("textarea") would really want all the  
> <html:textarea> and <svg:textArea>, though.
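
The over-matching concern can be shown with a toy model in Python.
This is a sketch of the "always ASCII case-insensitive in an HTML
document" alternative only; the (namespace, local name) tuples and the
function name are hypothetical stand-ins for real DOM nodes and APIs.

```python
import string

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Translation table that folds only A-Z.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def get_elements_always_insensitive(nodes, query):
    # Fold the query and every stored name, regardless of namespace.
    folded = query.translate(_ASCII_LOWER)
    return [
        (ns, name) for ns, name in nodes
        if name.translate(_ASCII_LOWER) == folded
    ]


nodes = [(HTML_NS, "textarea"), (SVG_NS, "textArea")]
result = get_elements_always_insensitive(nodes, "textarea")
# result contains both the HTML textarea and the SVG textArea
```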


One might argue that SVG <textArea> could be made into a non-issue by  
not implementing <textArea> and by instructing authors to use CSS line  
layout on HTML nodes inside <foreignObject> instead. UAs that...
  * support HTML <textarea>
and
  * support SVG
and
  * have them in the same DOM
and
  * are suitable for browsing the Web
...have a CSS formatter that can render through a transform matrix  
anyway in practice.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Thursday, 2 April 2009 07:10:15 UTC