Re: Defining a constructor for Element and friends from Boris Zbarsky on 2015-01-07 (public-webapps@w3.org from January to March 2015)

From: Boris Zbarsky <bzbarsky@mit.edu>
Date: Wed, 07 Jan 2015 08:32:20 -0500
To: public-webapps@w3.org
Message-ID: <54AD3564.7080000@mit.edu>

On 1/7/15 6:17 AM, Anne van Kesteren wrote:
> The main tricky thing here I think is whether it is acceptable to have
> an element whose name is "a", namespace is the HTML namespace, and
> interface is Element.

That depends on what you mean by "interface is Element".

If you mean that it has all the internal slots HTMLAnchorElement has but 
its prototype is Element.prototype, I think that may be fine.  Libraries 
might get confused if you pass them elements like this, but that just 
comes down to "don't create elements like this" as a guideline, right?

If you mean not having the internal slots HTMLAnchorElement has, then 
that would involve a good bit of both specification and implementation 
work.  Specifically:

1)  Pretty much the entire HTML spec is written in terms of tag names, 
and the operations it performs often assume some sort of state being 
stored on elements with those tag names.  Conceptually this is being 
stored in internal slots (though of course in actual implementations 
"slots" can mean "hashtable entries in some hashtable" or whatnot). 
Significant spec work would need to happen to deal with situations where 
the element has some tagname but not the corresponding internal slots.

2)  In implementations, there are assumptions about particular tag names 
having particular internal slots.  For example, you often get code like 
this (not actual code in either one, but intended to get the flavor 
across) at least in WebKit and Gecko:

   void doWhatever(Element* element) {
     if (element->isHTML() && element->tagName() == "input") {
       HTMLInputElement* input = static_cast<HTMLInputElement*>(element);
       // Do stuff with "input" here.
     }
   }

If we can have HTML elements which have the "input" tag name but aren't 
represented by a subclass of the C++ HTMLInputElement in the above code, 
you get a security bug.  So codebases would have to be audited for all 
instances of this and similar patterns.  I just did a quick check in 
Gecko, and we're looking at at least 500 callsites just in C++.  There 
are probably more in privileged JavaScript that make assumptions about 
things based on tagname...
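
To make the failure mode concrete, here is a minimal self-contained 
sketch.  The classes are hypothetical stand-ins, not WebKit or Gecko 
code, and looksLikeInput() and asInput() are names invented for this 
illustration.  It shows that the tag-name test in doWhatever() can pass 
for an object that is not an HTMLInputElement, while a check on the 
dynamic type rejects it.  dynamic_cast is used only because it keeps the 
sketch short; real engines may use other type-checking mechanisms.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for a browser's DOM classes.
class Element {
public:
  Element(std::string tagName, bool isHTML)
      : tagName_(std::move(tagName)), isHTML_(isHTML) {}
  virtual ~Element() = default;

  bool isHTML() const { return isHTML_; }
  const std::string& tagName() const { return tagName_; }

private:
  std::string tagName_;
  bool isHTML_;
};

class HTMLInputElement : public Element {
public:
  HTMLInputElement() : Element("input", true) {}
  std::string value;  // State that exists only on the subclass.
};

// The test doWhatever() relies on: namespace and tag name only.
bool looksLikeInput(const Element& element) {
  return element.isHTML() && element.tagName() == "input";
}

// A checked alternative: consult the dynamic type, not the tag name.
// Returns nullptr when the object is not really an HTMLInputElement.
HTMLInputElement* asInput(Element* element) {
  assert(element != nullptr);
  return dynamic_cast<HTMLInputElement*>(element);
}
```

Given a plain Element constructed as Element("input", true), 
looksLikeInput() returns true but asInput() returns nullptr.  A 
static_cast at that point would hand back a pointer to an object that has 
no "value" member, which is the security bug described above.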

This is why the custom elements spec ended up with the is="..." business 
for extending nontrivial HTML elements.  :(

So just to check, which of these two invariant-violations are we talking 
about here?

> If we can break that invariant it seems rather easy to build the
> hierarchy. The HTMLElement constructor would only take a local name
> and always have a null prefix and HTML namespace.

I think that's fine in a world where we still create an 
HTMLAnchorElement under the hood if you do "new HTMLElement('a')" but 
just give it HTMLElement.prototype as the proto.

> And HTMLAnchorElement would always be "a". HTMLQuoteElement could accept
> an enum and we could even add dedicated constructors for <q> and
> <blockquote> (provided the web allows).

Yeah, this would make sense to me.

-Boris
Received on Wednesday, 7 January 2015 13:32:50 UTC