
Re: indexed properties on NodeLists and HTMLCollections

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Thu, 05 May 2011 20:52:30 -0400
Message-ID: <4DC3464E.8000706@mit.edu>
To: Allen Wirfs-Brock <allen@wirfs-brock.com>
CC: Cameron McCormack <cam@mcc.id.au>, public-script-coord@w3.org
On 5/5/11 7:17 PM, Allen Wirfs-Brock wrote:
> But native JS objects can also dynamically mutate in apparently arbitrary ways:

Yes, but "apparently" is the key part here.

> var myGlobal = { };
> alert(myGlobal.toString);  //built-in toString function
> callSomeFunctionIdontKnowMuchAbout();
> alert(myGlobal.toString);   //displays some other value because a toString property was added to the myGlobal obj by the function
> callAnotherFunctionIdontKnowMuchAbout();
> alert(myGlobal.toString);  //built-in toString function because the function deleted the new toString property from the obj

Yes, but for this to happen the functions you call have to know 
something about myGlobal.  There's no weird side-channel mechanism where 
a function call suddenly defines hundreds or thousands of own properties 
on all sorts of objects, including objects that it can't reach directly 
(because they're only present in closure scopes).  With the way DOM 
nodelists are supposed to work, there IS such a side-channel mechanism.
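To make the side-channel concrete, here is a minimal sketch of the behavior being described, using an ES6 Proxy (which postdates this discussion) as a hypothetical stand-in for a live NodeList; nothing here is how any real engine implements it:

```javascript
// Hypothetical sketch (not the real DOM): a "live" list whose indexed
// properties appear when a *different* object -- one the list holder
// never touches directly -- is mutated.
function makeDocumentSketch() {
  const children = [];                  // hidden shared state (the side channel)
  const liveList = new Proxy({}, {
    get(target, prop) {
      if (prop === "length") return children.length;
      if (typeof prop === "string" && /^\d+$/.test(prop)) return children[+prop];
      return Reflect.get(target, prop);
    },
    has(target, prop) {
      if (typeof prop === "string" && /^\d+$/.test(prop)) {
        return +prop < children.length;
      }
      return Reflect.has(target, prop);
    }
  });
  return { childNodes: liveList, appendChild(n) { children.push(n); } };
}

const doc = makeDocumentSketch();
const list = doc.childNodes;  // hold only the list; never mutate it directly
doc.appendChild("div");       // mutating doc makes property 0 "appear" on list
console.log(0 in list, list[0], list.length); // true div 1
```

The point of the sketch: nothing called `defineProperty` on `list`, yet an indexed property became visible on it.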

> The only way for a JavaScript programmer to ensure this doesn't happen is via Object.seal, etc.

That's just not true.  Consider this testcase:

   function foo() {
     var obj = {};
     return function mutator() { /* do something with obj */ }
   }
   var f = foo();

Now the JS programmer is guaranteed that there will be no bizarre 
mutations of |obj| that are not explicitly caused by mutator.

>  However, I don't believe that JS programmers actually see this sort of potential mutation as a real problem and few will actually use the new ES5 features that allow them to prevent it.

Yes, but I think you're missing the point.

>> At the same time, this seems like something web authors want in many cases...
>
> If this is true (they want something that is not expressible using native JS objects)

Nodelist behavior (if we pretend that there are no user-defined 
properties on them) is expressible using native JS objects.  It's just 
impossible to do in a performant way using native JS objects.  Native JS 
objects are just too inflexible.
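For instance, one correct-but-slow emulation with plain native objects (my sketch, not anything from a spec) would re-materialize the indexed own properties after every tree mutation, paying O(list length) each time:

```javascript
// Naive "native object" emulation of a live list: after every mutation,
// rebuild the indexed own data properties.  Correct, but every mutation
// costs O(n) -- the performance problem being described.
function syncLiveList(list, nodes) {
  // Remove stale indexed properties beyond the new length.
  for (let i = nodes.length; i < list.length; i++) delete list[i];
  // (Re)define one own data property per node -- the expensive part.
  nodes.forEach((node, i) => { list[i] = node; });
  list.length = nodes.length;
}

const nodes = ["p", "p", "div"];   // stand-in for the DOM tree
const list = { length: 0 };
syncLiveList(list, nodes);
console.log(list[2], list.length); // div 3
nodes.pop();
syncLiveList(list, nodes);
console.log(2 in list, list.length); // false 2
```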

> then it should apply to much more than just the libraries defined by the W3C WebApps WG.  It is essentially a statement that ES is deficient

Yes.  It is, for implementing the behavior that the DOM spec authors 
defined for nodelists in a performant way.

> so we should expect to see web authors clamoring for ES features of this sort

From what I've seen, web authors see ES as a thing from on high that 
they can't change.  So they don't tend to clamor much....

But more importantly, web authors tend not to write code generic enough 
for these issues to arise, while library authors realize the performance 
pitfalls and avoid the sort of situations that DOM nodelists create in 
their library APIs.  Call it smart design or call it working around 
language deficiencies, as you please.

>  From a JS programmers perspective, the DOM is just a library/framework that provides a (dynamic) object model of the rendered display output currently presented by the application.

Yes.

> Increasingly, web applications make use of libraries/frameworks built using native JS objects that provide object models of other complex aspects of the application domain.

Indeed.

> In many cases now and even more so in the future the JS programmer doesn't and shouldn't care which object model is provided by the User Agent implementor

Sure.

> Yet we seem to have two sets of rules for designing libraries/frameworks. One set of rules (WebIDL) for libraries/frameworks designed by the WebApps WG

Note that a lot of this is motivated by the fact that the implementation 
of these is not actually in ES and therefore rules have to be defined 
about how to map the ES interface to the non-ES implementation.  That's 
what we're in the middle of here.

> Another set of rules (the ES spec.) for libraries/frameworks designed/implemented by everybody else.  Why?

Because the latter set of rules is too restrictive to define the 
de-facto behavior of the objects exposed via the former APIs without 
paying performance costs that are too high.

> How can this possibly be good for web authors?

It probably isn't.

>>> In terms of prioritization of specification techniques within the
>>> ES/WebIDL binding I suggest:
>>> 1) normal ES data property and method invocation semantics
>>> 2) use of ES accessor properties in possibly creative ways
>>> ------------------- stop here for all new APIs
>>
>> I think the only way to do that is to disallow [OverrideBuiltins] in new APIs.
>>
>> Of course the non-[OverrideBuiltins] behavior is even weirder...
>>
>> Perhaps the only way to do that is to disallow name/index getters/setters altogether.  But web authors seem to want them.
>
> Well, the [OverrideBuiltins] behavior sounds like the normal native JS behavior.

Yes.  However the normal native JS behavior is too restrictive for 
what's needed here.

> The fundamental problem is that this is a design that intermingles data (node names) with program structure (method names) in the same namespace. Collisions are inevitable.

For name getters, sure.

For index getters, this is generally not a problem except insofar as 
it's self-created: someone _can_ define indexed properties on random DOM 
objects, even though it's a bad idea.  And we need to specify how that 
will work if they decide to do it.
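The collision in question can be sketched like this (again with a post-ES5 Proxy; the resolution shown here, own property wins, is one possible answer, and the real spec answer is exactly what was being debated):

```javascript
// Hypothetical list-like object with a live index getter.  The question:
// if script defines its *own* indexed property, does it shadow the live
// value?  This sketch assumes "own property wins".
const backing = ["a", "b", "c"];  // stand-in for the live DOM data
const list = new Proxy({}, {
  get(target, prop, receiver) {
    if (typeof prop === "string" && /^\d+$/.test(prop) && !(prop in target)) {
      return backing[+prop];      // live lookup when no own property exists
    }
    return Reflect.get(target, prop, receiver);
  }
});

console.log(list[1]);             // b    (live value)
Object.defineProperty(list, 1, { value: "mine", configurable: true });
console.log(list[1]);             // mine (own property shadows the live one)
```

Note that the live lookup now has to check for an own property on every indexed access, which is precisely the kind of cost criterion 2 below is about.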

Some criteria that I think are worth keeping in mind as we specify this:

1)  The result should make sense.
2)  The result should not make the common cases people care about
     unreasonably difficult or unreasonably slow in order to cater
     to hypothetical edge cases.
3)  The result should be implementable in a DOM entirely implemented
     in ES and not be thousands of times slower than a natively
     implemented DOM.

> The fix for new APIs should be to simply not do this.

They're not, by and large.  For example, querySelectorAll returns a 
"dead" nodelist which has none of these issues.

> ES5 allows creation of objects with a null prototype value.  You can 
> use such objects as name/value maps with no worry about inheritance 
> conflicts with methods defined by the prototype (because there isn't 
> one).  That's essentially what you do in other languages and it is 
> essentially what you have to do in native JS libraries, so why shouldn't 
> new web APIs follow the same rules?

I'm not quite sure what point you're trying to make about APIs here.

> I realize that this isn't practical for certain legacy DOM APIs.  In those situations we need to do whatever is necessary to maintain compatibility with the legacy web.

OK, we agree on that.  ;)

> But anywhere there is disagreement (and hence lack of interoperability among major browsers) we should try to pick a path that is as close to native JS object semantics as possible.

OK.

>>> The situation would be different if list was a NamedNodeMap. In that
>>> case "a" (or any valid nodeName value) could be the name of a "live"
>>> property that could be dynamically added to the list. You need to
>>> specify the desired semantics for such a live property when its property
>>> name already exists
>>
>> The desired semantics from my point of view as a UA implementor is to not have to worry about whether it exists, because checking that is slow....
>
> Language implementors generally don't get to change the language specification as a way to improve their benchmark scores.

I'm significantly more interested in performance of actual web pages 
than benchmark scores.  And if hypothetical ability to define 
non-configurable properties with integer names on nodelists interferes 
with that, my priorities are going to be pretty clear...  I'll fix the 
spec bugs, eventually.

That said, I'm in a somewhat special position.  I'm both the implementor 
of the DOM _and_ the implementor of the language runtime in this case 
(where "I" == "Mozilla").  I can do whatever the heck I want.  I can 
create catchall set and get hooks that will do magic to make my object 
look native even though in reality it's nothing like a native object.  I 
can implement proxies which have C++ handlers that do all sorts of stuff 
ES5 proxies are not allowed to do.  It's all doable.

What that will mean, though, is that we will have a spec for a DOM 
behavior that is impossible to implement in a performant way with actual 
native objects.  If we make the spec sufficiently clear (which we sort 
of have to, I'd think, to get interoperability) then as things stand 
it'll also be impossible to implement in a performant way with ES5 
proxies.  _That_ is the state of affairs that led to the original e-mail 
about this to the list.  If people are OK with that, fine.  I'll just 
go and write some C++ code to make nodelists work fast in the cases my 
users and web developers care about, make them fall back to something 
slow if people do dumb things like defining their own indexed 
properties, and move on with life.

> I don't see why UA implementors shouldn't be held to the same expectations.

It's all fine as long as you want to drop item 3 from my list above.  If 
you don't care about it, I certainly don't have much motivation to. 
After all, I've already got a C++ implementation of all this stuff.

-Boris
Received on Friday, 6 May 2011 00:53:00 UTC
