
Re: DOM collections index out of bounds and JavaScript.

From: Cameron McCormack <cam@mcc.id.au>
Date: Wed, 3 Nov 2010 11:15:08 +1300
To: Garrett Smith <dhtmlkitchen@gmail.com>
Cc: Anne van Kesteren <annevk@opera.com>, Erik Arvidsson <arv@chromium.org>, public-webapps@w3.org
Message-ID: <20101102221508.GA2838@wok.mcc.id.au>

Garrett Smith:
> OK, glad we're on the same page. I think the wording of WebIDL should
> be changed to reflect that because what it has now suggests passing a
> copy. Instead, the method that accepts the object uses it for a copy
> and discards it (this method can be trusted not to leak memory of the
> object passed in).

Yeah, I agree the wording could be improved.  (Although I could argue
that it is not forced to take a copy unless it needed to – that could be
a black box optimisation.)

> >> But where is the array host object used?
> >
> > That’s for arrays – type T[] – not sequences.
> >
> Array host object is for arrays – type T[] –? Please explain more
> about what that means.

Let’s say you have this interface:

  interface A {
    readonly attribute octet[] x;
    sequence<octet> y();
  };

In ECMAScript, getting the ‘x’ property will return a reference to an
array host object, which is an object that behaves kind of like a native
Array, but which also coerces values stored in it to the right type
(octet, here).  It also doesn’t allow sparseness like native Arrays do.
If you did something like

  myAObject.x[1] = 128;

then the A object might notice this and do something accordingly.  The
array host object could also change the values of its elements:

  var bytes = myAObject.x;
  alert(bytes[0]);  // a number from 0 to 255.
  // some time later
  alert(bytes[0]);  // could be a different number now!

Calling y, on the other hand, will return a real native Array object.
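As a rough analogy only: a Uint8Array is not the array host object that Web IDL describes, and choosing it here is an assumption made purely for illustration. It does show the same two traits mentioned above, namely that stored values are coerced to an octet and that the object cannot become sparse.

```javascript
// Analogy only: a Uint8Array is NOT a Web IDL array host object.
// It is used here because it coerces stored values to 0..255 and is never sparse.
var bytes = new Uint8Array(4);   // four elements, all initially 0

bytes[1] = 128;   // in range, stored as-is
bytes[2] = 300;   // out of range for an octet; a Uint8Array wraps modulo 256
bytes[3] = "7";   // converted to a number before being stored

// Writing past the end neither creates a hole nor grows the object.
bytes[10] = 1;

// A native Array, by contrast, happily becomes sparse.
var nativeArray = [0, 128];
nativeArray[10] = 1;
```

After this runs, bytes still has four elements, while nativeArray has a length of 11 with holes at indexes 2 to 9.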

> >> That seems like the main issue to me. What are a Collection object's
> >> indexed properties? Are they real object properties or is there a
> >> proxy for [[Get]] and [[Has]]? Both behaviors can be seen in
> >> implementations but it varies, depending on the implementation and on
> >> the object.
> >
> > As currently defined, they are real properties.
> >
> OK, so no delegating to item, then.

Well, it does delegate to item in that these real properties are
accessor properties, whose [[Get]] will call item.
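A minimal sketch of that arrangement in plain ECMAScript 5, assuming a made-up makeCollection helper (it is not a real DOM or Web IDL API) whose item returns null when there is no such member:

```javascript
// Hypothetical stand-in for a DOM collection; makeCollection is an
// illustrative name, not part of any real API.
function makeCollection(items) {
  var coll = {
    get length() { return items.length; },
    item: function (index) {
      // Like the DOM operation: null when there is no such item.
      return index >= 0 && index < items.length ? items[index] : null;
    }
  };
  // One real accessor property per index that is currently supported.
  items.forEach(function (unused, i) {
    Object.defineProperty(coll, String(i), {
      get: function () { return coll.item(i); },  // [[Get]] ends up calling item
      enumerable: true,
      configurable: true
    });
  });
  return coll;
}

var coll = makeCollection(["a", "b"]);
var first = coll[0];                        // "a", via the getter and item(0)
var pastEnd = coll[coll.length];            // undefined: no such property exists
var pastEndItem = coll.item(coll.length);   // null: the operation itself ran
```

The point of the sketch is the last two lines: the getter only exists for supported indexes, so an out-of-range property access never reaches item at all.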

> And for each instance of implementation of a "collection" that uses a
> proxy, a bug should be filed against that implementation.

If the implementors are happy with how getters/setters are defined in
the spec, yes. :-)

> >> But where are proxies really needed? How important is it, for example,
> >> for document.styleSheets[-1] to throw an "index out of bounds"
> >> exception, or for document.childNodes(99999) to return null instead of
> >> undefined?
> >
> > I think document.styleSheets[-1] returning null or throwing an exception
> > is inconsistent with, say, regular ECMAScript arrays, so IMO that’s not
> > something we should encourage.
> >
> You're right -- it's inconsistent with how property access works in
> ES, but there's a conflict in existing DOM specs:
> "Dereferencing with an integer index is equivalent to invoking the
> item function with that index."
> http://www.w3.org/TR/DOM-Level-3-Core/ecma-script-binding.html
> See the disparity? That means coll[ coll.length ] should be equivalent
> to coll.item( coll.length ), which would result in `null`. Why not just
> allow `coll[ coll.length ]` to result in `undefined`?

Indeed, why not? :)  I think that sentence from DOM 3 Core is just loose
wording, to be honest, and I doubt the author had any intention to make
those properties be either null or undefined specifically.

> HTML 5 re-spec'd HTMLCollection in a way that requires indexed
> property access to return `null` if the index is out of range.
> http://www.w3.org/TR/html5/common-dom-interfaces.html#htmlcollection
> | collection(index)
> | Returns the item with index index from the collection.
> | The items are sorted in tree order.
> |
> | Returns null if index is out of range.
> ^^^^^^^^^^^^^^

That’s talking about the operation.  The thing is, the operation won’t
be invoked (due to the delegation thing above) if the index is out of
range, because the index property won’t exist on the object.

> This requirement seems very strange to me. If `index` is out of range
> for a call to `item`, then it returns null? The range for `index` is
> unsigned long, that means that it is
>  document.links[ 4294967295 ] // undefined
>  document.links[ 4294967296 ] // null
> Perhaps by "range," the author meant "a member of the collection". And
> if that is what he meant, then why not spec that?

By range there I’m pretty sure it means the range of currently valid
indexes.

> And then if we use that interpretation then there is the point of
> contention about the property access resulting in `null` and not
> `undefined` -- *that* is what we're discussing here, and you believe
> that `undefined` would be acceptable because it follows native ES
> semantics, requiring no proxy.
> So property access should not be specified to delegate to item.

It delegates, but *only* due to accessing an existing index property,
and such a property will only exist if that index is already in range.

> >> It seems that some implementations have gone out of the way to use
> >> proxies to adhere to the spec to fulfill the odd cases above (albeit
> >> inconsistently) while others have chosen to just use property access
> >> to return undefined.
> >
> > I’m not sure that implementations are slavishly following the Web IDL
> > spec just yet.
> >
> It is widely known and I've provided plenty of examples of this here
> and on WHATWG mailing list that implementations do use proxies. Some
> examples appear in "Adding ECMAScript 5 array extras to
> HTMLCollection".
> So the question is not "if" but "why": Why do implementations use
> proxies?

By “proxies” do you mean “a host object with a custom [[Get]] and
[[Put]] (or [[DefineOwnProperty]]) that responds to property names that
are valid array indexes in a particular way”?  I don’t think the proxy
proposals are implemented yet.

If I had to guess: it’s probably easier.  You’d have to ask the
implementors, though.

> […]
> >> I get that the spec requires ob[n] to delegate to item, and so for
> >> that a proxy is needed. But what type of situation is it really
> >> necessary for obj[n] to delegate to item? Which Collections really
> >> need a proxy to function as required by code?
> >
> > I’m not sure that proxies are required to make ob[n] delegate to item,
> So where `x.item( -1 )` and `x( -1)` each throw an error, how do you
> make x[ -1 ] do the same? You can't; not unless you use a proxy for
> specialized [[Get]] access, which then forwards the call to `item`.

That is true.  (Assuming you meant `x[-1]`.)  Unless you wanted to have
all 4 billion properties existing on the object. :)
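For illustration, here is what such a specialised [[Get]] could look like if written in script. This assumes the Proxy and Reflect objects from later editions of ECMAScript, which is not how host objects do it internally, and the collection with a throwing item is hypothetical.

```javascript
// Hypothetical collection whose item() throws for out-of-range indexes.
var target = {
  items: ["a", "b"],
  item: function (index) {
    if (index < 0 || index >= this.items.length) {
      throw new RangeError("index out of bounds");
    }
    return this.items[index];
  }
};

// Custom [[Get]]: no index properties exist on the object at all.
var x = new Proxy(target, {
  get: function (obj, name, receiver) {
    // Forward anything that looks like an integer, including "-1", to item.
    if (typeof name === "string" && /^-?\d+$/.test(name)) {
      return obj.item(Number(name));
    }
    // Everything else is an ordinary property lookup.
    return Reflect.get(obj, name, receiver);
  }
});

var second = x[1];   // "b"
// x[-1] now throws a RangeError, just as x.item(-1) does.
```

Nothing is defined on the object per index here, which is the contrast with the accessor-property approach: the trap sees every property name, in range or not.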

> Some implementations do this for certain collections but not others. It's
> wildly inconsistent. If it mattered at all, wouldn't we be seeing bugs
> on this and hearing implementors saying "no, it cannot happen" - ?

It is possible it matters; my guess was that it wouldn’t.  I haven’t
researched too deeply.  I am relying on implementors pushing back on how
this is defined, if it does matter.

> > as long as n is always the name of a property that exists on the object.
> > That’s how the spec defines it at the moment: whenever there’s a
> > “supported named property” (i.e., a value for n that is in range, as it
> > were) then an accessor property must be defined on the object, where the
> > getter calls item.
> >
> It seems you've done as Ian has and overloaded the term "range".
> Instead, I think you mean "a value for n that is < length" or "in the
> collection". And if I am right, I would rather that in the spec than
> an overloading of the term "range".

I don’t use the term “range” in the spec when talking about these
indexed properties, only in the above email.  The spec defines a term
“supported property index”:


If the word “range” is confusing in HTML5 there, then I suggest you
raise a bug to get it clarified.

> > It’s the “define a property on the object at the right time” bit that
> > could be tricky, though.
> >
> Easy for static collections; constructing native ES arrays is roughly
> the same process; why not make a method for internal use by
> implementations based roughly on that?
> <http://ecma262-5.com/ELS5_HTML.htm#Section_15.4.2>
> For live collections, it requires just writing precisely and
> succinctly what implementations do. Entirely possible, but not for me;
> not at this hour.

Yes, basically Web IDL says that as soon as an index becomes a supported
property index on the object, then a property needs to be defined on it.
Depending on what the collection is a collection of, it might be easy to
know when to do this, or it might not.
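To make the "right time" problem concrete, here is a sketch of a live collection that keeps its index properties in step with its contents. LiveCollection, add and removeLast are hypothetical names chosen for the example, and a real implementation would do this natively rather than in script.

```javascript
// Hypothetical live collection; all names here are illustrative only.
function LiveCollection() {
  var items = [];
  var defined = 0;   // how many index properties currently exist
  var self = this;

  this.item = function (index) {
    return index >= 0 && index < items.length ? items[index] : null;
  };

  // Bring the index properties in line with the supported property indexes.
  function sync() {
    while (defined < items.length) {
      (function (i) {
        Object.defineProperty(self, String(i), {
          get: function () { return self.item(i); },
          enumerable: true,
          configurable: true   // so it can be deleted when the index goes away
        });
      })(defined++);
    }
    while (defined > items.length) {
      delete self[String(--defined)];
    }
  }

  // Every mutation path has to remember to call sync().
  this.add = function (value) { items.push(value); sync(); };
  this.removeLast = function () { items.pop(); sync(); };
  Object.defineProperty(this, "length", {
    get: function () { return items.length; }
  });
}

var live = new LiveCollection();
live.add("a");
live.add("b");
var before = live[1];   // "b"
live.removeLast();
var after = live[1];    // undefined: the "1" property was removed again
```

The easy case is the one shown, where the collection owns its own mutations. The hard case is a collection whose membership changes as a side effect of something else, since there is then no single place to call sync from.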

Cameron McCormack ≝ http://mcc.id.au/
Received on Tuesday, 2 November 2010 22:16:07 GMT
