Iterators in JS 1.7: Cool stuff you can’t use yet

Posted in Development, JavaScript, Web

Today’s wonk porn is about iterators, one of JavaScript 1.7’s cool new features, and how they relate to polymorphism. Grab a recent nightly build of Firefox 2.0 to play along.

Enumerables and Polymorphism

In Prototype, each Enumerable contains an _each method (note the underscore) that announces how it’s supposed to be iterated over. For instance, here’s the default _each method for arrays:

Array.prototype._each = function(iterator) {
  for (var i = 0; i < this.length; i++)
    iterator(this[i]);
};

All of the methods in Enumerable rely upon each, and each relies upon an _each method like this to do the actual iteration.

Thus, when you call each on an Enumerable, it invokes the _each method on the object without needing to know anything about the object itself. This is the heart of polymorphism: being able to use the same code on different types of things.
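To make that delegation concrete, here’s a stripped-down sketch. The Enumerable below is a simplified stand-in I wrote for illustration (not Prototype’s actual source), and Range is a made-up type whose only job is to supply its own _each:

```javascript
// Simplified stand-in for Prototype's Enumerable (NOT the library's real code).
var Enumerable = {
  each: function(iterator) {
    this._each(iterator); // the host object decides how to walk itself
    return this;
  },
  // Built on each, so it works for anything that supplies _each.
  map: function(fn) {
    var results = [];
    this.each(function(value) { results.push(fn(value)); });
    return results;
  }
};

// Hypothetical non-array type: an inclusive numeric range.
function Range(start, end) {
  this.start = start;
  this.end = end;
}
Range.prototype._each = function(iterator) {
  for (var i = this.start; i <= this.end; i++)
    iterator(i);
};

// Mix Enumerable in by copying its methods (roughly what Object.extend does).
for (var method in Enumerable)
  Range.prototype[method] = Enumerable[method];

var doubled = new Range(1, 3).map(function(n) { return n * 2; });
// doubled is [2, 4, 6]
```

Notice that map never touches start, end, or length. It only knows that each exists, and each only knows that _each exists.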

For instance, you can extend (“mix in”) Enumerable onto HTMLCollection, the array‐like object returned by DOM methods like getElementsByTagName. (Properties like childNodes return a NodeList instead, which would need the same treatment on NodeList.prototype.)

HTMLCollection.prototype._each = function(iterator) {
  for (var i = 0; i < this.length; i++)
    iterator(this[i]);
};

Object.extend(HTMLCollection.prototype, Enumerable);

(Of course, it wouldn’t work in IE, but that’s another story.)

Generators & Iterators in JavaScript 1.7

A much‐needed overhaul to iteration is found in generators/iterators, a faithful translation of Python’s iterators to JavaScript.

Generators act as a sort of state machine. They’re just like ordinary functions, but they don’t return values — they yield them:

var countToTen = function() {
  for (var i = 1; i <= 10; i++)
    yield i;
};
var counter = countToTen();
counter.next();
// => 1
counter.next();
// => 2
counter.next();
// => 3
// [and so on, until...]
counter.next();
// => 10
counter.next();
// => [error StopIteration]

You create a generator by calling the function and holding on to the object it returns; none of the function body runs yet. Then you can ask the generator to spit out a value by calling its next method. Each time you call next, it’ll pick up right where it left off, yielding the next item in the series.

When there’s nothing left to yield, it throws a StopIteration exception to announce that it’s spent.
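If you want to see what that protocol amounts to without the yield keyword, here’s a hand-rolled sketch in plain JavaScript. The StopIteration object below is a local stand-in I’ve defined myself (the real one is a global that only JS 1.7 engines provide), and the closure plays the role of the generator’s saved state:

```javascript
// Local stand-in for the StopIteration global of JS 1.7 engines (an assumption
// made so this runs anywhere).
var StopIteration = { toString: function() { return 'StopIteration'; } };

// Hand-written equivalent of countToTen: the closure remembers where we were.
function countToTen() {
  var i = 0;
  return {
    next: function() {
      if (i >= 10) throw StopIteration; // spent: announce it
      i += 1;
      return i;
    }
  };
}

// Draining it by hand means looping until the exception shows up.
var counter = countToTen();
var seen = [];
try {
  while (true)
    seen.push(counter.next());
} catch (e) {
  if (e !== StopIteration) throw e; // anything else is a real error
}
// seen now holds the numbers 1 through 10
```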

But wouldn’t it be annoying to have to manage this stuff in your code? Yes. That’s where iterators come in.

An iterator is any object with a next method that throws StopIteration when it runs dry, and a generator already qualifies. So we can use a for..in loop on the above generator and not have to worry about calling next or catching StopIteration:

var counter = countToTen();
for (var i in counter) { console.log(i); }

In JS1.7, a for..in loop knows to behave differently when it finds a generator on the right side. This is great news — for..in has been repaired in a way that preserves backwards‐compatibility.

Custom Iterators

It gets better. All objects will have a default iterator assigned to the non‐enumerable property __iterator__. When an object is iterated over in a for..in loop, it’ll use its __iterator__ property to report its iterable values — much like Prototype’s _each.

For instance, Array.prototype.__iterator__ knows to ignore all non‐numeric keys (meaning for..in will eventually be the proper way to enumerate arrays!), but you can replace any __iterator__ with your own custom version.

Let’s say we wanted to give HTMLCollections this treatment: we want for..in to ignore properties like length, item, and namedItem, and only feed us items that are part of the collection. Such an iterator would look something like this:

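var DOMElementIterator = function(nodeList, keysOnly) {
  var node;
  for (var i = 0, len = nodeList.length; i < len; i++) {
    node = nodeList[i];
    // nodeType 3 is a text node; skip those
    if (node.nodeType !== 3)
      yield keysOnly ? i : [i, node];
  }
  // Redundant but harmless: falling off the end of a generator
  // throws StopIteration automatically, as countToTen showed.
  throw StopIteration;
};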

The second argument (which I’ve named keysOnly in this example) is a boolean: if true, the iterator gives you just the keys; otherwise each step yields a [key, value] pair. That pair lets you use destructuring assignment, another cool new thing in JS 1.7:

var nodes = document.getElementsByTagName('p');
for (var [key, value] in nodes) { /* some stuff */ }

To fetch an object’s iterator, you’d do Iterator(nodes). If you wanted a keys‐only iterator, you’d call Iterator(nodes, true).

So if we wanted to use our fancy custom iterator, we could write:

var nodes = document.getElementsByTagName('p');
for (var [key, value] in DOMElementIterator(nodes)) { /* some stuff */ }
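If you don’t have a JS 1.7 engine handy, here’s a rough translation into ES2015 syntax, which is where this idea eventually landed in standardized form. Treat it as a sketch under a few assumptions: function* and for...of replace the bare yield and for..in used above, and fakeNodes is a hypothetical stand-in for an HTMLCollection so the code runs without a DOM.

```javascript
// ES2015 translation of DOMElementIterator. A bare `yield` inside a plain
// function is not valid in current engines; function* is required.
function* domElementIterator(nodeList, keysOnly) {
  for (let i = 0, len = nodeList.length; i < len; i++) {
    const node = nodeList[i];
    if (node.nodeType !== 3) // skip text nodes
      yield keysOnly ? i : [i, node];
  }
  // No explicit end signal needed: returning finishes the generator.
}

// Hypothetical stand-in for an HTMLCollection (not a real DOM object).
const fakeNodes = {
  length: 3,
  0: { nodeType: 1, tagName: 'P' },
  1: { nodeType: 3, data: 'stray text' },
  2: { nodeType: 1, tagName: 'P' },
  item: function(i) { return this[i]; }
};

const pairs = [];
for (const [key, value] of domElementIterator(fakeNodes)) // for...of, not for...in
  pairs.push(key + ':' + value.tagName);
// pairs is ['0:P', '2:P']

const keys = Array.from(domElementIterator(fakeNodes, true));
// keys is [0, 2]
```

Either way, length and item never show up in the loop, and neither does the text node at index 1.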

But if you wanted to get creative, you’d replace the default iterator:

var nodes = document.getElementsByTagName('p');

nodes.__iterator__ = function(keysOnly) {
  return DOMElementIterator(this, keysOnly);
};

for (var [key, value] in nodes) { /* some stuff */ }
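For comparison, the per-object hook that ES2015 later standardized is Symbol.iterator rather than __iterator__, and it takes no keysOnly argument. A minimal sketch, assuming a current engine and using a made-up array-like object in place of a real collection:

```javascript
// Hypothetical array-like stand-in for `nodes` (not a real HTMLCollection).
const nodes = {
  length: 3,
  0: { nodeType: 1, tagName: 'P' },
  1: { nodeType: 3 },
  2: { nodeType: 1, tagName: 'DIV' }
};

// Per-object override: for...of looks up Symbol.iterator on the object.
nodes[Symbol.iterator] = function* () {
  for (let i = 0; i < this.length; i++)
    if (this[i].nodeType !== 3) // skip text nodes
      yield [i, this[i]];
};

const tags = [];
for (const [key, value] of nodes)
  tags.push(key + ':' + value.tagName);
// tags is ['0:P', '2:DIV']
```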

Or, if you wanted to get recklessly fancy, you could even override the iterator for all instances of HTMLCollection:

HTMLCollection.prototype.__iterator__ = function(keysOnly) {
  if (this == HTMLCollection.prototype)
    return Object.prototype.__iterator__.apply(this, arguments);
  else return DOMElementIterator(this, keysOnly);
};

Why did I include a check for HTMLCollection.prototype? Because if I need to do for (i in HTMLCollection.prototype), it’ll look to HTMLCollection.prototype.__iterator__ to get the enumerable properties. And I don’t want it to use my custom iterator, so I tell it to swap in the default Object iterator.

Of course, modifying the iterators of built‐ins is a bad idea. It’d be like changing what Array.push means. But JavaScript 2.0 will sport packages and other features for having code “islands” where you can do this sort of thing without screwing up other scripts that might rely on the default behavior.

…But it’s Firefox‐only, right?

Sadly, yes, at least for now. There’s no guarantee that any of the major browsers will adopt this feature before JavaScript 2.0 (though if anyone does it’ll be Opera, I think). And Firefox 2.0 isn’t even released yet. But once it is, Greasemonkey fiends and extension developers will have a new toy.
