Async Generators in the Wild

This post is inspired by my answer to a Stack Overflow question about how to recursively read all the files in a directory in node. It shows how (recursive) async generator functions can be applied to a real-world use case.

What are Async Iterators?

Async iterators are an upcoming feature of JavaScript that fills an important language gap: What happens when you combine an async function with a generator function?

Async functions return promises, generator functions return iterators, so what does an async generator function return? An iterator of promises!

An async generator function is declared with both the async and function* keywords:

const timeout = t => new Promise(r => setTimeout(r, t)); // resolves after t milliseconds

async function* foo() {
  yield 1;
  await timeout(1000);
  yield 2;
  await timeout(1000);
  yield 3;
}
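
Calling foo() doesn’t run the body right away; it returns an async iterator whose next method returns promises of { value, done } pairs. A quick sketch of driving it by hand:

const iter = foo();
iter.next().then(res => console.log(res)); // { value: 1, done: false }
iter.next().then(res => console.log(res)); // { value: 2, done: false }, about a second later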

Sweet! So how can we use it for something… useful?

Recursively Reading All Files From a Directory

Conceptually, all the files in a directory are iterable. However, in order to find them we’ll have to perform asynchronous operations on the file system. Unless we want to do all the work upfront, we’re better off with an async iterator that returns a result whenever the necessary async operations complete.

The implementation below works in node 11+ without any additional flags:

const { resolve } = require('path');
const { readdir, stat } = require('fs').promises;

// recursively yields the (absolute) paths of all files below rootPath
async function* getFiles(rootPath) {
  const fileNames = await readdir(rootPath);
  for (const fileName of fileNames) {
    const path = resolve(rootPath, fileName);
    if ((await stat(path)).isDirectory()) {
      yield* getFiles(path);
    } else {
      yield path;
    }
  }
}

As with normal generator functions, we can use the yield* keyword to delegate to another (async) generator, which makes for an elegant, recursive implementation of the directory traversal.

It’s efficient in the sense that it only descends as far into the directory tree as you ask it to. Unlike observables, async iterators are pull-based: code only runs when the next method of the underlying async iterator is called.
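
As a quick sketch, pulling just the first file only triggers the file system calls needed to find it:

const filesIter = getFiles('.'); // nothing has been read yet
filesIter.next().then(({ value }) => {
  // only the readdir/stat calls needed to reach the first file have run
  console.log(value);
});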

Consuming Async Iterators

A somewhat awkward way of consuming async iterators, one that uses no special language features, looks like this:

// calls f with each value the async iterator produces, one at a time
function forAwait(asyncIter, f) {
  asyncIter.next().then(({ done, value }) => {
    if (done) return;
    f(value);
    forAwait(asyncIter, f);
  });
}

forAwait(getFiles('.'), x => console.log(x));

As we can see, async iterators really are just objects with a next method that returns a promise of a { value, done } pair.

While researching this post, I was initially surprised that the above code does not exceed the call stack limit. It isn’t tail call optimization at work, though: since the recursive call happens inside a .then callback, each invocation runs on a fresh call stack after the previous one has already returned.

However, the idiomatic way of consuming async iterators is via the new for-await-of loop. The following code produces the same result as the code above:

for await (const x of getFiles('.')) console.log(x);

In order for for await to work in node, it has to be wrapped inside an async function, for example an immediately invoked async function expression, (async () => { /* ... */ })(), aka an IIAFE.
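
Wrapped up like that, the loop above becomes:

(async () => {
  for await (const x of getFiles('.')) console.log(x);
})();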

With this, we can upgrade our forAwait function to become sort of an asynchronous reduce:

async function reduce(asyncIter, f, init) {
  let res = init;
  for await (const x of asyncIter) {
    res = await f(res, x);
  }
  return res;
}

With our new reduce function, we can “consume” an entire async iterator and push the results into an array:

const toArray = iter => reduce(iter, (a, x) => (a.push(x), a), []);
const files = await toArray(getFiles('.')); 

This is similar to a traditional getFiles function that crawls an entire directory and returns all the files wrapped in a promise (see my original answer on Stack Overflow).
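
reduce isn’t limited to building arrays, either. As a quick sketch (reusing the stat import from above), we can count the files or sum up their sizes without ever holding the whole list in memory:

// count all files without collecting them
const count = await reduce(getFiles('.'), n => n + 1, 0);

// sum up all file sizes; the reducer itself may be async
const totalSize = await reduce(
  getFiles('.'),
  async (sum, path) => sum + (await stat(path)).size,
  0
);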

Stream-like Processing

If we’re about to process the files in some way, why wait until we’ve found them all before we continue? With async iterables we can run code as soon as the first result comes in:

async function* toObj(filesIter) {
  for await (const name of filesIter) yield { name };
}

async function* addFileSize(objIter) {
  for await (const obj of objIter) {
    yield { ...obj, size: (await stat(obj.name)).size };
  }
}

async function* addIsJS(objIter) {
  for await (const obj of objIter) {
    yield { ...obj, isJS: obj.name.endsWith('.js') };
  }
}

for await (const { name, size, isJS } of addFileSize(addIsJS(toObj(getFiles('.'))))) {
  console.log(`${name} (${size} bytes) ${isJS ? 'JS' : 'NOJS'}`);
}

To be fair, the code above is not very real-worldy. Also, the a(b(c()))-style nesting is not very readable. Hopefully either a function bind operator or a pipe operator will make this more practical in the future.
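
Until then, a small hand-rolled pipe helper (not part of any library) can at least flatten the nesting:

// composes the generator transforms left-to-right instead of nesting them
const pipe = (...fns) => x => fns.reduce((acc, fn) => fn(acc), x);

const pipeline = pipe(toObj, addIsJS, addFileSize);
for await (const { name, size, isJS } of pipeline(getFiles('.'))) {
  console.log(`${name} (${size} bytes) ${isJS ? 'JS' : 'NOJS'}`);
}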

For now, this is how async iterators can be processed and combined to form new async iterators. It is similar to how stream processing works. In fact, both node and browser streams implement (or are about to implement) the async iterator interface, so they can be consumed in this way.
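
For example, node’s readable streams expose the async iterator interface in recent versions, so a file stream can be consumed with the same for await loop. A sketch, assuming a file ./example.txt exists:

const { createReadStream } = require('fs');

for await (const chunk of createReadStream('./example.txt', { encoding: 'utf8' })) {
  console.log(`read a chunk of ${chunk.length} characters`);
}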