Sorry about the weird title of this blog post. Not sure what else to call it.
I have a function that recursively traverses the file system. You can iterate over this function to do something with each file found on disk. Silly example:
let count = 0;
for (const filePath of walker("/lots/of/files/here")) {
  count += filePath.length;
}
The implementation looks like this:
const fs = require("fs");
const path = require("path");

function* walker(root) {
  const files = fs.readdirSync(root);
  for (const name of files) {
    const filepath = path.join(root, name);
    const isDirectory = fs.statSync(filepath).isDirectory();
    if (isDirectory) {
      // Recurse, re-yielding everything the sub-walk produces.
      yield* walker(filepath);
    } else {
      yield filepath;
    }
  }
}
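One nice property of this version, incidentally, is that it's lazy: the consumer can break out of the loop early and the rest of the tree is simply never visited. A minimal sketch (the `.txt` check is just an arbitrary condition I made up):

// Find the first .txt file and bail out; directories that
// haven't been entered yet are never read at all.
let firstTxt = null;
for (const filePath of walker("/lots/of/files/here")) {
  if (filePath.endsWith(".txt")) {
    firstTxt = filePath;
    break; // closes the generator; the pending recursion never runs
  }
}
console.log(firstTxt);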
But I wondered: is it faster to not use a generator function, since there might be overhead in suspending the generator and resuming whatever does something with each yielded value? A pure big-array function looks like this:
function walker(root) {
  const files = fs.readdirSync(root);
  const all = [];
  for (const name of files) {
    const filepath = path.join(root, name);
    const isDirectory = fs.statSync(filepath).isDirectory();
    if (isDirectory) {
      all.push(...walker(filepath));
    } else {
      all.push(filepath);
    }
  }
  return all;
}
It produces the same result.
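A quick way to convince yourself of that: collect the generator's output into an array and compare it with the big-array version's output. A sketch (since both versions are called walker, I'm pretending here that they're named walkerGen and walkerArray):

const assert = require("assert");

const fromGenerator = [...walkerGen("/lots/of/files/here")];
const fromArray = walkerArray("/lots/of/files/here");
// Both traverse in the same readdirSync order, so the arrays
// should match element for element.
assert.deepStrictEqual(fromGenerator, fromArray);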
It's hard to measure this precisely, but I pointed it at some large directory with many files and did something trivial with each path, just so the loop body actually does work:
// SEARCH_ROOT is the directory to scan; point it somewhere big.
const SEARCH_ROOT = "/lots/of/files/here";

const label = "generator";
console.time(label);
let count = 0;
for (const filePath of walker(SEARCH_ROOT)) {
  count += filePath.length;
}
console.timeEnd(label);
const heapBytes = process.memoryUsage().heapUsed;
console.log(`HEAP: ${(heapBytes / 1024.0).toFixed(1)}KB`);
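To make the repetition less manual, a small harness can run the walk several times and report the median wall-clock time. A sketch (the run count of 10 is an arbitrary choice of mine):

function medianWalkTime(root, runs = 10) {
  const times = [];
  for (let i = 0; i < runs; i++) {
    const start = process.hrtime.bigint();
    let count = 0;
    for (const filePath of walker(root)) {
      count += filePath.length; // same trivial work as above
    }
    const elapsedNs = process.hrtime.bigint() - start;
    times.push(Number(elapsedNs) / 1e9); // seconds
  }
  times.sort((a, b) => a - b);
  return times[Math.floor(times.length / 2)];
}

console.log(`median: ${medianWalkTime(SEARCH_ROOT).toFixed(2)}s`);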
I ran it a bunch of times (a harness like the one above gives the same picture). After a while, the numbers settle and you get:
- Generator function: (median time) 1.74s
- Big array function: (median time) 1.73s
In other words, no speed difference.
Obviously, building up a massive array in memory will increase the heap usage. Taking a heapUsed snapshot at the end of each run and printing it, you can see that...
- Generator function: (median heap memory) 4.9MB
- Big array function: (median heap memory) 13.9MB
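One caveat on those heap numbers: heapUsed includes garbage that simply hasn't been collected yet. If you start Node with the --expose-gc flag, you can force a collection before sampling, which should give steadier readings. A sketch (the script name is whatever your benchmark file is called):

// Run with: node --expose-gc your-benchmark.js
if (global.gc) {
  global.gc(); // collect garbage so heapUsed reflects live objects
}
const heapBytes = process.memoryUsage().heapUsed;
console.log(`HEAP: ${(heapBytes / 1024.0 / 1024.0).toFixed(1)}MB`);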
Conclusion
The potential suspend/resume ("swap") overhead of a Node generator function is absolutely minuscule, at least in contexts similar to mine, where the filesystem calls (readdirSync, statSync) dominate the runtime anyway.
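If you want to put a number on the suspend/resume cost in isolation, away from any filesystem work, a synthetic micro-benchmark along these lines gets at it. A sketch (the count of 10 million is arbitrary, and results will vary across Node versions):

const N = 10_000_000;

function* genNumbers() {
  for (let i = 0; i < N; i++) yield i;
}

function arrayNumbers() {
  const all = [];
  for (let i = 0; i < N; i++) all.push(i);
  return all;
}

console.time("generator");
let g = 0;
for (const n of genNumbers()) g += n;
console.timeEnd("generator");

console.time("array");
let a = 0;
for (const n of arrayNumbers()) a += n;
console.timeEnd("array");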
And it's not unexpected that the generator function uses less heap memory: it never builds up a big array at all.