Iteration performance - tips and tricks

In this post I'll go over some common performance mistakes I see in code that iterates over data. I'll show you how to fix them and share some other tips and tricks to make iteration fast 🏃‍♂️‍➡️.

💡

Note: I mostly write TypeScript so I'm using JavaScript / TypeScript as examples but most of these ideas apply across different languages.

I’ll start by saying that I only recommend optimising your code for performance if any of the following are true:

You’re actually experiencing performance issues.
You’re dealing with large amounts of data (hundreds or thousands of items) or data of an unknown size.
You anticipate performance being a bottleneck due to domain-specific considerations (e.g. you know your users are on old, slow devices)

On the other hand, it doesn’t hurt to consider performance up-front and get into good habits. Good performance doesn’t always come at the cost of something else.

Only iterate once

Probably the most common mistake I see is code that performs several steps of iteration in a chain.

const birdNames = animals
  .filter(animal => animal.category === “bird”)
  .map(animal => animal.name);

Each iteration, map then filter, has a cost. You can reduce the cost by only iterating once.

const birdNames: string[] = [];

for (const animal of animals) {
  if (animal.category === “bird”)
    birdNames.push(animal.name);
  )
}

💡

I find myself using for-loops way more often than array methods when I’m writing code for performance. They’re a lot more flexible and, in many cases, make things easier to read. I recommend embracing them!

Consider order of operation

When you do need to iterate in multiple steps, the order in which you do things can have an effect on performance. Here we’re sorting and filtering a list:

const fishBySize = animals
  .sort((a, b) => a.size - b.size)
  .filter(animal => animal.category === “fish”);

The cost of filtering doesn’t change if it’s done before or after a sort, but sorting gets cheaper as the list gets smaller so we should filter first, then sort.

const fishBySize = animals
  .filter(animal => animal.category === “fish”)
  .sort((a, b) => a.size - b.size);

Exit early if possible

Finding a single item in a list is cheaper than finding multiple matching elements because you can leave early when you’ve found it.

Array methods like Array.find() and Array.some() do this automatically, so use them where possible instead if Array.filter().

// bad
const hasMonkeys = animals
  .filter(animal => animal.subCategory === “monkey”)
  .length > 0;

// good
const hasMonkeys = animals
  .some(animal => animal.subCategory === “monkey”);

Avoid nested loops with a lookup map

Sometimes we need to look up data in other places when iterating. This often happens when you have a list of IDs and need to know something about the full entity.

// takes a lost of animal IDs and returns the ones that belong to a mammal
function onlyMammalIds(ids: string[]) {
  const animals = getAnimals();
  const mammalIds: string[] = [];

  // outer loop
  for (const id of ids) {
    // inner loop
    const animal = animals.find(animal => animal.id === id);

    if(animal.category === “mammal”) {
      mammalIds.push(id);
    }
  }

  return mammalIds;
}

We have an outer loop where we iterate over ids. For each iteration there's an inner loop where we iterate over animals with animals.find(). This is called a nested loop.

💡

In "Big O" notation, a nested loop is described as O(n^2). I’m not a computer science graduate, but getting a basic understanding of Big O notation has helped me think more clearly about the performance of the code I write.

In this case we can avoid a nested loop by creating a Map of animals so they can be looked up by ID.

// takes a lost of animal IDs and returns the ones that belong to a mammal
function onlyMammalIds(ids: string[]) {
  const animals = getAnimals();

  // create an ID lookup
  const animalsById = new Map<Animal>();  
  for (const animal of animals) {
    animalsById.set(animal.id, animal);
  }

  const mammalIds: string[] = [];

  for (const id of ids) {
    // it's now cheaper to get the full entity
    const animal = animalsById.get(id);

    if(animal.category === “mammal”) {
      mammalIds.push(id);
    }
  }

  return mammalIds;
}

Although we incur the up-front cost of iterating over animals to produce the lookup Map, we save time in each iteration of ids which adds up quickly.

💡

In Big O Notation we've improved the performance from O(n^2) to O(n).

Use binary search on large, sorted lists

Binary search is a method for efficiently navigating a list of data where we know the sort order. It can make some operations much more efficient:

Checking if an item is in a sorted list (and finding its position)
Adding a new item to a sorted list
Deleting an item from a sorted list

Let's say we have a user "Alice" with a list of friends. The friends are sorted by ID. Alice is popular so she has a lot of friends and we wan't to know if she's friends with "Harry".

A sub-optimal way to do this is to iterate over Alice's friends until we find Harry.

const areAliceAndHarryFriends = !!alice.friends.find(
  friend => friend.id === harry.id
);

If Alice has 1000 friends and Harry is her 300th friend then we'd have to check 300 records before we find Harry. Even worse, if they're not friends we'd have to check all 1000 records.

Because Alice's list of friends are sorted in a predictable way we can do the same thing more efficiently in two steps:

Use binary search to find the index in Alice's friend list where Harry should be.
Check whether the item at that index is Harry.

Harry can only be in one position in a correctly sorted list. If he's there then he's Alice's friend, otherwise he's not.

import sortedIndexOf from "lodash/sortedIndexOf";

const harryIndex = sortedIndexOf(
  alice.friends,      // the sorted list
  harry,              // the item we're looking for 
  friend => friend.id // tell lodash the list is sorted by ID
);

if (alice.friends[harryIndex].id === harry.id) {
  console.log("Alice is friends with Harry");
} else {
  console.log("Alice is not friends with Harry");
}

How does binary search work?

Binary search is a "Divide and conquer" algorithm which works by continually cutting the list in half until it finds what it's looking for. It follows these steps:

Compare Harry to the item that's at the halfway point in the list (index 499).
Harry sorts lower than the item at 499 so cut halfway again and compare Harry that to the item at index 249.
Harry sorts higher than the item at 249 so cut halfway between 249 and 499
...continue until you find the exact position where Harry should sit

In the exact case where Alice has 1000 friends and Harry is the 300th, we can find Harry by only performing 10 comparisons, instead of the 300 we required with linear search.

💡

In Big O notation, binary search is an example of O(log n). This means, as the list grows, the search becomes exponentially more efficient in comparison to the size of the list.

Conclusion

I don't really have a conclusion to this one 🤔. If you found any of these helpful, or if you think I missed any obvious tips, let me know in the comments 👇