Lazy vs Eager Algorithms

Warning: I have not tested any code snippet below. Please let me know if you find a bug.

Introduction

Most algorithms have multiple valid implementations. For instance, in a binary tree problem, you have multiple ways of handling NULL nodes. I'm currently writing Beyond Cracking the Coding Interview (beyondctci.com), which means that my co-authors and I need to take a stance on what version of each algorithm to use. Ideally, we want to show the simplest version of each algorithm.

In the book, we don't claim that the version we show is "the best" - we say to use the one that works best for you. But showing one in the book is an implicit endorsement.

One particular decision that comes up again and again with recursive algorithms is choosing between the lazy version and the eager version of an algorithm. In the eager version, the caller validates an input before passing it to the recursive call. In the lazy version, the recursive function accepts any input and starts by checking whether it is valid.

What 'valid' means depends on the algorithm--we'll see plenty of examples. We'll also translate the concept of eager vs lazy to iterative algorithms.

Lazy vs Eager Tree Traversals

An eager tree traversal eagerly validates that the children are not NULL before passing them to the recursive function. A lazy tree traversal doesn't, so it needs to check if the current node is NULL before accessing it.

For instance, here is eager vs lazy preorder traversal:

class Node:
  def __init__(self, val, left=None, right=None):
    self.val = val
    self.left = left
    self.right = right

def preorder_traversal_eager(root):
  res = []

  # CANNOT be called with node == None
  def visit(node):
    res.append(node.val)
    if node.left:
      visit(node.left)
    if node.right:
      visit(node.right)

  if not root:
    return []
  visit(root)
  return res

def preorder_traversal_lazy(root):
  res = []

  # CAN be called with node == None
  def visit(node):
    if not node:
      return
    res.append(node.val)
    visit(node.left)
    visit(node.right)

  visit(root)
  return res
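
As a quick sanity check, both versions should produce the same output on the same tree. For example (this example is mine, untested):

tree = Node(1, Node(2, Node(4)), Node(3))
print(preorder_traversal_eager(tree))  # [1, 2, 4, 3]
print(preorder_traversal_lazy(tree))   # [1, 2, 4, 3]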

Both have the same runtime and space analysis. Even the constant factors probably don't change much, so it comes down to style preference. Which one do you prefer?

Lazy vs Eager Graph DFS

An eager graph DFS eagerly checks that the neighbors are not already visited before passing them to the recursive function. A lazy graph DFS doesn't, so it needs to check if the current node is already visited.

# Returns all nodes reachable from start
def dfs_eager(adj_lists, start):
  res = []
  visited = set()

  def visit(node):
    res.append(node)
    for neighbor in adj_lists[node]:
      if neighbor not in visited:
        visited.add(neighbor)
        visit(neighbor)

  visited.add(start)
  visit(start)
  return res

def dfs_lazy(adj_lists, start):
  res = []
  visited = set()

  def visit(node):
    if node in visited:
      return
    visited.add(node)
    res.append(node)
    for neighbor in adj_lists[node]:
      visit(neighbor)

  visit(start)
  return res

For a graph DFS, we can also do a mix between lazy and eager: we can eagerly check if nodes are already visited, and lazily mark them as visited:

def dfs_mixed(adj_lists, start):
  res = []
  visited = set()

  def visit(node):
    visited.add(node)
    res.append(node)
    for neighbor in adj_lists[node]:
      if neighbor not in visited:
        visit(neighbor)

  visit(start)
  return res

Again, they all have the same analysis. Which one do you prefer?
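
For reference, these functions assume an adjacency-list representation. A quick usage example (mine, untested):

adj_lists = {
  'a': ['b', 'c'],
  'b': ['a', 'c'],
  'c': ['a', 'b'],
}
print(dfs_eager(adj_lists, 'a'))  # ['a', 'b', 'c']
print(dfs_lazy(adj_lists, 'a'))   # ['a', 'b', 'c']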

Lazy vs Eager Grid Algorithms

Consider the same DFS algorithm but on a grid of 0's and 1's. The 0's are walkable cells, the 1's are obstacles, and walkable cells next to each other are connected. This time, we also need to check that the neighbors are not out of bounds, which we can do lazily or eagerly.

# Returns all cells reachable from (start_row, start_col).
def grid_dfs_eager(grid, start_row, start_col):
  nr, nc = len(grid), len(grid[0])
  res = []
  visited = set()

  def visit(row, col):
    res.append((row, col))
    for dir in ((-1, 0), (1, 0), (0, 1), (0, -1)):
      r, c = row + dir[0], col + dir[1]
      if 0 <= r < nr and 0 <= c < nc and grid[r][c] == 0 and (r, c) not in visited:
        visited.add((r, c))
        visit(r, c)

  # Assumes (start_row, start_col) is within bounds and walkable
  visited.add((start_row, start_col))
  visit(start_row, start_col)
  return res

def grid_dfs_lazy(grid, start_row, start_col):
  nr, nc = len(grid), len(grid[0])
  res = []
  visited = set()

  def visit(row, col):
    if row < 0 or row >= nr or col < 0 or col >= nc or grid[row][col] == 1:
      return
    if (row, col) in visited:
      return
    visited.add((row, col))
    res.append((row, col))
    for dir in ((-1, 0), (1, 0), (0, 1), (0, -1)):
      visit(row + dir[0], col + dir[1])

  visit(start_row, start_col)
  return res
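
A quick usage example (mine, untested), where both versions find the same cells:

grid = [
  [0, 0, 1],
  [1, 0, 1],
  [0, 0, 0],
]
print(grid_dfs_eager(grid, 0, 0))  # All six walkable cells.
print(grid_dfs_lazy(grid, 0, 0))   # The same cells.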

Lazy vs Eager Memoization DP

In a lazy memoization DP (Dynamic Programming) algorithm, we call the recursive function for a subproblem without checking first if we have already computed that subproblem. In an eager algorithm, we only call the recursive function for subproblems that we still need to compute.

# Returns the n-th Fibonacci number.
def fibonacci_eager(n):
  memo = {}

  def fib_rec(i):
    if i <= 1:
      return 1
    if i-1 in memo:
      prev = memo[i-1]
    else:
      prev = fib_rec(i-1)
    if i-2 in memo:
      prevprev = memo[i-2]
    else:
      prevprev = fib_rec(i-2)
    memo[i] = prev + prevprev
    return memo[i]

  return fib_rec(n)

def fibonacci_lazy(n):
  memo = {}

  def fib_rec(i):
    if i <= 1:
      return 1
    if i in memo:
      return memo[i]
    memo[i] = fib_rec(i-1) + fib_rec(i-2)
    return memo[i]

  return fib_rec(n)

For memoization DP, I think lazy is cleaner and more conventional.
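
As an aside (my addition, not from the book): in Python 3.9+, the functools.cache decorator gives you the lazy pattern for free, since it consults the memo before every call:

from functools import cache

@cache
def fib(i):
  # The decorator checks the memo before each call: the lazy pattern.
  if i <= 1:
    return 1
  return fib(i-1) + fib(i-2)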

Lazy vs Eager Iterative Tree Traversals

Consider a level-order traversal on a binary tree. A level-order traversal is an iterative algorithm that uses a queue data structure.

from collections import deque

def level_order_traversal_eager(root):
  if not root:
    return []
  res = []
  Q = deque([root])
  while Q:
    node = Q.popleft()
    res.append(node.val)
    if node.left:
      Q.append(node.left)
    if node.right:
      Q.append(node.right)
  return res

def level_order_traversal_lazy(root):
  res = []
  Q = deque([root])
  while Q:
    node = Q.popleft()
    if not node:
      continue
    res.append(node.val)
    Q.append(node.left)
    Q.append(node.right)
  return res

Eager Graph BFS is better than Lazy Graph BFS

This is the first exception where one is better than the other in terms of big O analysis. The lazy BFS allows adding already-visited nodes to the queue, while the eager one does not. We'll first look at the two versions, and then analyze them.

def graph_bfs_eager(adj_lists, start):
  res = []
  visited = set()
  visited.add(start)
  Q = deque([start])

  while Q:
    node = Q.popleft()
    res.append(node)
    for neighbor in adj_lists[node]:
      if neighbor not in visited:
        visited.add(neighbor)
        Q.append(neighbor)
  return res

def graph_bfs_lazy(adj_lists, start):
  res = []
  visited = set()
  Q = deque([start])

  while Q:
    node = Q.popleft()
    if node in visited:
      continue
    visited.add(node)
    res.append(node)
    for neighbor in adj_lists[node]:
      Q.append(neighbor)
  return res

It may come as a surprise that, unlike all the previous examples, these two are not equivalent.

Let's say V is the number of nodes and E is the number of edges. To keep things simple, assume the graph is connected, meaning that E is at least V-1 and at most O(V^2).

Both versions take O(E) time. The difference is in the space complexity: the eager version takes O(V) space because we never have the same node twice in the queue. The lazy version takes O(E) space because we allow the same nodes multiple times in the queue.

To see this, consider a complete graph:

(Figure: a complete graph with nodes start, A, B, C, D, E.)

  1. When we visit start, we add A, B, C, D, E to the queue. Now the queue is: [A, B, C, D, E]
  2. When we visit A, we add start, B, C, D, E to the queue. Now the queue is: [B, C, D, E, start, B, C, D, E]
  3. When we visit B, we add start, A, C, D, E to the queue. Now the queue is: [C, D, E, start, B, C, D, E, start, A, C, D, E]
  4. And so on.

By the time we finish popping the nodes added as neighbors of the start node, we've done V queue pops and V^2 queue appends, so the queue size is O(V^2).
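
To make the gap concrete, here is a sketch (mine, untested) that instruments both BFS versions to track the largest queue size seen on a complete graph:

from collections import deque

def max_queue_lazy(adj_lists, start):
  visited, Q, worst = set(), deque([start]), 1
  while Q:
    node = Q.popleft()
    if node in visited:
      continue
    visited.add(node)
    for neighbor in adj_lists[node]:
      Q.append(neighbor)
    worst = max(worst, len(Q))
  return worst

def max_queue_eager(adj_lists, start):
  visited, Q, worst = {start}, deque([start]), 1
  while Q:
    node = Q.popleft()
    for neighbor in adj_lists[node]:
      if neighbor not in visited:
        visited.add(neighbor)
        Q.append(neighbor)
    worst = max(worst, len(Q))
  return worst

n = 100
complete = {u: [v for v in range(n) if v != u] for u in range(n)}
print(max_queue_lazy(complete, 0))   # On the order of n^2.
print(max_queue_eager(complete, 0))  # At most n-1.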

So, why didn't this happen for the other lazy algorithms we have seen? In the recursive lazy DFS, the call stack plays the role of the queue, but every frame on the stack (except the one just entered) corresponds to a distinct node that has already been marked visited, so the stack holds O(V) frames at a time. In the lazy level-order traversal, each node adds at most two entries to the queue, so the queue holds O(V) entries. Only in a lazy graph BFS can O(E) duplicates pile up while waiting to be discarded.

Eager Dijkstra is better than Lazy Dijkstra, but harder to implement

I wrote extensively about different Dijkstra implementations in this Dijkstra blog post.

Dijkstra is similar to BFS, with the main difference that it uses a priority queue (PQ) instead of a queue to visit the nodes that are closer first (in terms of shortest paths).

In BFS, when a node is added to the queue, its distance from the starting node is already established and there is never a reason to add it again to the queue. In Dijkstra, when a node is added to the PQ, we might later find a shorter path while it is still in the PQ. When that happens, we can do two things:

  1. Lazy: push a new entry with the improved distance and leave the outdated entry in the PQ; if we later pop an entry for an already-visited node, we discard it.
  2. Eager: update the node's existing entry in the PQ in place (a decrease_key() operation), so that each node appears in the PQ at most once.

Both versions take O(E*log V) time, but eager is more space efficient, analogously to eager BFS: O(V) for eager Dijkstra vs O(E) for lazy Dijkstra.

Here is lazy Dijkstra:

import math
from collections import defaultdict
from heapq import heappush, heappop

def dijkstra_lazy(adj_lists, start):
  dist = defaultdict(lambda: math.inf)
  dist[start] = 0
  visited = set()
  PQ = [(0, start)]
  while PQ:
    _, node = heappop(PQ)  # Only need the node, not the distance.
    if node in visited:
      continue  # Not the first extraction.
    visited.add(node)
    for neighbor, weight in adj_lists[node]:
      if dist[node]+weight < dist[neighbor]:
        dist[neighbor] = dist[node]+weight
        # Neighbor may already be in the PQ; we add it anyway.
        heappush(PQ, (dist[neighbor], neighbor))
  return dist

Unfortunately, eager Dijkstra is not so easy to implement in Python: heaps are missing a decrease_key() operation, and Python does not have a built-in self-balancing BST data structure (which can also be used for eager Dijkstra). You can see a BST-based C++ implementation in my other blog post.
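
That said, if you are willing to use the third-party sortedcontainers package as a stand-in for a balanced BST, here is a minimal sketch of eager Dijkstra (mine, untested; it reuses the imports from dijkstra_lazy above):

from sortedcontainers import SortedList  # Third-party, stands in for a BST.

def dijkstra_eager(adj_lists, start):
  dist = defaultdict(lambda: math.inf)
  dist[start] = 0
  tree = SortedList([(0, start)])  # Each node appears at most once: O(V) space.
  while tree:
    d, node = tree.pop(0)  # Extract the closest unvisited node.
    for neighbor, weight in adj_lists[node]:
      if d + weight < dist[neighbor]:
        if dist[neighbor] != math.inf:
          # The 'decrease key' step: remove the outdated entry.
          tree.remove((dist[neighbor], neighbor))
        dist[neighbor] = d + weight
        tree.add((dist[neighbor], neighbor))
  return dist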

The dijkstra_lazy() algorithm above is more or less standard and it has been known as "lazy Dijkstra" for a while. However, it is possible to make an even lazier version which has the same runtime and space analysis (but likely bigger constant factors). The idea is that instead of only adding to the PQ the neighbors for whom we find an improved distance, we can simply add all of them, and discard duplicates once we extract them from the PQ:

def dijkstra_super_lazy(adj_lists, start):
  dist = defaultdict(lambda: math.inf)
  PQ = [(0, start)]
  while PQ:
    d, node = heappop(PQ)
    if dist[node] != math.inf:
      continue  # This node was already finalized; stale entry.
    dist[node] = d
    for neighbor, weight in adj_lists[node]:
      heappush(PQ, (dist[node]+weight, neighbor))
  return dist
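
For reference, these Dijkstra variants assume a weighted adjacency-list format. A quick usage example (mine, untested):

adj_lists = {
  'a': [('b', 2), ('c', 5)],
  'b': [('a', 2), ('c', 1)],
  'c': [('a', 5), ('b', 1)],
}
print(dict(dijkstra_lazy(adj_lists, 'a')))  # {'a': 0, 'b': 2, 'c': 3}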

So, Lazy or Eager?

We could keep looking at lazy vs eager algorithms, but I'll stop here. In aggregate, these are the pros and cons that I see:

Pros of lazy algorithms

Pros of eager algorithms

(Figure: evolution of the level-order traversal queue.)

My preference

Here are my personal preferences for coding interviews (not those of the other authors of 'Beyond Cracking the Coding Interview'):

Strong preferences:

Weak preferences:

In the book, we'll definitely mention that some algorithms can be implemented in a lazy or eager way (in way less detail than here), and that you should choose the one that feels easier to you. But, we still need to pick one to show in the problem solutions. One idea is trying to be consistent throughout (e.g., doing all tree and graph traversals in an eager way). If you have an opinion on which one is better, please reach out! I'd love to hear it.