The Prefix Sum Technique: Simple Idea, Surprisingly Powerful
Master the prefix sum technique for coding interviews: range queries, subarray sums, 2D prefix sums, difference arrays, and product arrays.
The prefix sum is one of those techniques that feels too simple to be useful — until you realize it shows up everywhere. The core idea: precompute cumulative sums so that any range sum query becomes O(1). That's the whole technique. But the problems it solves are surprisingly varied.
If you've ever thought "I need the sum of a subarray, but recomputing it each time is too slow," prefix sums are the answer.
The Core Idea
Given an array nums, the prefix sum array prefix is defined as:
prefix[0] = 0
prefix[i] = nums[0] + nums[1] + ... + nums[i-1]
The sum of any subarray nums[left..right] (inclusive) is:
sum(left, right) = prefix[right + 1] - prefix[left]
That's it. Build the prefix array once in O(n), then answer any range sum query in O(1).
def build_prefix_sum(nums):
prefix = [0] * (len(nums) + 1)
for i in range(len(nums)):
prefix[i + 1] = prefix[i] + nums[i]
return prefix
Why does this work? prefix[right + 1] is the sum from index 0 to right. prefix[left] is the sum from index 0 to left - 1. Subtracting gives you the sum from left to right.
nums: [3, 1, 4, 1, 5, 9]
prefix: [0, 3, 4, 8, 9, 14, 23]
sum(2, 4) = prefix[5] - prefix[2] = 14 - 4 = 10
= 4 + 1 + 5 = 10 ✓
When You'll See Prefix Sums
Anytime a problem involves:
- Sum of a subarray
- Count of subarrays with a specific sum
- Range queries on an array
- Cumulative operations
And the brute force would recalculate the sum from scratch for each query, giving you O(n) per query when you need O(1).
Problem: Range Sum Query (Immutable)
Given an array, answer multiple range sum queries efficiently.
class NumArray:
def __init__(self, nums):
self.prefix = [0] * (len(nums) + 1)
for i in range(len(nums)):
self.prefix[i + 1] = self.prefix[i] + nums[i]
def sum_range(self, left, right):
return self.prefix[right + 1] - self.prefix[left]
O(n) preprocessing, O(1) per query. Without prefix sums, each query is O(n), and if you have Q queries, that's O(n * Q). With prefix sums, it's O(n + Q).
Problem: Subarray Sum Equals K
Given an array of integers and a target k, find the total number of contiguous subarrays whose sum equals k.
This is the classic prefix sum + hash map combination.
def subarray_sum(nums, k):
count = 0
current_sum = 0
prefix_counts = {0: 1} # sum -> how many times we've seen it
for num in nums:
current_sum += num
# if (current_sum - k) was a prefix sum we've seen before,
# then the subarray between then and now sums to k
count += prefix_counts.get(current_sum - k, 0)
prefix_counts[current_sum] = prefix_counts.get(current_sum, 0) + 1
return count
The insight: if prefix[j] - prefix[i] = k, then the subarray from index i to j-1 sums to k. So for each position j, we need to know how many previous prefix sums equal prefix[j] - k. A hash map tracks this.
O(n) time, O(n) space.
Why do we initialize prefix_counts = {0: 1}? Because a prefix sum of 0 exists before the array starts. If current_sum itself equals k, we need to count the subarray from the beginning — and current_sum - k = 0 must be in the map.
This pattern — prefix sum + hash map — is extremely common. Memorize it.
Variation: Subarray Sum Divisible by K
Count subarrays whose sum is divisible by k.
def subarrays_div_by_k(nums, k):
count = 0
current_sum = 0
remainder_counts = {0: 1}
for num in nums:
current_sum += num
remainder = current_sum % k
# handle negative remainders
remainder = (remainder + k) % k
count += remainder_counts.get(remainder, 0)
remainder_counts[remainder] = remainder_counts.get(remainder, 0) + 1
return count
Same idea, but track remainders instead of exact sums. Two prefix sums with the same remainder mod k means the subarray between them has a sum divisible by k.
The (remainder + k) % k handles negative numbers — in Python, -1 % 5 = 4 already, but this makes the intent explicit and works across languages.
Problem: Contiguous Array (Equal 0s and 1s)
Given a binary array, find the maximum length of a contiguous subarray with equal numbers of 0s and 1s.
The trick: convert 0s to -1s. Now the problem becomes "find the longest subarray with sum 0."
def find_max_length(nums):
current_sum = 0
max_len = 0
first_occurrence = {0: -1}
for i, num in enumerate(nums):
current_sum += 1 if num == 1 else -1
if current_sum in first_occurrence:
max_len = max(max_len, i - first_occurrence[current_sum])
else:
first_occurrence[current_sum] = i
return max_len
Instead of counting how many prefix sums equal a target, we track the first occurrence of each prefix sum. If we see the same prefix sum again, the subarray between the two positions sums to 0 — and we want the longest such subarray, so we use the earliest occurrence.
Problem: Product of Array Except Self
Given an array, return an array where each element is the product of all other elements. Do it without division and in O(n).
def product_except_self(nums):
n = len(nums)
result = [1] * n
# prefix products (left to right)
left_product = 1
for i in range(n):
result[i] = left_product
left_product *= nums[i]
# suffix products (right to left)
right_product = 1
for i in range(n - 1, -1, -1):
result[i] *= right_product
right_product *= nums[i]
return result
The idea: for each index i, the answer is the product of everything to its left times the product of everything to its right. Build prefix products and suffix products separately. The first pass fills result[i] with the left product. The second pass multiplies by the right product.
This is technically a prefix product (and suffix product), but the technique is identical to prefix sums.
Why can't we use division? Because the array might contain zeros. Division by zero is undefined, and handling it requires special cases that make the code messy.
2D Prefix Sums
The prefix sum idea extends naturally to two dimensions. Given a matrix, compute the sum of any rectangular subregion in O(1) after O(m * n) preprocessing.
Building the 2D Prefix Sum
def build_2d_prefix(matrix):
rows, cols = len(matrix), len(matrix[0])
prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
for r in range(1, rows + 1):
for c in range(1, cols + 1):
prefix[r][c] = (matrix[r-1][c-1]
+ prefix[r-1][c]
+ prefix[r][c-1]
- prefix[r-1][c-1])
return prefix
prefix[r][c] = sum of all elements in the submatrix from (0,0) to (r-1, c-1). We use inclusion-exclusion to avoid double-counting.
Querying a Rectangle
Sum of elements in the rectangle from (r1, c1) to (r2, c2):
def region_sum(prefix, r1, c1, r2, c2):
return (prefix[r2+1][c2+1]
- prefix[r1][c2+1]
- prefix[r2+1][c1]
+ prefix[r1][c1])
Again, inclusion-exclusion. Draw it on paper: the full rectangle minus the top strip minus the left strip plus the corner (which was subtracted twice).
Problem: Range Sum Query 2D (Immutable)
class NumMatrix:
def __init__(self, matrix):
rows, cols = len(matrix), len(matrix[0])
self.prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
for r in range(1, rows + 1):
for c in range(1, cols + 1):
self.prefix[r][c] = (matrix[r-1][c-1]
+ self.prefix[r-1][c]
+ self.prefix[r][c-1]
- self.prefix[r-1][c-1])
def sum_region(self, r1, c1, r2, c2):
return (self.prefix[r2+1][c2+1]
- self.prefix[r1][c2+1]
- self.prefix[r2+1][c1]
+ self.prefix[r1][c1])
Problem: Count Square Submatrices with All Ones
Given a binary matrix, count the total number of square submatrices containing all 1s.
While this can be solved with DP directly, the 2D prefix sum approach lets you check if any rectangle is all 1s in O(1) — useful in related problems.
def count_squares(matrix):
rows, cols = len(matrix), len(matrix[0])
dp = [[0] * cols for _ in range(rows)]
total = 0
for r in range(rows):
for c in range(cols):
if matrix[r][c] == 1:
if r == 0 or c == 0:
dp[r][c] = 1
else:
dp[r][c] = min(dp[r-1][c], dp[r][c-1], dp[r-1][c-1]) + 1
total += dp[r][c]
return total
This is actually a DP approach where dp[r][c] represents the side length of the largest square ending at (r, c). Each cell contributes dp[r][c] squares (of sizes 1x1, 2x2, ..., dp[r][c] x dp[r][c]).
The Difference Array Technique
The prefix sum's inverse: if prefix sums turn a sequence into cumulative sums, the difference array turns a cumulative sequence back into point changes.
Use case: You have multiple range update operations (add a value to all elements in a range), and you want to apply them all efficiently.Instead of updating every element in the range (O(n) per update), mark only the start and end of each update in a difference array (O(1) per update), then compute the prefix sum once to get the final array.
def apply_range_updates(length, updates):
"""
updates: list of (left, right, value) — add value to all elements in [left, right]
"""
diff = [0] * (length + 1)
for left, right, val in updates:
diff[left] += val
if right + 1 < length:
diff[right + 1] -= val
# build result from difference array
result = [0] * length
result[0] = diff[0]
for i in range(1, length):
result[i] = result[i-1] + diff[i]
return result
Each update is O(1): add val at left, subtract val at right + 1. The prefix sum of the difference array gives the actual values.
Problem: Car Pooling
Given a list of trips (numPassengers, from, to) and a vehicle capacity, determine if all trips can be completed.
def car_pooling(trips, capacity):
# maximum location value
max_loc = max(t[2] for t in trips)
diff = [0] * (max_loc + 1)
for passengers, start, end in trips:
diff[start] += passengers
diff[end] -= passengers # passengers get off at 'end'
current = 0
for d in diff:
current += d
if current > capacity:
return False
return True
Problem: Corporate Flight Bookings
There are n flights. Booking [first, last, seats] means seats are reserved on flights first through last (1-indexed). Return the total seats reserved on each flight.
def corp_flight_bookings(bookings, n):
diff = [0] * (n + 1)
for first, last, seats in bookings:
diff[first - 1] += seats
if last < n:
diff[last] -= seats
result = [0] * n
result[0] = diff[0]
for i in range(1, n):
result[i] = result[i-1] + diff[i]
return result
Problem: Find Pivot Index
Find the index where the sum of elements to the left equals the sum of elements to the right.
def pivot_index(nums):
total = sum(nums)
left_sum = 0
for i, num in enumerate(nums):
if left_sum == total - left_sum - num:
return i
left_sum += num
return -1
You don't even need to build a prefix array explicitly. Track the running left sum. The right sum is total - left_sum - nums[i]. If they're equal, that's the pivot.
Problem: Running Sum of 1D Array
Given an array, return the running sum.
def running_sum(nums):
for i in range(1, len(nums)):
nums[i] += nums[i - 1]
return nums
This is literally just building a prefix sum array. It's an easy problem, but it establishes the foundation.
Problem: Maximum Subarray Sum (Kadane's Algorithm)
While Kadane's algorithm isn't usually framed as a prefix sum technique, it's closely related. The maximum subarray sum ending at index i is related to prefix sums: the maximum subarray sum ending at i equals prefix[i+1] - min(prefix[0..i]).
def max_subarray(nums):
max_sum = nums[0]
current_sum = 0
for num in nums:
current_sum = max(num, current_sum + num)
max_sum = max(max_sum, current_sum)
return max_sum
Kadane's is O(n) time, O(1) space. The prefix sum version would be: compute prefix sums, then for each index find the minimum prefix sum before it. The difference is the maximum subarray sum. Both approaches are O(n), but Kadane's is simpler to implement.
Common Mistakes
Off-by-one in the prefix array. The prefix array has lengthn + 1, with prefix[0] = 0. The sum from index i to j is prefix[j+1] - prefix[i]. Mixing up j and j+1 is the most common bug.
Forgetting the initial value in the hash map. For "subarray sum equals K" problems, you must initialize the hash map with {0: 1}. Without this, you miss subarrays that start at index 0.
Not handling negative numbers. Prefix sums work fine with negative numbers (unlike sliding window). Don't try to use sliding window for subarray sum problems with negative values — use prefix sums instead.
Integer overflow in 2D prefix sums. In languages with fixed-size integers (C++, Java), the prefix sum can overflow for large matrices with large values. Use long/int64.
Confusing prefix sum with difference array. Prefix sum answers range queries. Difference array applies range updates. They're inverse operations. If you need both queries and updates, consider a segment tree or BIT.
Practice Problems
Basic prefix sum: Running Sum, Range Sum Query (Immutable), Find Pivot Index Prefix sum + hash map: Subarray Sum Equals K, Continuous Subarray Sum, Contiguous Array, Subarray Sums Divisible by K Product variant: Product of Array Except Self 2D prefix sum: Range Sum Query 2D, Matrix Block Sum, Count Square Submatrices Difference array: Car Pooling, Corporate Flight Bookings, Range AdditionWork through these on CodeUp — prefix sums are deceptively simple, but the hash map combination makes them powerful enough to solve problems that seem much harder than they are.