March 26, 202612 min read

The Prefix Sum Technique: Simple Idea, Surprisingly Powerful

Master the prefix sum technique for coding interviews: range queries, subarray sums, 2D prefix sums, difference arrays, and product arrays.

prefix-sum arrays interviews algorithms technique

The prefix sum is one of those techniques that feels too simple to be useful — until you realize it shows up everywhere. The core idea: precompute cumulative sums so that any range sum query becomes O(1). That's the whole technique. But the problems it solves are surprisingly varied.

If you've ever thought "I need the sum of a subarray, but recomputing it each time is too slow," prefix sums are the answer.

The Core Idea

Given an array nums, the prefix sum array prefix is defined as:

prefix[0] = 0
prefix[i] = nums[0] + nums[1] + ... + nums[i-1]

The sum of any subarray nums[left..right] (inclusive) is:

sum(left, right) = prefix[right + 1] - prefix[left]

That's it. Build the prefix array once in O(n), then answer any range sum query in O(1).

def build_prefix_sum(nums):
    prefix = [0] * (len(nums) + 1)
    for i in range(len(nums)):
        prefix[i + 1] = prefix[i] + nums[i]
    return prefix

Why does this work? prefix[right + 1] is the sum from index 0 to right. prefix[left] is the sum from index 0 to left - 1. Subtracting gives you the sum from left to right.

nums:    [3, 1, 4, 1, 5, 9]
prefix:  [0, 3, 4, 8, 9, 14, 23]

sum(2, 4) = prefix[5] - prefix[2] = 14 - 4 = 10
           = 4 + 1 + 5 = 10  ✓

When You'll See Prefix Sums

Anytime a problem involves:

Sum of a subarray

Count of subarrays with a specific sum

Range queries on an array

Cumulative operations

And the brute force would recalculate the sum from scratch for each query, giving you O(n) per query when you need O(1).

Problem: Range Sum Query (Immutable)

Given an array, answer multiple range sum queries efficiently.

class NumArray:
    def __init__(self, nums):
        self.prefix = [0] * (len(nums) + 1)
        for i in range(len(nums)):
            self.prefix[i + 1] = self.prefix[i] + nums[i]

def sum_range(self, left, right):
        return self.prefix[right + 1] - self.prefix[left]

O(n) preprocessing, O(1) per query. Without prefix sums, each query is O(n), and if you have Q queries, that's O(n * Q). With prefix sums, it's O(n + Q).

Problem: Subarray Sum Equals K

Given an array of integers and a target k, find the total number of contiguous subarrays whose sum equals k.

This is the classic prefix sum + hash map combination.

def subarray_sum(nums, k):
    count = 0
    current_sum = 0
    prefix_counts = {0: 1}  # sum -> how many times we've seen it

for num in nums:
        current_sum += num
        # if (current_sum - k) was a prefix sum we've seen before,
        # then the subarray between then and now sums to k
        count += prefix_counts.get(current_sum - k, 0)
        prefix_counts[current_sum] = prefix_counts.get(current_sum, 0) + 1

return count

The insight: if prefix[j] - prefix[i] = k, then the subarray from index i to j-1 sums to k. So for each position j, we need to know how many previous prefix sums equal prefix[j] - k. A hash map tracks this.

O(n) time, O(n) space.

Why do we initialize prefix_counts = {0: 1}? Because a prefix sum of 0 exists before the array starts. If current_sum itself equals k, we need to count the subarray from the beginning — and current_sum - k = 0 must be in the map.

This pattern — prefix sum + hash map — is extremely common. Memorize it.

Variation: Subarray Sum Divisible by K

Count subarrays whose sum is divisible by k.

def subarrays_div_by_k(nums, k):
    count = 0
    current_sum = 0
    remainder_counts = {0: 1}

for num in nums:
        current_sum += num
        remainder = current_sum % k
        # handle negative remainders
        remainder = (remainder + k) % k
        count += remainder_counts.get(remainder, 0)
        remainder_counts[remainder] = remainder_counts.get(remainder, 0) + 1

return count

Same idea, but track remainders instead of exact sums. Two prefix sums with the same remainder mod k means the subarray between them has a sum divisible by k.

The (remainder + k) % k handles negative numbers — in Python, -1 % 5 = 4 already, but this makes the intent explicit and works across languages.

Problem: Contiguous Array (Equal 0s and 1s)

Given a binary array, find the maximum length of a contiguous subarray with equal numbers of 0s and 1s.

The trick: convert 0s to -1s. Now the problem becomes "find the longest subarray with sum 0."

def find_max_length(nums):
    current_sum = 0
    max_len = 0
    first_occurrence = {0: -1}

for i, num in enumerate(nums):
        current_sum += 1 if num == 1 else -1

if current_sum in first_occurrence:
            max_len = max(max_len, i - first_occurrence[current_sum])
        else:
            first_occurrence[current_sum] = i

return max_len

Instead of counting how many prefix sums equal a target, we track the first occurrence of each prefix sum. If we see the same prefix sum again, the subarray between the two positions sums to 0 — and we want the longest such subarray, so we use the earliest occurrence.

Problem: Product of Array Except Self

Given an array, return an array where each element is the product of all other elements. Do it without division and in O(n).

def product_except_self(nums):
    n = len(nums)
    result = [1] * n

# prefix products (left to right)
    left_product = 1
    for i in range(n):
        result[i] = left_product
        left_product *= nums[i]

# suffix products (right to left)
    right_product = 1
    for i in range(n - 1, -1, -1):
        result[i] *= right_product
        right_product *= nums[i]

return result

The idea: for each index i, the answer is the product of everything to its left times the product of everything to its right. Build prefix products and suffix products separately. The first pass fills result[i] with the left product. The second pass multiplies by the right product.

This is technically a prefix product (and suffix product), but the technique is identical to prefix sums.

Why can't we use division? Because the array might contain zeros. Division by zero is undefined, and handling it requires special cases that make the code messy.

2D Prefix Sums

The prefix sum idea extends naturally to two dimensions. Given a matrix, compute the sum of any rectangular subregion in O(1) after O(m * n) preprocessing.

Building the 2D Prefix Sum

def build_2d_prefix(matrix):
    rows, cols = len(matrix), len(matrix[0])
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]

for r in range(1, rows + 1):
        for c in range(1, cols + 1):
            prefix[r][c] = (matrix[r-1][c-1]
                          + prefix[r-1][c]
                          + prefix[r][c-1]
                          - prefix[r-1][c-1])

return prefix

prefix[r][c] = sum of all elements in the submatrix from (0,0) to (r-1, c-1). We use inclusion-exclusion to avoid double-counting.

Querying a Rectangle

Sum of elements in the rectangle from (r1, c1) to (r2, c2):

def region_sum(prefix, r1, c1, r2, c2):
    return (prefix[r2+1][c2+1]
          - prefix[r1][c2+1]
          - prefix[r2+1][c1]
          + prefix[r1][c1])

Again, inclusion-exclusion. Draw it on paper: the full rectangle minus the top strip minus the left strip plus the corner (which was subtracted twice).

Problem: Range Sum Query 2D (Immutable)

class NumMatrix:
    def __init__(self, matrix):
        rows, cols = len(matrix), len(matrix[0])
        self.prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
        for r in range(1, rows + 1):
            for c in range(1, cols + 1):
                self.prefix[r][c] = (matrix[r-1][c-1]
                                   + self.prefix[r-1][c]
                                   + self.prefix[r][c-1]
                                   - self.prefix[r-1][c-1])

def sum_region(self, r1, c1, r2, c2):
        return (self.prefix[r2+1][c2+1]
              - self.prefix[r1][c2+1]
              - self.prefix[r2+1][c1]
              + self.prefix[r1][c1])

Problem: Count Square Submatrices with All Ones

Given a binary matrix, count the total number of square submatrices containing all 1s.

While this can be solved with DP directly, the 2D prefix sum approach lets you check if any rectangle is all 1s in O(1) — useful in related problems.

def count_squares(matrix):
    rows, cols = len(matrix), len(matrix[0])
    dp = [[0] * cols for _ in range(rows)]
    total = 0

for r in range(rows):
        for c in range(cols):
            if matrix[r][c] == 1:
                if r == 0 or c == 0:
                    dp[r][c] = 1
                else:
                    dp[r][c] = min(dp[r-1][c], dp[r][c-1], dp[r-1][c-1]) + 1
                total += dp[r][c]

return total

This is actually a DP approach where dp[r][c] represents the side length of the largest square ending at (r, c). Each cell contributes dp[r][c] squares (of sizes 1x1, 2x2, ..., dp[r][c] x dp[r][c]).

The Difference Array Technique

The prefix sum's inverse: if prefix sums turn a sequence into cumulative sums, the difference array turns a cumulative sequence back into point changes.

Use case: You have multiple range update operations (add a value to all elements in a range), and you want to apply them all efficiently.

Instead of updating every element in the range (O(n) per update), mark only the start and end of each update in a difference array (O(1) per update), then compute the prefix sum once to get the final array.

def apply_range_updates(length, updates):
    """
    updates: list of (left, right, value) — add value to all elements in [left, right]
    """
    diff = [0] * (length + 1)

for left, right, val in updates:
        diff[left] += val
        if right + 1 < length:
            diff[right + 1] -= val

# build result from difference array
    result = [0] * length
    result[0] = diff[0]
    for i in range(1, length):
        result[i] = result[i-1] + diff[i]

return result

Each update is O(1): add val at left, subtract val at right + 1. The prefix sum of the difference array gives the actual values.

Problem: Car Pooling

Given a list of trips (numPassengers, from, to) and a vehicle capacity, determine if all trips can be completed.

def car_pooling(trips, capacity):
    # maximum location value
    max_loc = max(t[2] for t in trips)
    diff = [0] * (max_loc + 1)

for passengers, start, end in trips:
        diff[start] += passengers
        diff[end] -= passengers  # passengers get off at 'end'

current = 0
    for d in diff:
        current += d
        if current > capacity:
            return False

return True

Problem: Corporate Flight Bookings

There are n flights. Booking [first, last, seats] means seats are reserved on flights first through last (1-indexed). Return the total seats reserved on each flight.

def corp_flight_bookings(bookings, n):
    diff = [0] * (n + 1)
    for first, last, seats in bookings:
        diff[first - 1] += seats
        if last < n:
            diff[last] -= seats

result = [0] * n
    result[0] = diff[0]
    for i in range(1, n):
        result[i] = result[i-1] + diff[i]

return result

Problem: Find Pivot Index

Find the index where the sum of elements to the left equals the sum of elements to the right.

def pivot_index(nums):
    total = sum(nums)
    left_sum = 0

for i, num in enumerate(nums):
        if left_sum == total - left_sum - num:
            return i
        left_sum += num

return -1

You don't even need to build a prefix array explicitly. Track the running left sum. The right sum is total - left_sum - nums[i]. If they're equal, that's the pivot.

Problem: Running Sum of 1D Array

Given an array, return the running sum.

def running_sum(nums):
    for i in range(1, len(nums)):
        nums[i] += nums[i - 1]
    return nums

This is literally just building a prefix sum array. It's an easy problem, but it establishes the foundation.

Problem: Maximum Subarray Sum (Kadane's Algorithm)

While Kadane's algorithm isn't usually framed as a prefix sum technique, it's closely related. The maximum subarray sum ending at index i is related to prefix sums: the maximum subarray sum ending at i equals prefix[i+1] - min(prefix[0..i]).

def max_subarray(nums):
    max_sum = nums[0]
    current_sum = 0

for num in nums:
        current_sum = max(num, current_sum + num)
        max_sum = max(max_sum, current_sum)

return max_sum

Kadane's is O(n) time, O(1) space. The prefix sum version would be: compute prefix sums, then for each index find the minimum prefix sum before it. The difference is the maximum subarray sum. Both approaches are O(n), but Kadane's is simpler to implement.

Common Mistakes

Off-by-one in the prefix array. The prefix array has length n + 1, with prefix[0] = 0. The sum from index i to j is prefix[j+1] - prefix[i]. Mixing up j and j+1 is the most common bug. Forgetting the initial value in the hash map. For "subarray sum equals K" problems, you must initialize the hash map with {0: 1}. Without this, you miss subarrays that start at index 0. Not handling negative numbers. Prefix sums work fine with negative numbers (unlike sliding window). Don't try to use sliding window for subarray sum problems with negative values — use prefix sums instead. Integer overflow in 2D prefix sums. In languages with fixed-size integers (C++, Java), the prefix sum can overflow for large matrices with large values. Use long/int64. Confusing prefix sum with difference array. Prefix sum answers range queries. Difference array applies range updates. They're inverse operations. If you need both queries and updates, consider a segment tree or BIT.

Practice Problems

Basic prefix sum: Running Sum, Range Sum Query (Immutable), Find Pivot Index Prefix sum + hash map: Subarray Sum Equals K, Continuous Subarray Sum, Contiguous Array, Subarray Sums Divisible by K Product variant: Product of Array Except Self 2D prefix sum: Range Sum Query 2D, Matrix Block Sum, Count Square Submatrices Difference array: Car Pooling, Corporate Flight Bookings, Range Addition

Work through these on CodeUp — prefix sums are deceptively simple, but the hash map combination makes them powerful enough to solve problems that seem much harder than they are.