March 26, 20266 min read

Binary Search: More Than Just Finding a Number

How binary search works, why off-by-one errors are the real enemy, and the surprising places binary search shows up beyond sorted arrays.

binary-search algorithms interviews searching
Ad 336x280

Binary search has a reputation as a simple algorithm. The idea is dead simple — cut the search space in half each step. But writing a correct implementation from scratch, without bugs, on the first try? Surprisingly hard. There's a famous stat that only about 10% of professional programmers can write a bug-free binary search. The off-by-one errors are vicious.

The Core Idea

You have a sorted array. You're looking for a target value. Instead of scanning left to right (O(n)), you check the middle element. If it matches, great. If the target is larger, throw away the left half. If smaller, throw away the right half. Repeat.

Each step eliminates half the remaining elements. For an array of 1,000,000 elements, that's about 20 steps. For a billion elements, about 30.

Getting the Implementation Right

Here's a correct binary search in Python:

def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2  # avoids integer overflow
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

The tricky parts, and where bugs creep in:

lo <= hi vs lo < hi. Using <= means we check the last remaining element. Using < can miss it. The correct choice depends on how you update lo and hi, but <= with mid + 1 / mid - 1 updates is the most standard pattern. mid = lo + (hi - lo) // 2 vs mid = (lo + hi) // 2. They're mathematically equivalent, but (lo + hi) can overflow in languages with fixed-size integers (C, C++, Java). Python doesn't have this problem, but it's a good habit. Updating lo and hi. It must be lo = mid + 1 and hi = mid - 1, not lo = mid or hi = mid. If you use lo = mid, you can get infinite loops when lo == mid (which happens when lo + 1 == hi).

Variations That Come Up Constantly

Lower Bound (First Occurrence)

What if there are duplicates and you want the first position where target could be inserted to keep the array sorted?

def lower_bound(arr, target):
    lo, hi = 0, len(arr)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo

Notice the differences: hi starts at len(arr) (not len(arr) - 1), the condition is lo < hi (not lo <= hi), and we set hi = mid (not mid - 1). These aren't arbitrary — each change is necessary for correctness. Mess up any one of them and you get wrong answers on edge cases.

Upper Bound (First Element Greater Than Target)

def upper_bound(arr, target):
    lo, hi = 0, len(arr)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] <= target:
            lo = mid + 1
        else:
            hi = mid
    return lo

The only difference from lower bound: <= instead of < in the comparison. Subtle, but it shifts the boundary by exactly one position.

Search in Rotated Sorted Array

A classic interview problem. The array [4, 5, 6, 7, 0, 1, 2] was sorted, then rotated. You can still binary search it in O(log n) — you just need to figure out which half is sorted at each step and whether the target falls in that half.

def search_rotated(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] == target:
            return mid
        # Left half is sorted
        if arr[lo] <= arr[mid]:
            if arr[lo] <= target < arr[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        # Right half is sorted
        else:
            if arr[mid] < target <= arr[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1

This is one of those problems where getting the boundary conditions right requires careful thought. Each <= vs < is intentional.

Binary Search Beyond Sorted Arrays

Here's where it gets interesting. Binary search works whenever you have a monotonic condition — something that's false for all values below a threshold and true for all values above it (or vice versa).

Binary Search on the Answer

Problem: "What's the minimum capacity needed to ship all packages within D days?"

The answer is some number between 1 and sum(weights). For any given capacity, you can check in O(n) whether it's enough. And the feasibility is monotonic — if capacity C works, then C+1 also works.

def min_capacity(weights, days):
    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if can_ship(weights, days, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

This pattern — binary search on the answer space — shows up everywhere. Minimum speed to eat bananas in time, maximum minimum distance between cows, splitting arrays with minimum largest sum. The key insight is recognizing that the answer satisfies a monotonic predicate.

Other Unexpected Places

  • Git bisect uses binary search to find the commit that introduced a bug
  • Square root calculation — binary search between 0 and n for the value where x*x = n
  • Finding peaks in arrays — binary search based on whether you're on an ascending or descending slope
  • Median of two sorted arrays — one of the hardest binary search problems, searches for the correct partition point

Why Off-by-One Errors Are the Real Problem

The algorithm is intuitive. Everyone understands "cut in half, check, repeat." But the boundary handling has dozens of valid combinations (< vs <=, mid vs mid-1, inclusive vs exclusive endpoints), and mixing patterns from different conventions produces bugs that only manifest on specific edge cases — empty arrays, single elements, targets at the boundaries.

My advice: pick one convention, internalize it, and stick with it. The lo <= hi with mid+1/mid-1 convention for exact search, and the lo < hi with hi = mid convention for bound-finding, cover virtually all cases.

Practice writing binary search from scratch until you can do it without hesitation. CodeUp has a progression of binary search problems that starts with basic sorted array search and builds up to answer-space binary search and multi-dimensional variants. Getting these right under time pressure is a matter of muscle memory, not intelligence.

Ad 728x90