March 26, 20267 min read

Top K Problems — Heaps, QuickSelect, and Bucket Sort

Three approaches to top-K problems with different time complexities and tradeoffs. Know when to use each one in interviews.

top k heap quickselect sorting interview

"Find the K largest elements." "Find the Kth smallest." "Find the K most frequent." These all look different but they're the same problem wearing different clothes. You need to efficiently select K items based on some ordering.

There are three approaches worth knowing. Sorting works but is never the answer interviewers want. Heaps are the safe pick. QuickSelect is the optimal pick. Bucket sort is the clever pick when the value range is bounded.

Approach 1: Min-Heap of Size K

Want the K largest? Maintain a min-heap of size K. For each element: if the heap has fewer than K items, push. If the current element is larger than the heap's minimum, pop the min and push the current element.

After processing all elements, the heap contains the K largest, with the Kth largest at the top.

Python

import heapq

def top_k_largest(nums, k):
    heap = []
    for num in nums:
        heapq.heappush(heap, num)
        if len(heap) > k:
            heapq.heappop(heap)
    return sorted(heap, reverse=True)

Python's heapq is a min-heap. By keeping the heap at size K, the smallest of the K largest elements always sits at the top — and gets evicted if something bigger comes along.

Even simpler with nlargest:

def top_k_largest(nums, k):
    return heapq.nlargest(k, nums)

JavaScript

JavaScript has no built-in heap, so you either implement one or use a sorted insertion approach for small K:

function topKLargest(nums, k) {
  // Simple approach: sort (O(n log n))
  // For interviews, mention you'd use a heap
  const sorted = [...nums].sort((a, b) => b - a);
  return sorted.slice(0, k);
}

// With a proper MinHeap class:
function topKLargestHeap(nums, k) {
  const heap = new MinHeap();
  for (const num of nums) {
    heap.push(num);
    if (heap.size() > k) {
      heap.pop();
    }
  }
  return heap.toArray().sort((a, b) => b - a);
}

In a real interview, tell them you'd use a min-heap and explain the logic. Most interviewers won't ask you to implement the heap itself — but know how one works in case they do.

Why Min-Heap for Largest?

This confuses people. You want the K largest, so why a min-heap?

Because you need to quickly identify and remove the smallest of your K candidates. The min-heap's root is always the weakest element in your top-K set. When a new element arrives that's bigger, you swap out the weakest.

Conversely, for K smallest elements, use a max-heap.

Approach 2: QuickSelect

Average O(n). Based on the partition step from quicksort. Pick a pivot, partition the array so everything smaller is left and everything larger is right. If the pivot lands at position K, you're done. Otherwise, recurse into the relevant half.

Python

import random

def kth_largest(nums, k):
    k = len(nums) - k  # convert to kth smallest (0-indexed)

def quickselect(left, right):
        pivot_idx = random.randint(left, right)
        nums[pivot_idx], nums[right] = nums[right], nums[pivot_idx]
        pivot = nums[right]

store = left
        for i in range(left, right):
            if nums[i] < pivot:
                nums[i], nums[store] = nums[store], nums[i]
                store += 1

nums[store], nums[right] = nums[right], nums[store]

if store == k:
            return nums[store]
        elif store < k:
            return quickselect(store + 1, right)
        else:
            return quickselect(left, store - 1)

return quickselect(0, len(nums) - 1)

JavaScript

function kthLargest(nums, k) {
  const target = nums.length - k;

function quickselect(left, right) {
    const pivotIdx = left + Math.floor(Math.random() * (right - left + 1));
    [nums[pivotIdx], nums[right]] = [nums[right], nums[pivotIdx]];
    const pivot = nums[right];

let store = left;
    for (let i = left; i < right; i++) {
      if (nums[i] < pivot) {
        [nums[i], nums[store]] = [nums[store], nums[i]];
        store++;
      }
    }
    [nums[store], nums[right]] = [nums[right], nums[store]];

if (store === target) return nums[store];
    if (store < target) return quickselect(store + 1, right);
    return quickselect(left, store - 1);
  }

return quickselect(0, nums.length - 1);
}

Random pivot is essential. Without it, worst case is O(n^2) on sorted/nearly-sorted input. Random pivot gives O(n) expected time.

QuickSelect modifies the input array. If that's not allowed, copy first.

Approach 3: Bucket Sort (for Frequency Problems)

"Top K frequent elements" is the classic bucket sort application. The trick: create buckets indexed by frequency. Bucket i holds all elements that appear i times. Then walk the buckets from highest to lowest, collecting elements until you have K.

Python

def top_k_frequent(nums, k):
    from collections import Counter

count = Counter(nums)
    # buckets[i] = list of elements with frequency i
    buckets = [[] for _ in range(len(nums) + 1)]

for num, freq in count.items():
        buckets[freq].append(num)

result = []
    for i in range(len(buckets) - 1, 0, -1):
        for num in buckets[i]:
            result.append(num)
            if len(result) == k:
                return result

return result

JavaScript

function topKFrequent(nums, k) {
  const count = new Map();
  for (const num of nums) {
    count.set(num, (count.get(num) || 0) + 1);
  }

const buckets = Array.from({ length: nums.length + 1 }, () => []);
  for (const [num, freq] of count) {
    buckets[freq].push(num);
  }

const result = [];
  for (let i = buckets.length - 1; i > 0; i--) {
    for (const num of buckets[i]) {
      result.push(num);
      if (result.length === k) return result;
    }
  }

return result;
}

O(n) time, O(n) space. No sorting, no heap. The constraint that makes this work: frequencies can't exceed n, so you can use array indices as frequency values.

When to Use Which

Approach	Time (avg)	Time (worst)	Space	Best When
Sort + slice	O(n log n)	O(n log n)	O(n)	K ≈ n, or simplicity matters
Min-heap of size K	O(n log k)	O(n log k)	O(k)	K << n, streaming data
QuickSelect	O(n)	O(n^2)	O(1)*	Kth element (not top K sorted)
Bucket sort	O(n)	O(n)	O(n)	Frequency-based, bounded range

*QuickSelect space is O(1) if you modify in-place; O(log n) for recursion stack on average.

The Streaming Data Angle

Heaps shine when data arrives as a stream and you don't have it all upfront. QuickSelect needs the full array. Bucket sort needs the full frequency distribution. But a heap of size K can process elements one at a time, evicting the weakest at each step.

If an interviewer says "what if the data doesn't fit in memory?" the answer is heap. Process chunks, maintain a heap of K elements, and you only need O(K) memory regardless of input size.

Classic Problems

Kth Largest Element in an Array — QuickSelect for O(n) average, or heap for O(n log k). This is LeetCode 215 and probably the most-asked top-K problem. Top K Frequent Elements — Bucket sort for O(n), or heap for O(n log k). LeetCode 347. K Closest Points to Origin — Compute distances, then it's a standard top-K. Use a max-heap of size K (evict the farthest of your K closest). LeetCode 973. Sort Characters by Frequency — Count frequencies, then bucket sort. LeetCode 451. Find K Pairs with Smallest Sums — Min-heap with lazy expansion. Push the first pair, then expand by incrementing one index at a time. LeetCode 373.

Common Mistakes

Using a max-heap for K largest. A max-heap gives you the overall largest first, but you'd have to extract K times — that's O(n + k log n) for heapify + extractions. A min-heap of size K is O(n log k), which is better when K << n. Forgetting the random pivot in QuickSelect. Deterministic pivot choice degrades to O(n^2) on adversarial input. Always randomize. Using sorting when they want better than O(n log n). If the interviewer asks for optimization after you sort, they want either heap or QuickSelect. Read the room. Not clarifying "top K" vs "Kth." "Find the Kth largest" needs one element. "Find the top K largest" needs K elements, possibly sorted. QuickSelect naturally solves the first; you need extra work for the second.

Practice these at CodeUp — the frequency-based variants are particularly popular in interviews because they combine data structure choice with algorithmic thinking.