Top K Problems — Heaps, QuickSelect, and Bucket Sort
Three approaches to top-K problems with different time complexities and tradeoffs. Know when to use each one in interviews.
"Find the K largest elements." "Find the Kth smallest." "Find the K most frequent." These all look different but they're the same problem wearing different clothes. You need to efficiently select K items based on some ordering.
There are three approaches worth knowing. Sorting works but is never the answer interviewers want. Heaps are the safe pick. QuickSelect is the optimal pick. Bucket sort is the clever pick when the value range is bounded.
Approach 1: Min-Heap of Size K
Want the K largest? Maintain a min-heap of size K. For each element: if the heap has fewer than K items, push. If the current element is larger than the heap's minimum, pop the min and push the current element.
After processing all elements, the heap contains the K largest, with the Kth largest at the top.
Python
import heapq
def top_k_largest(nums, k):
heap = []
for num in nums:
heapq.heappush(heap, num)
if len(heap) > k:
heapq.heappop(heap)
return sorted(heap, reverse=True)
Python's heapq is a min-heap. By keeping the heap at size K, the smallest of the K largest elements always sits at the top — and gets evicted if something bigger comes along.
Even simpler with nlargest:
def top_k_largest(nums, k):
return heapq.nlargest(k, nums)
JavaScript
JavaScript has no built-in heap, so you either implement one or use a sorted insertion approach for small K:
function topKLargest(nums, k) {
// Simple approach: sort (O(n log n))
// For interviews, mention you'd use a heap
const sorted = [...nums].sort((a, b) => b - a);
return sorted.slice(0, k);
}
// With a proper MinHeap class:
function topKLargestHeap(nums, k) {
const heap = new MinHeap();
for (const num of nums) {
heap.push(num);
if (heap.size() > k) {
heap.pop();
}
}
return heap.toArray().sort((a, b) => b - a);
}
In a real interview, tell them you'd use a min-heap and explain the logic. Most interviewers won't ask you to implement the heap itself — but know how one works in case they do.
Why Min-Heap for Largest?
This confuses people. You want the K largest, so why a min-heap?
Because you need to quickly identify and remove the smallest of your K candidates. The min-heap's root is always the weakest element in your top-K set. When a new element arrives that's bigger, you swap out the weakest.
Conversely, for K smallest elements, use a max-heap.
Approach 2: QuickSelect
Average O(n). Based on the partition step from quicksort. Pick a pivot, partition the array so everything smaller is left and everything larger is right. If the pivot lands at position K, you're done. Otherwise, recurse into the relevant half.
Python
import random
def kth_largest(nums, k):
k = len(nums) - k # convert to kth smallest (0-indexed)
def quickselect(left, right):
pivot_idx = random.randint(left, right)
nums[pivot_idx], nums[right] = nums[right], nums[pivot_idx]
pivot = nums[right]
store = left
for i in range(left, right):
if nums[i] < pivot:
nums[i], nums[store] = nums[store], nums[i]
store += 1
nums[store], nums[right] = nums[right], nums[store]
if store == k:
return nums[store]
elif store < k:
return quickselect(store + 1, right)
else:
return quickselect(left, store - 1)
return quickselect(0, len(nums) - 1)
JavaScript
function kthLargest(nums, k) {
const target = nums.length - k;
function quickselect(left, right) {
const pivotIdx = left + Math.floor(Math.random() * (right - left + 1));
[nums[pivotIdx], nums[right]] = [nums[right], nums[pivotIdx]];
const pivot = nums[right];
let store = left;
for (let i = left; i < right; i++) {
if (nums[i] < pivot) {
[nums[i], nums[store]] = [nums[store], nums[i]];
store++;
}
}
[nums[store], nums[right]] = [nums[right], nums[store]];
if (store === target) return nums[store];
if (store < target) return quickselect(store + 1, right);
return quickselect(left, store - 1);
}
return quickselect(0, nums.length - 1);
}
Random pivot is essential. Without it, worst case is O(n^2) on sorted/nearly-sorted input. Random pivot gives O(n) expected time.
QuickSelect modifies the input array. If that's not allowed, copy first.
Approach 3: Bucket Sort (for Frequency Problems)
"Top K frequent elements" is the classic bucket sort application. The trick: create buckets indexed by frequency. Bucket i holds all elements that appear i times. Then walk the buckets from highest to lowest, collecting elements until you have K.
Python
def top_k_frequent(nums, k):
from collections import Counter
count = Counter(nums)
# buckets[i] = list of elements with frequency i
buckets = [[] for _ in range(len(nums) + 1)]
for num, freq in count.items():
buckets[freq].append(num)
result = []
for i in range(len(buckets) - 1, 0, -1):
for num in buckets[i]:
result.append(num)
if len(result) == k:
return result
return result
JavaScript
function topKFrequent(nums, k) {
const count = new Map();
for (const num of nums) {
count.set(num, (count.get(num) || 0) + 1);
}
const buckets = Array.from({ length: nums.length + 1 }, () => []);
for (const [num, freq] of count) {
buckets[freq].push(num);
}
const result = [];
for (let i = buckets.length - 1; i > 0; i--) {
for (const num of buckets[i]) {
result.push(num);
if (result.length === k) return result;
}
}
return result;
}
O(n) time, O(n) space. No sorting, no heap. The constraint that makes this work: frequencies can't exceed n, so you can use array indices as frequency values.
When to Use Which
| Approach | Time (avg) | Time (worst) | Space | Best When |
|---|---|---|---|---|
| Sort + slice | O(n log n) | O(n log n) | O(n) | K ≈ n, or simplicity matters |
| Min-heap of size K | O(n log k) | O(n log k) | O(k) | K << n, streaming data |
| QuickSelect | O(n) | O(n^2) | O(1)* | Kth element (not top K sorted) |
| Bucket sort | O(n) | O(n) | O(n) | Frequency-based, bounded range |
The Streaming Data Angle
Heaps shine when data arrives as a stream and you don't have it all upfront. QuickSelect needs the full array. Bucket sort needs the full frequency distribution. But a heap of size K can process elements one at a time, evicting the weakest at each step.
If an interviewer says "what if the data doesn't fit in memory?" the answer is heap. Process chunks, maintain a heap of K elements, and you only need O(K) memory regardless of input size.
Classic Problems
Kth Largest Element in an Array — QuickSelect for O(n) average, or heap for O(n log k). This is LeetCode 215 and probably the most-asked top-K problem. Top K Frequent Elements — Bucket sort for O(n), or heap for O(n log k). LeetCode 347. K Closest Points to Origin — Compute distances, then it's a standard top-K. Use a max-heap of size K (evict the farthest of your K closest). LeetCode 973. Sort Characters by Frequency — Count frequencies, then bucket sort. LeetCode 451. Find K Pairs with Smallest Sums — Min-heap with lazy expansion. Push the first pair, then expand by incrementing one index at a time. LeetCode 373.Common Mistakes
Using a max-heap for K largest. A max-heap gives you the overall largest first, but you'd have to extract K times — that's O(n + k log n) for heapify + extractions. A min-heap of size K is O(n log k), which is better when K << n. Forgetting the random pivot in QuickSelect. Deterministic pivot choice degrades to O(n^2) on adversarial input. Always randomize. Using sorting when they want better than O(n log n). If the interviewer asks for optimization after you sort, they want either heap or QuickSelect. Read the room. Not clarifying "top K" vs "Kth." "Find the Kth largest" needs one element. "Find the top K largest" needs K elements, possibly sorted. QuickSelect naturally solves the first; you need extra work for the second.Practice these at CodeUp — the frequency-based variants are particularly popular in interviews because they combine data structure choice with algorithmic thinking.