March 27, 202611 min read

A* Search: The Pathfinding Algorithm Behind Every Game and GPS

Learn the A* search algorithm step by step — heuristics, implementation in Python, grid pathfinding, optimality proofs, and comparison to Dijkstra and BFS.

a-star pathfinding algorithms graph game-development

Every time you ask Google Maps for directions, every time a game character navigates around obstacles, every time a robot plans a path through a warehouse — there's a good chance A* (pronounced "A-star") is doing the work. It's one of the most widely used algorithms in computer science, and once you understand it, you'll see pathfinding problems everywhere.

A is a graph search algorithm that finds the shortest path between a start node and a goal node. What makes it special is that it's smart about where it looks — instead of exploring every possible direction equally (like BFS) or always going to the nearest unvisited node (like Dijkstra's), A uses a heuristic to estimate which direction is most promising and explores that direction first.

The result: A* finds the optimal path while exploring far fewer nodes than its alternatives.

Graph Search Recap

Before diving into A*, let's recall what we're working with.

A graph is a set of nodes connected by edges, where each edge may have a cost (weight). Pathfinding means finding the lowest-cost path from a start node to a goal node.

BFS (Breadth-First Search) explores nodes level by level outward from the start. It finds the shortest path in unweighted graphs but doesn't consider edge weights. In a weighted graph, it might find a suboptimal path. Dijkstra's algorithm handles weighted graphs by always expanding the node with the lowest known cost from the start. It's optimal but it explores in every direction equally — it doesn't know where the goal is, so it radiates outward like a ripple in a pond. A* improves on Dijkstra by adding a heuristic: an estimate of the remaining distance to the goal. This lets it focus its search toward the goal, drastically reducing the number of nodes explored.

The Core Idea: f(n) = g(n) + h(n)

A* assigns a priority to each node using three values:

g(n) — the actual cost from the start to node n (known exactly)
h(n) — the estimated cost from node n to the goal (the heuristic)
f(n) = g(n) + h(n) — the estimated total cost of the path through n

At each step, A* expands the node with the lowest f(n). That's it. The entire algorithm is "always expand the most promising node, where 'promising' means lowest estimated total cost."

The heuristic h(n) is what makes A powerful. If the heuristic is good (close to the actual distance but never overestimates), A finds the optimal path while exploring a small fraction of the graph.

The Algorithm Step by Step

1. Create an open set (priority queue) and add the start node with f = h(start)

Create a closed set (empty)
While the open set is not empty:
   a. Pop the node with the lowest f value — call it 'current'
   b. If current is the goal, reconstruct and return the path
   c. Add current to the closed set
   d. For each neighbor of current:
      - If neighbor is in the closed set, skip it
      - Calculate tentative g = g(current) + cost(current, neighbor)
      - If neighbor is not in the open set, or tentative g is lower than neighbor's current g:
        - Update neighbor's g to tentative g
        - Update neighbor's f = g + h(neighbor)
        - Set neighbor's parent to current
        - Add neighbor to the open set (if not already there)

If the open set is empty and we never reached the goal, no path exists

The closed set prevents re-exploring nodes. The parent pointers let us reconstruct the path once we reach the goal.

Heuristic Functions

The heuristic is the heart of A. It must be admissible (never overestimates the true cost) for A to find the optimal path.

For grid-based pathfinding, the two most common heuristics are:

Manhattan distance — used when movement is restricted to 4 directions (up, down, left, right):

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

Euclidean distance — used when movement can be in any direction (including diagonals):

import math

def euclidean(a, b):
    return math.sqrt((a[0] - b[0])  2 + (a[1] - b[1])  2)

Chebyshev distance — used when diagonal movement costs the same as cardinal movement:

def chebyshev(a, b):
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

Why admissibility matters: If the heuristic overestimates, A* might skip the actual optimal path because it looks more expensive than it really is. The algorithm would expand a suboptimal path first, thinking it's cheaper. An admissible heuristic guarantees this can't happen. A more accurate heuristic is better (fewer nodes explored), as long as it stays admissible. If h(n) = 0 for all nodes, A degrades to Dijkstra's algorithm (no heuristic guidance). If h(n) equals the exact remaining cost, A goes straight to the goal without exploring any unnecessary nodes.

Implementation in Python

import heapq

def a_star(grid, start, goal):
    """
    A* pathfinding on a 2D grid.
    grid[row][col] = 0 means walkable, 1 means wall.
    start and goal are (row, col) tuples.
    Returns the path as a list of (row, col) positions, or None if no path exists.
    """
    rows, cols = len(grid), len(grid[0])

# 4-directional movement
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def heuristic(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])  # Manhattan distance

# Priority queue: (f_score, counter, node)
    # counter breaks ties to avoid comparing tuples
    counter = 0
    open_set = [(heuristic(start, goal), counter, start)]
    came_from = {}
    g_score = {start: 0}
    closed_set = set()

while open_set:
        f, _, current = heapq.heappop(open_set)

if current == goal:
            # Reconstruct path
            path = []
            while current in came_from:
                path.append(current)
                current = came_from[current]
            path.append(start)
            return path[::-1]

if current in closed_set:
            continue
        closed_set.add(current)

for dr, dc in directions:
            neighbor = (current[0] + dr, current[1] + dc)

# Bounds check
            if not (0 <= neighbor[0] < rows and 0 <= neighbor[1] < cols):
                continue
            # Wall check
            if grid[neighbor[0]][neighbor[1]] == 1:
                continue
            # Already processed
            if neighbor in closed_set:
                continue

tentative_g = g_score[current] + 1  # cost of 1 per step

if tentative_g < g_score.get(neighbor, float('inf')):
                came_from[neighbor] = current
                g_score[neighbor] = tentative_g
                f_score = tentative_g + heuristic(neighbor, goal)
                counter += 1
                heapq.heappush(open_set, (f_score, counter, neighbor))

return None  # No path found

# Example usage
grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

path = a_star(grid, (0, 0), (4, 4))
print(path)
# [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (3, 2), (3, 3), (4, 3), (4, 4)]

A few implementation details worth noting:

The counter for tie-breaking. When two nodes have the same f score, Python's heapq would try to compare the tuples, which fails if the nodes aren't comparable. The counter ensures a unique ordering. Checking closed_set after popping. A node might be added to the open set multiple times (with different g scores). We only process it when it's first popped (which will be with the best g score) and skip it on subsequent pops. g_score dictionary. We use a dictionary instead of a 2D array for flexibility, but in performance-critical code, a 2D array is faster.

Visualizing A* on a Grid

Here's what A* looks like compared to Dijkstra on a grid with an obstacle:

Grid (S = start, G = goal, # = wall):

S . . . .
. # # # .
. . . # .
. # . . .
. . . . G

Dijkstra explores (o = explored):
S o o o o
o # # # o
o o o # o
o # o o o
o o o o G
→ Explores 17 cells

A* with Manhattan heuristic explores:
S o . . .
o # # # .
o o o # .
. # o o o
. . . o G
→ Explores 11 cells

A doesn't explore cells that are in the wrong direction. Dijkstra radiates outward uniformly. A is biased toward the goal.

The difference becomes dramatic on large grids. On a 1000x1000 grid with a clear path, Dijkstra might explore hundreds of thousands of cells while A* explores a fraction of that.

Optimality Guarantees

A* is optimal (finds the shortest path) if:

The heuristic is admissible — never overestimates the true cost

The heuristic is consistent (monotone) — for every node n and neighbor n', h(n) <= cost(n, n') + h(n')

Consistency implies admissibility, so if your heuristic is consistent, you're guaranteed optimality. Manhattan distance and Euclidean distance are both consistent for grid pathfinding.

If the heuristic is not admissible (it overestimates), A becomes Weighted A or a greedy search — it might find a path faster but it won't be optimal. This is sometimes desirable in games where a "good enough" path found quickly is better than the optimal path found slowly.

Comparison: A* vs Dijkstra vs BFS

Feature	BFS	Dijkstra	A*
Handles weights?	No	Yes	Yes
Uses heuristic?	No	No	Yes
Optimal?	Yes (unweighted)	Yes	Yes (admissible h)
Time complexity	O(V + E)	O((V + E) log V)	O((V + E) log V)
Nodes explored	Many	Many	Few
Best for	Unweighted graphs	Weighted, no goal info	Weighted, known goal

Use BFS when the graph is unweighted and you want the shortest path. Use Dijkstra when the graph has weights but you don't have a useful heuristic, or you need shortest paths to all nodes (not just one goal). Use A* when you have a weighted graph, a specific goal, and a good heuristic. This is most pathfinding scenarios: grid maps, road networks, game worlds.

Weighted A* and Variations

Sometimes you want A to run faster at the cost of optimality. Weighted A multiplies the heuristic by a factor w > 1:

f_score = tentative_g + w * heuristic(neighbor, goal)

With w = 1, you get standard A*. With w = 2, the algorithm is more aggressive toward the goal — it explores fewer nodes but the path might be up to 2x the optimal length. For real-time games where frame time matters, w = 1.5 or w = 2 is a common trade-off.

Other variations:

IDA (Iterative Deepening A) — uses DFS with an increasing f-score threshold. Uses O(depth) memory instead of O(nodes) but re-explores nodes. Useful when memory is limited.

D (Dynamic A) — replans efficiently when the graph changes. Used in robotics where obstacles appear or disappear.

Jump Point Search — optimized A for uniform-cost grids. Exploits grid symmetry to skip large regions, often 10-20x faster than standard A on grid maps.

Real-World Applications

Game AI pathfinding. Every modern game that has characters moving around a map uses A or a variant. Unity, Unreal Engine, and Godot all implement A-based navigation systems. The navmesh (navigation mesh) converts complex 3D environments into graphs that A* can search. GPS and routing. Google Maps, Apple Maps, and Waze use algorithms closely related to A (often ALT or Contraction Hierarchies, which are A with preprocessing) to find routes in road networks with millions of nodes. Robotics. Autonomous robots use A (often D) for path planning in both known and unknown environments. The heuristic is based on straight-line distance to the goal. Puzzle solving. The 8-puzzle, 15-puzzle, and Rubik's Cube can be modeled as graph search problems where A* with appropriate heuristics finds optimal solutions.

Common Mistakes

Using an inadmissible heuristic and expecting optimal results. If your heuristic overestimates, A* won't find the shortest path. Double-check that your heuristic never returns a value higher than the actual remaining cost. Not handling ties properly. When two nodes have the same f score, the tie-breaking strategy affects performance. Breaking ties in favor of higher g values (closer to the goal) tends to perform better. Forgetting diagonal cost. If your grid allows diagonal movement, diagonal steps should cost sqrt(2), not 1. Using Manhattan distance as the heuristic for 8-directional movement overestimates and breaks admissibility. Not using the right data structure. A* requires an efficient priority queue. Use a binary heap (Python's heapq) at minimum. For very large graphs, a Fibonacci heap gives better theoretical performance for decrease-key operations, though in practice binary heaps are usually faster due to cache locality.

What's Next

A is one of those algorithms that rewards deep understanding. Once you've implemented basic A on grids, try it on:

Weighted graphs (road networks)

Games with varying terrain costs

Puzzle problems (8-puzzle with Manhattan distance heuristic)

Dynamic environments where the graph changes

Then explore Jump Point Search for grid optimization, and Contraction Hierarchies for massive road networks. These build on A*'s foundation and handle real-world scale.

For more algorithm tutorials, data structure guides, and coding problem walkthroughs, check out CodeUp.