March 27, 2026 · 7 min read

Design a URL Shortener: System Design Interview Walkthrough

Complete system design walkthrough for a URL shortener — requirements, estimation, API design, key generation, database schema, caching, and scaling strategies.

system-design url-shortener interviews architecture backend

The URL shortener is the "Two Sum" of system design interviews — it's the problem everyone starts with, and the one most people think they understand until they're asked to go deeper. It seems trivial on the surface (just map short codes to long URLs, right?) but it touches on key generation, database design, caching, analytics, and scaling decisions that reveal how you think.

Let's walk through this the way you'd actually do it in an interview. Not a memorized answer — a conversation.

Step 1: Requirements Clarification

Before drawing a single box, ask questions. Here's what you'd establish:

Functional requirements:
  • Given a long URL, generate a short URL
  • When users access the short URL, redirect to the original
  • Support custom aliases (optional)
  • URLs can have an expiration time (optional)
  • Analytics — how many times was a link clicked (optional, but good to mention)
Non-functional requirements:
  • Redirects must be fast (sub-50ms latency)
  • High availability — if the shortener is down, every shortened link on the internet is broken
  • Short URLs should be as short as possible
  • Not guessable (if we care about private links)
Scale assumptions:
  • 100M new URLs created per day
  • 10:1 read-to-write ratio (1B redirects per day)
  • URLs stored for 5 years by default

Step 2: Back-of-the-Envelope Estimation

  • Write QPS: 100M / 86,400 seconds ≈ 1,200 writes/sec. Peak: ~3,600/sec.
  • Read QPS: 1B / 86,400 ≈ 12,000 reads/sec. Peak: ~36,000/sec.
  • Storage: each URL entry needs roughly 500 bytes (short code, long URL, metadata). 100M/day × 500 bytes = 50 GB/day. Over 5 years: ~90 TB.
  • Cache memory: if 20% of URLs generate 80% of traffic (power law), caching the top 20% of daily URLs takes 0.2 × 100M × 500 bytes = 10 GB. Fits comfortably in a single Redis instance.
  • Bandwidth: 12,000 QPS × 500 bytes ≈ 6 MB/s. Not a concern.

These numbers tell us: this is a read-heavy system with moderate storage needs. Caching will be extremely effective. We don't need an exotic database — this is solvable with well-known tools.
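As a sanity check, the arithmetic above can be reproduced in a few lines (a toy calculation using the assumed inputs from Step 1, nothing more):

```python
# Back-of-the-envelope numbers from the stated assumptions:
# 100M writes/day, 10:1 read ratio, 500 bytes/entry, 5-year retention.
SECONDS_PER_DAY = 86_400
writes_per_day = 100_000_000
reads_per_day = writes_per_day * 10
bytes_per_entry = 500

write_qps = writes_per_day / SECONDS_PER_DAY                  # ~1,157 -> "about 1,200"
read_qps = reads_per_day / SECONDS_PER_DAY                    # ~11,574 -> "about 12,000"
storage_per_day_gb = writes_per_day * bytes_per_entry / 1e9   # 50 GB/day
storage_5y_tb = storage_per_day_gb * 365 * 5 / 1000           # ~91 TB
cache_gb = 0.2 * writes_per_day * bytes_per_entry / 1e9       # 10 GB of hot keys

print(f"write QPS ≈ {write_qps:,.0f}, read QPS ≈ {read_qps:,.0f}")
print(f"storage: {storage_per_day_gb:.0f} GB/day, ~{storage_5y_tb:.0f} TB over 5 years")
print(f"hot cache: {cache_gb:.0f} GB")
```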

Step 3: API Design

Keep it simple:

POST /api/v1/shorten
Body: { "longUrl": "https://example.com/very/long/path", "customAlias": "my-link", "expiresAt": "2027-01-01" }
Response: { "shortUrl": "https://short.ly/abc123", "expiresAt": "2027-01-01" }

GET /{shortCode}
Response: HTTP 301/302 redirect to original URL

GET /api/v1/stats/{shortCode}
Response: { "totalClicks": 15234, "createdAt": "2026-03-27", "originalUrl": "..." }

301 vs 302 redirect — this comes up in every interview:
  • 301 (Moved Permanently) — browser caches the redirect. Subsequent requests go directly to the long URL, bypassing your server. Reduces server load but you lose analytics.
  • 302 (Found / Temporary) — browser doesn't cache. Every request hits your server. More load but you can track every click.
For a URL shortener with analytics, use 302. If analytics doesn't matter and you want to minimize load, use 301.
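The 301/302 decision can be isolated into one small, framework-agnostic function. This is a sketch, not any particular web framework's API; `lookup` is a hypothetical resolver that would sit in front of the cache and database:

```python
# Sketch of the redirect handler's status-code choice.
# `lookup` is a hypothetical resolver (cache + DB behind it in production).
def redirect_response(short_code, lookup, analytics_enabled=True):
    long_url = lookup(short_code)
    if long_url is None:
        return (404, {})
    # 302: browsers don't cache, so every click hits us and is trackable.
    # 301: browsers cache the hop, cutting our load but losing analytics.
    status = 302 if analytics_enabled else 301
    return (status, {"Location": long_url})

urls = {"abc123": "https://example.com/very/long/path"}
print(redirect_response("abc123", urls.get))         # 302 with Location header
print(redirect_response("abc123", urls.get, False))  # 301 with Location header
```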

Step 4: Database Schema

CREATE TABLE urls (
    id          BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_code  VARCHAR(10) UNIQUE NOT NULL,
    original_url TEXT NOT NULL,
    user_id     BIGINT,
    created_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at  TIMESTAMP,
    click_count BIGINT DEFAULT 0
);

CREATE INDEX idx_expires_at ON urls(expires_at);

The short_code is what we look up on every redirect; the UNIQUE constraint already creates the index that lookup needs, so no separate index is required. The expires_at index helps with cleanup of expired URLs.

SQL or NoSQL? Either works here. The access pattern is simple key-value lookups (short_code → original_url). A key-value store like DynamoDB would be efficient. But SQL works fine too — at 12K QPS with proper indexing and caching, PostgreSQL handles this comfortably.
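The access pattern is easy to see in a runnable miniature, with SQLite (Python's built-in database) standing in for the production store:

```python
import sqlite3

# Miniature of the schema; SQLite is only a stand-in for the production DB.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE urls (
        id           INTEGER PRIMARY KEY AUTOINCREMENT,
        short_code   TEXT UNIQUE NOT NULL,
        original_url TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO urls (short_code, original_url) VALUES (?, ?)",
             ("abc123", "https://example.com/very/long/path"))

# The hot-path query: a single indexed lookup by short_code.
row = conn.execute("SELECT original_url FROM urls WHERE short_code = ?",
                   ("abc123",)).fetchone()
print(row[0])  # https://example.com/very/long/path
```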

Step 5: Key Generation — The Interesting Part

This is where interviewers dig deep. How do you generate short, unique codes?

Approach 1: Base62 encoding of auto-increment ID
CHARSET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(num):
    if num == 0:
        return CHARSET[0]
    result = []
    while num > 0:
        result.append(CHARSET[num % 62])
        num //= 62
    return ''.join(reversed(result))

# ID 1000000 → encode_base62(1000000) → "4c92"

Pros: No collisions, short codes, simple. Codes get longer as IDs grow but 7 characters of base62 gives 62^7 = 3.5 trillion unique codes — plenty.

Cons: Codes are sequential and predictable. Someone can guess what URL ID 1000001 maps to. Also requires a centralized ID generator (single point of failure).
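For completeness, the decoder is the mirror image of the encoder above, so the redirect path can recover the numeric ID without any extra storage:

```python
CHARSET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def decode_base62(code):
    """Inverse of encode_base62: map a short code back to its numeric ID."""
    num = 0
    for ch in code:
        num = num * 62 + CHARSET.index(ch)
    return num

print(decode_base62("4c92"))  # 1000000
```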

Approach 2: Pre-generated key service (KGS)

A separate service pre-generates millions of random unique keys and stores them. When a URL needs to be shortened, the app server grabs a key from the KGS.

KGS Database:
  key        used
  "xK9mQ"    false
  "bR3nT"    false
  "wJ7pL"    true

Pros: No collision handling, no coordination needed between app servers (each server grabs a batch of keys). Keys are random and non-guessable.

Cons: Extra service to maintain. Need to handle the case where KGS runs out of keys (it won't if you generate enough, but mention it).
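A minimal sketch of the KGS idea, assuming random 7-character base62 keys and batch handout (the class and method names here are illustrative, not a real service's API):

```python
import secrets
import string

# Sketch of a key generation service: pre-generate random unique keys,
# then hand them out in batches so app servers rarely have to call back.
ALPHABET = string.digits + string.ascii_letters  # 62 characters

class KeyGenService:
    def __init__(self, n_keys=1000, key_len=7):
        self.unused = set()
        while len(self.unused) < n_keys:  # the set silently dedupes collisions
            self.unused.add("".join(secrets.choice(ALPHABET) for _ in range(key_len)))

    def take_batch(self, n):
        """Give a batch of keys to an app server; they leave the pool for good."""
        return [self.unused.pop() for _ in range(min(n, len(self.unused)))]

kgs = KeyGenService()
batch = kgs.take_batch(100)
print(len(batch), len(kgs.unused))  # 100 keys handed out, 900 remain
```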

Approach 3: Hash-based (MD5/SHA256)

Hash the long URL, take the first 7 characters of the base62-encoded hash.

Pros: Same long URL always generates the same short code (deduplication for free).

Cons: Hash collisions. Two different URLs could produce the same 7-character prefix. You need collision detection and retry logic. Also, adding a timestamp or user ID to the hash means no deduplication.
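The collision-and-retry logic is worth being able to sketch. Here a dict stands in for the database, and re-salting on collision is one possible retry strategy (an assumption of this sketch, not the only option):

```python
import hashlib

CHARSET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
existing = {}  # short_code -> long_url; a dict standing in for the DB

def hash_code(url, salt=""):
    """MD5 the URL, base62-encode the digest, keep the first 7 characters."""
    digest = int.from_bytes(hashlib.md5((url + salt).encode()).digest(), "big")
    chars = []
    while digest:
        chars.append(CHARSET[digest % 62])
        digest //= 62
    return "".join(reversed(chars))[:7]

def shorten(url, max_retries=5):
    for attempt in range(max_retries):
        code = hash_code(url, salt=str(attempt) if attempt else "")
        stored = existing.get(code)
        if stored is None or stored == url:  # slot free, or same URL (dedup)
            existing[code] = url
            return code
    raise RuntimeError("too many collisions")  # re-salt budget exhausted

code = shorten("https://example.com/very/long/path")
print(code, shorten("https://example.com/very/long/path") == code)  # dedup: same code
```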

Which to recommend? For an interview, the KGS approach is often the strongest answer because it's simple, scalable, and doesn't have the single-point-of-failure problem of auto-increment IDs. But explain the tradeoffs of all three — that's what they're looking for.

Step 6: High-Level Architecture

                    ┌──────────────┐
                    │   Clients    │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │ Load Balancer│
                    └──────┬───────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
        ┌─────▼────┐ ┌────▼─────┐ ┌───▼──────┐
        │App Server│ │App Server│ │App Server│
        └─────┬────┘ └────┬─────┘ └───┬──────┘
              │            │            │
              └────────────┼────────────┘
                           │
              ┌────────────┼────────────┐
              │                         │
        ┌─────▼────┐            ┌──────▼──────┐
        │  Redis   │            │  Database   │
        │  Cache   │            │ (Primary +  │
        └──────────┘            │  Replicas)  │
                                └──────┬──────┘
                                       │
                                ┌──────▼──────┐
                                │    Kafka    │
                                │ (Analytics) │
                                └─────────────┘
Redirect flow (the hot path):
  1. User hits GET /abc123
  2. App server checks Redis cache for abc123
  3. Cache hit → return 302 redirect immediately
  4. Cache miss → query database, populate cache, return 302
  5. Asynchronously log the click event to Kafka
Shorten flow:
  1. POST /shorten with long URL
  2. App server requests a key from the Key Generation Service
  3. Store (short_code, original_url) in database
  4. Return short URL to client
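The redirect hot path above is the classic cache-aside pattern. A toy version, with dicts standing in for Redis and the database and a list standing in for Kafka:

```python
# Cache-aside sketch of the redirect hot path. In production the cache is
# Redis, the database is sharded, and click logging is an async Kafka write.
cache = {}
database = {"abc123": "https://example.com/very/long/path"}
click_log = []  # stand-in for the Kafka analytics topic

def handle_redirect(short_code):
    long_url = cache.get(short_code)            # steps 1-2: check cache first
    if long_url is None:
        long_url = database.get(short_code)     # step 4: miss -> query the DB
        if long_url is None:
            return (404, None)
        cache[short_code] = long_url            # populate cache for next time
    click_log.append(short_code)                # step 5: async in production
    return (302, long_url)                      # steps 3-4: redirect

print(handle_redirect("abc123"))  # cache miss: DB lookup, then cached
print(handle_redirect("abc123"))  # cache hit
```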

Step 7: Scaling Strategies

  • Database sharding: shard by hash of short_code. Each shard handles a subset of the keyspace. Since every redirect lookup is by short_code, sharding is straightforward — no cross-shard queries needed.
  • Read replicas: since the read-to-write ratio is 10:1, add read replicas. Writes go to the primary, reads go to replicas. Slight replication lag is acceptable — a newly created URL being unavailable for 1-2 seconds is fine.
  • CDN for redirects: for the most popular URLs, a CDN can cache the redirect response at edge locations. This is aggressive but effective at massive scale.
  • Cache warming: pre-populate the cache with trending URLs. If a URL goes viral, the first few thousand requests shouldn't all hit the database.
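Shard selection by hash of short_code fits in a few lines (a sketch; the shard count and hash choice here are illustrative):

```python
import hashlib

# Sketch of shard routing: every short_code deterministically maps to
# exactly one shard, so a redirect never needs a cross-shard query.
NUM_SHARDS = 8

def shard_for(short_code):
    h = int.from_bytes(hashlib.sha256(short_code.encode()).digest()[:8], "big")
    return h % NUM_SHARDS

print(shard_for("abc123"))  # always the same shard for the same code
```

One caveat worth mentioning in an interview: plain modulo routing remaps most keys when NUM_SHARDS changes, which is why consistent hashing is the usual refinement.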

Step 8: Analytics Pipeline

Click analytics should never slow down the redirect. The redirect returns immediately; analytics processing happens asynchronously.

Click event → Kafka topic → Stream processor (Flink/Spark) → Analytics database (ClickHouse/BigQuery)

Each click event contains: short_code, timestamp, user_agent, IP address, referrer. The stream processor aggregates this into per-URL, per-day, per-country statistics.
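The aggregation step reduces to a group-and-count. A toy version with an assumed event shape (real pipelines run this in Flink or Spark over windows, not in-memory):

```python
from collections import Counter

# Toy version of the stream processor's aggregation: fold raw click events
# into per-URL, per-day counts. The event dict shape here is hypothetical.
events = [
    {"short_code": "abc123", "day": "2026-03-27", "country": "US"},
    {"short_code": "abc123", "day": "2026-03-27", "country": "DE"},
    {"short_code": "xK9mQbR", "day": "2026-03-27", "country": "US"},
]

per_url_day = Counter((e["short_code"], e["day"]) for e in events)
print(per_url_day[("abc123", "2026-03-27")])  # 2
```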

This is a good place to show the interviewer you understand async processing — analytics is the textbook use case for message queues.

What Interviewers Want to Hear

The URL shortener tests whether you can:

  1. Clarify requirements — not all shorteners need analytics or custom aliases
  2. Estimate scale — know your numbers, even roughly
  3. Make database decisions — and justify them
  4. Handle the key generation deep dive — this is where the real conversation happens
  5. Think about caching — the read-heavy pattern screams for a cache layer
  6. Consider edge cases — URL expiration, duplicate URLs, rate limiting on creation
The worst thing you can do is jump to the architecture diagram without establishing requirements. The second worst thing is presenting one approach to key generation without discussing alternatives.

Build your system design intuition by working through problems hands-on at CodeUp — there's no substitute for practicing the conversation flow, not just memorizing the architecture.
