Database Migrations — Manage Schema Changes Without Breaking Production
This guide covers why migrations matter; migration tools like Prisma Migrate, Alembic, Knex, and Flyway; writing safe migrations; rollback strategies; and zero-downtime migration techniques.
At some point every developer runs ALTER TABLE directly on a production database. Once. After the outage, they learn about migrations.
Database migrations are version-controlled files that describe schema changes. They run in order, they can be rolled back, and they work the same way in development, staging, and production. They're the git of database schemas.
Why Migrations Matter
Without migrations, schema changes happen through:
- Someone SSH-ing into production and running raw SQL
- A README with "run these queries before deploying"
- A shared SQL file that nobody remembers to update
- The senior dev who's the only one who knows the current schema
All of these break down at scale. Migrations fix this by making schema changes:
- Reproducible: Same migration runs identically everywhere
- Ordered: Migration 003 always runs after 002
- Reversible: Rollbacks are defined alongside the change
- Auditable: Git history shows who changed what and when
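These properties all fall out of one simple mechanism: the tool records applied migrations in a tracking table and runs the pending ones in order. A minimal, illustrative runner — not any real tool's implementation, and the `schema_migrations` table name varies by tool — might look like this:

```python
import sqlite3

# Minimal illustrative migration runner. Real tools add locking,
# checksums, and rollbacks. Uses an in-memory SQLite database.
MIGRATIONS = {
    "001_create_users": "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)",
    "002_add_users_name": "ALTER TABLE users ADD COLUMN name TEXT",
}

def run_migrations(conn):
    conn.execute("CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT name FROM schema_migrations")}
    for name in sorted(MIGRATIONS):  # ordered: 002 never runs before 001
        if name not in applied:      # already-applied migrations are skipped
            conn.execute(MIGRATIONS[name])
            conn.execute("INSERT INTO schema_migrations (name) VALUES (?)", (name,))

conn = sqlite3.connect(":memory:")
run_migrations(conn)
run_migrations(conn)  # safe to run twice: the second pass applies nothing
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
print(columns)  # ['id', 'email', 'name']
```

Running the runner twice shows why the tracking table matters: without it, migration 002 would fail on the second pass trying to add a column that already exists.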
Migration Tools by Ecosystem
| Tool | Language | Database Support | Approach |
|---|---|---|---|
| Prisma Migrate | TypeScript/JS | PostgreSQL, MySQL, SQLite, SQL Server | Schema-diff based |
| Alembic | Python | Any SQLAlchemy-supported DB | Script-based |
| Knex.js | JavaScript | PostgreSQL, MySQL, SQLite, MSSQL | Script-based |
| Flyway | Java (CLI for any) | 20+ databases | SQL or Java-based |
| golang-migrate | Go | PostgreSQL, MySQL, and more | SQL-based |
| Rails ActiveRecord | Ruby | PostgreSQL, MySQL, SQLite | DSL-based |
| Django Migrations | Python | PostgreSQL, MySQL, SQLite, Oracle | Auto-generated |
Prisma Migrate (TypeScript/Node.js)
Prisma takes a schema-first approach. You define your schema, Prisma generates the SQL migration.
Define Schema
// prisma/schema.prisma
datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

generator client {
  provider = "prisma-client-js"
}

model User {
  id        Int      @id @default(autoincrement())
  email     String   @unique
  name      String
  posts     Post[]
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt
}

model Post {
  id        Int      @id @default(autoincrement())
  title     String
  content   String?
  published Boolean  @default(false)
  author    User     @relation(fields: [authorId], references: [id])
  authorId  Int
  createdAt DateTime @default(now())
}
Create and Apply Migrations
# Generate migration from schema changes
npx prisma migrate dev --name add_user_posts
# This creates:
# prisma/migrations/20260326120000_add_user_posts/migration.sql
# Apply to production (no interactive prompts)
npx prisma migrate deploy
Generated SQL
-- prisma/migrations/20260326120000_add_user_posts/migration.sql
CREATE TABLE "User" (
    "id" SERIAL NOT NULL,
    "email" TEXT NOT NULL,
    "name" TEXT NOT NULL,
    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
    "updatedAt" TIMESTAMP(3) NOT NULL,
    CONSTRAINT "User_pkey" PRIMARY KEY ("id")
);

CREATE TABLE "Post" (
    "id" SERIAL NOT NULL,
    "title" TEXT NOT NULL,
    "content" TEXT,
    "published" BOOLEAN NOT NULL DEFAULT false,
    "authorId" INTEGER NOT NULL,
    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT "Post_pkey" PRIMARY KEY ("id")
);

CREATE UNIQUE INDEX "User_email_key" ON "User"("email");

ALTER TABLE "Post" ADD CONSTRAINT "Post_authorId_fkey"
    FOREIGN KEY ("authorId") REFERENCES "User"("id") ON DELETE RESTRICT ON UPDATE CASCADE;
Adding a Column Later
Change the schema:
model User {
  id        Int      @id @default(autoincrement())
  email     String   @unique
  name      String
  bio       String?  // NEW -- nullable so existing rows aren't broken
  posts     Post[]
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt
}
npx prisma migrate dev --name add_user_bio
Generated migration:
ALTER TABLE "User" ADD COLUMN "bio" TEXT;
Alembic (Python/SQLAlchemy)
Alembic is the standard for Python. It generates migration scripts you can customize.
Setup
pip install alembic sqlalchemy
alembic init alembic
Configuration
# alembic/env.py
from models import Base # Your SQLAlchemy models
target_metadata = Base.metadata
Define Models
# models.py
from sqlalchemy import Column, Integer, String, DateTime, ForeignKey, Boolean, Text
from sqlalchemy.orm import relationship, DeclarativeBase
from datetime import datetime

class Base(DeclarativeBase):
    pass

class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    email = Column(String(255), unique=True, nullable=False)
    name = Column(String(255), nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow)
    posts = relationship("Post", back_populates="author")

class Post(Base):
    __tablename__ = "posts"

    id = Column(Integer, primary_key=True)
    title = Column(String(500), nullable=False)
    content = Column(Text)
    published = Column(Boolean, default=False)
    author_id = Column(Integer, ForeignKey("users.id"), nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow)
    author = relationship("User", back_populates="posts")
Generate and Apply Migrations
# Auto-generate from model changes
alembic revision --autogenerate -m "add user and post tables"
# Apply migrations
alembic upgrade head
# Rollback one step
alembic downgrade -1
# See current version
alembic current
# See migration history
alembic history
Generated Migration
# alembic/versions/abc123_add_user_and_post_tables.py
"""add user and post tables"""

# revision identifiers, used by Alembic
revision = "abc123"
down_revision = None

from alembic import op
import sqlalchemy as sa

def upgrade():
    op.create_table(
        "users",
        sa.Column("id", sa.Integer(), primary_key=True),
        sa.Column("email", sa.String(255), nullable=False),
        sa.Column("name", sa.String(255), nullable=False),
        sa.Column("created_at", sa.DateTime()),
    )
    op.create_index("ix_users_email", "users", ["email"], unique=True)
    op.create_table(
        "posts",
        sa.Column("id", sa.Integer(), primary_key=True),
        sa.Column("title", sa.String(500), nullable=False),
        sa.Column("content", sa.Text()),
        sa.Column("published", sa.Boolean(), server_default="false"),
        sa.Column("author_id", sa.Integer(), sa.ForeignKey("users.id"), nullable=False),
        sa.Column("created_at", sa.DateTime()),
    )

def downgrade():
    op.drop_table("posts")
    op.drop_table("users")
Knex.js Migrations
Knex uses a JavaScript-based approach with explicit up/down functions.
npx knex migrate:make add_users_table
// migrations/20260326_add_users_table.js
exports.up = function (knex) {
  return knex.schema
    .createTable("users", (table) => {
      table.increments("id").primary();
      table.string("email", 255).notNullable().unique();
      table.string("name", 255).notNullable();
      table.text("bio");
      table.timestamps(true, true); // created_at, updated_at
    })
    .createTable("posts", (table) => {
      table.increments("id").primary();
      table.string("title", 500).notNullable();
      table.text("content");
      table.boolean("published").defaultTo(false);
      table.integer("author_id").unsigned().notNullable();
      table.foreign("author_id").references("users.id").onDelete("CASCADE");
      table.timestamp("created_at").defaultTo(knex.fn.now());
    });
};

exports.down = function (knex) {
  return knex.schema.dropTable("posts").dropTable("users");
};
# Run migrations
npx knex migrate:latest
# Rollback last batch
npx knex migrate:rollback
Writing Safe Migrations
Not all migrations are safe to run on a live database. Some lock tables, some cause downtime, some corrupt data.
Safe Operations (Usually No Downtime)
| Operation | Why It's Safe |
|---|---|
| Add nullable column | No default needed, existing rows unaffected |
| Add index CONCURRENTLY (Postgres) | Doesn't lock the table |
| Create new table | No existing data affected |
| Add column with default (Postgres 11+) | Metadata-only change |
Dangerous Operations
| Operation | Risk | Mitigation |
|---|---|---|
| Add NOT NULL column without default | Fails if table has data | Add nullable first, backfill, then set NOT NULL |
| Drop column | Application code may still reference it | Deploy code change first, drop column later |
| Rename column | Breaks all queries using old name | Add new column, migrate data, drop old |
| Add index (non-concurrent) | Locks table for writes | Use CREATE INDEX CONCURRENTLY |
| Change column type | May fail or lose data | Add new column, migrate, drop old |
| Drop table | Irreversible data loss | Rename first, drop after confirmation period |
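The NOT NULL mitigation from the table deserves a concrete sketch. The demo below uses an in-memory SQLite database with hypothetical `users`/`tier` names; in Postgres the final step would be `ALTER TABLE users ALTER COLUMN tier SET NOT NULL`, which SQLite doesn't support, so it appears here only as a comment:

```python
import sqlite3

# Sketch of "add nullable, backfill, then constrain" (names are illustrative)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com'), ('b@example.com')")

# Step 1: add the column as nullable -- existing rows are untouched
conn.execute("ALTER TABLE users ADD COLUMN tier TEXT")

# Step 2: backfill existing rows
conn.execute("UPDATE users SET tier = 'free' WHERE tier IS NULL")

# Step 3 (Postgres, shown as a comment -- SQLite can't add the constraint):
# ALTER TABLE users ALTER COLUMN tier SET NOT NULL;

remaining = conn.execute("SELECT COUNT(*) FROM users WHERE tier IS NULL").fetchone()[0]
print(remaining)  # 0 -- every row has a value, so the constraint is now safe to add
```

Attempting the single-step version (`ADD COLUMN tier TEXT NOT NULL` with no default) would fail on any table that already has rows.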
The Expand-Contract Pattern
For any breaking schema change, use a multi-step approach:
Step 1 -- Expand: Add the new structure alongside the old one.
-- Renaming "name" to "full_name"
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
UPDATE users SET full_name = name;
Step 2 -- Migrate: Update application code to write to both columns, read from new.
# Write to both during transition
user.name = data.full_name # Old column
user.full_name = data.full_name # New column
Step 3 -- Contract: After all code is deployed and reading from the new column, drop the old.
ALTER TABLE users DROP COLUMN name;
This pattern ensures zero downtime because at no point is the application using a column that doesn't exist.
Rollback Strategies
Strategy 1: Down Migrations
Every migration has an up (apply) and down (rollback). This works for simple changes but breaks down for data migrations.
# Alembic -- easy to roll back
def upgrade():
    op.add_column("users", sa.Column("bio", sa.Text()))

def downgrade():
    op.drop_column("users", "bio")
But what about this?
def upgrade():
    op.add_column("users", sa.Column("full_name", sa.String(255)))
    # Merge first_name + last_name into full_name
    op.execute("UPDATE users SET full_name = first_name || ' ' || last_name")
    op.drop_column("users", "first_name")
    op.drop_column("users", "last_name")

def downgrade():
    # How do you split "John Smith" back into first/last?
    # What about "Mary Jane Watson"?
    # This is lossy -- there's no reliable downgrade
    pass
Strategy 2: Forward-Only Migrations
Some teams skip down migrations entirely. If something goes wrong, they write a new migration to fix it ("fix forward"). This is more realistic for complex data transformations, where a true rollback is impossible or risky.
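A toy sketch of fixing forward, using SQLite and made-up table names and values: suppose migration 002 backfilled the wrong default, and migration 003 corrects it instead of rolling 002 back.

```python
import sqlite3

# Forward-only fix: a new migration corrects the mistake, nothing is reverted
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, plan TEXT)")
conn.execute("INSERT INTO users (plan) VALUES (NULL), (NULL)")

# Migration 002 (the mistake): backfilled the wrong default
conn.execute("UPDATE users SET plan = 'premium' WHERE plan IS NULL")

# Migration 003 (the fix): ships like any other migration
conn.execute("UPDATE users SET plan = 'free' WHERE plan = 'premium'")

plans = [row[0] for row in conn.execute("SELECT plan FROM users")]
print(plans)  # ['free', 'free']
```

The migration history stays linear and auditable: anyone reading it sees both the mistake and its correction.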
Strategy 3: Database Snapshots
Before running migrations on production, take a snapshot/backup. If the migration causes problems, restore the snapshot. This is the nuclear option but it's the most reliable.
# Postgres
pg_dump -Fc mydb > pre_migration_backup.dump
# Restore if needed (--clean --if-exists drops existing objects before recreating them)
pg_restore --clean --if-exists -d mydb pre_migration_backup.dump
Migration Workflow in CI/CD
# GitHub Actions example
deploy:
  steps:
    - name: Run migrations
      run: npx prisma migrate deploy
      env:
        DATABASE_URL: ${{ secrets.DATABASE_URL }}
    - name: Deploy application
      run: ./deploy.sh
Run migrations before deploying new code if the new code depends on new schema. Run migrations after deploying new code if the old schema still works with the new code. The expand-contract pattern makes this ordering less critical.
Common Pitfalls
1. Not testing migrations on a copy of production data. A migration that works on your 100-row dev database might lock a 10-million-row production table for 20 minutes.
2. Long-running migrations without progress tracking. If a data migration processes millions of rows, do it in batches. Note that Postgres has no UPDATE ... LIMIT, so the batch is constrained through a subquery:
from alembic import op
from sqlalchemy import text

def upgrade():
    conn = op.get_bind()
    batch_size = 1000
    while True:
        result = conn.execute(
            text("""
                UPDATE users SET full_name = first_name || ' ' || last_name
                WHERE id IN (
                    SELECT id FROM users
                    WHERE full_name IS NULL
                    LIMIT :batch_size
                )
            """),
            {"batch_size": batch_size},
        )
        if result.rowcount == 0:
            break
3. Forgetting to add indexes. Every foreign key should have an index, and so should every column you regularly filter or sort by. Missing indexes don't break anything during development but cause catastrophic slowdowns in production.
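As a quick illustration (hypothetical table and index names, SQLite standing in for a production database), you can check that the query planner actually uses the index you added for a foreign-key lookup:

```python
import sqlite3

# Indexing a foreign key column, then inspecting the query plan
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES users(id)
    );
    -- Without this, "all posts by user X" is a full table scan:
    CREATE INDEX idx_posts_author_id ON posts(author_id);
""")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM posts WHERE author_id = 1"
).fetchall()
print(plan)  # a SEARCH using idx_posts_author_id rather than a full SCAN
```

Postgres offers the same check via `EXPLAIN`; running it against production-sized data is the only reliable way to catch a missing index before users do.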
4. Running migrations as the app starts. Don't embed migration execution in your application boot sequence. Run migrations as a separate step in your deployment pipeline. If the migration fails, you don't want it retried on every pod restart in a Kubernetes deployment.
5. Editing existing migration files. Once a migration has been applied anywhere (staging, production, a teammate's laptop), it's immutable. Create a new migration to fix issues.
The CodeUp tutorials at codeup.dev cover database design fundamentals and query optimization, which pair well with these migration patterns. Getting the schema right upfront means fewer painful migrations later.