March 26, 2026 · 12 min read

PostgreSQL: The Database You Should Probably Be Using

A practical PostgreSQL tutorial covering installation, CRUD, indexes, joins, JSON support, full-text search, and performance tips for developers.

postgresql database sql backend tutorial

If you're building a web application and need a database, PostgreSQL is almost certainly the right choice. It's free, battle-tested at massive scale (Instagram, Discord, Notion all use it), standards-compliant, and has features that other databases charge enterprise prices for -- JSON support, full-text search, advanced indexing, and extensions that let it handle almost anything.

MySQL is fine. SQLite is great for small projects. But Postgres is the database that grows with you, and this guide will get you productive with it fast.

Why PostgreSQL

Let me be direct about why Postgres wins for most projects:

  • Standards-compliant SQL: If you learn Postgres SQL, you know SQL. Other databases have quirks (MySQL's GROUP BY behavior, for instance) that Postgres handles correctly.
  • Data integrity: Strong type system, proper constraint enforcement, ACID transactions that actually work.
  • JSON support: Store and query JSON natively with jsonb. This gives you document-database flexibility without giving up relational integrity.
  • Full-text search: Built-in, no Elasticsearch needed for most use cases.
  • Extensions: PostGIS for geospatial, pg_trgm for fuzzy matching, pgvector for AI embeddings.
  • Free forever: No license traps, no "community edition" limitations. The full feature set is always free.

The tradeoff? Postgres requires a running server (unlike SQLite) and has more setup than a hosted Firebase. But for anything beyond a toy project, this is the right investment.

Installation

macOS

brew install postgresql@16
brew services start postgresql@16

Ubuntu/Debian

sudo apt update
sudo apt install postgresql postgresql-contrib
sudo systemctl start postgresql

Windows

Download the installer from postgresql.org/download. Run it, set a password for the postgres superuser, and accept the defaults.

Docker (Any Platform)

docker run --name pg-dev \
  -e POSTGRES_PASSWORD=devpass \
  -p 5432:5432 \
  -d postgres:16

This gives you a clean, isolated Postgres instance. When you're done, docker stop pg-dev. Want a fresh start? docker rm pg-dev and run the command again.

Verify Installation

psql -U postgres

You should see the Postgres prompt: postgres=#. (On a Homebrew install there is no postgres superuser by default -- connect as your macOS user with psql postgres.) If you're using Docker:

docker exec -it pg-dev psql -U postgres

Creating Databases and Tables

-- Create a database
CREATE DATABASE myapp;

-- Connect to it
\c myapp

-- Create a table
CREATE TABLE users (
    id          SERIAL PRIMARY KEY,
    email       VARCHAR(255) UNIQUE NOT NULL,
    name        VARCHAR(100) NOT NULL,
    created_at  TIMESTAMPTZ DEFAULT NOW(),
    is_active   BOOLEAN DEFAULT TRUE
);

CREATE TABLE posts (
    id          SERIAL PRIMARY KEY,
    user_id     INTEGER REFERENCES users(id) ON DELETE CASCADE,
    title       VARCHAR(300) NOT NULL,
    body        TEXT NOT NULL,
    published   BOOLEAN DEFAULT FALSE,
    created_at  TIMESTAMPTZ DEFAULT NOW(),
    updated_at  TIMESTAMPTZ DEFAULT NOW()
);

Key points:

  • SERIAL auto-increments (Postgres also supports GENERATED ALWAYS AS IDENTITY for newer syntax)
  • REFERENCES creates a foreign key -- database-enforced relationships
  • ON DELETE CASCADE means deleting a user deletes all their posts
  • TIMESTAMPTZ stores timezone-aware timestamps -- always use this over TIMESTAMP
  • VARCHAR(n) for bounded strings, TEXT for unbounded
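
For new schemas, the SQL-standard identity syntax looks like this -- a sketch of the same users table using it (the name users_v2 is just for illustration):

```sql
-- Identity column: the SQL-standard alternative to SERIAL
CREATE TABLE users_v2 (
    id     BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email  VARCHAR(255) UNIQUE NOT NULL,
    name   VARCHAR(100) NOT NULL
);
```

GENERATED ALWAYS also rejects manual inserts into the id column, which SERIAL does not.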


Useful psql Commands

\l          -- List databases
\c dbname   -- Connect to a database
\dt         -- List tables
\d tablename -- Describe a table (columns, types, constraints)
\di         -- List indexes
\q          -- Quit

CRUD Operations

Create (INSERT)

INSERT INTO users (email, name)
VALUES ('alice@example.com', 'Alice Johnson');

INSERT INTO users (email, name)
VALUES ('bob@example.com', 'Bob Smith');

-- Insert and return the new row
INSERT INTO users (email, name)
VALUES ('charlie@example.com', 'Charlie Brown')
RETURNING id, email, created_at;

-- Insert multiple rows
INSERT INTO posts (user_id, title, body, published)
VALUES
(1, 'Hello World', 'My first blog post.', TRUE),
(1, 'Draft Post', 'Still working on this.', FALSE),
(2, 'PostgreSQL Tips', 'Postgres is great.', TRUE);

RETURNING is a Postgres feature that gives you back the inserted data. Extremely useful -- no need for a separate SELECT after inserting.
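
INSERT also pairs well with ON CONFLICT for upserts. A sketch against the users table above:

```sql
-- Upsert: insert, or update the name if the email already exists.
-- EXCLUDED refers to the row that failed to insert.
INSERT INTO users (email, name)
VALUES ('alice@example.com', 'Alice Johnson')
ON CONFLICT (email)
DO UPDATE SET name = EXCLUDED.name
RETURNING id, name;
```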

Read (SELECT)

-- All users
SELECT * FROM users;

-- Specific columns
SELECT name, email FROM users;

-- With a condition
SELECT * FROM users WHERE is_active = TRUE;

-- Pattern matching
SELECT * FROM users WHERE email LIKE '%@example.com';

-- Ordering
SELECT * FROM posts ORDER BY created_at DESC;

-- Limiting results
SELECT * FROM posts ORDER BY created_at DESC LIMIT 10;

-- Counting
SELECT COUNT(*) FROM posts WHERE published = TRUE;

Update

UPDATE users
SET name = 'Alice J.'
WHERE email = 'alice@example.com'
RETURNING *;

-- Update multiple columns
UPDATE posts
SET published = TRUE, updated_at = NOW()
WHERE id = 2;

Always include a WHERE clause. An UPDATE without WHERE modifies every row in the table. Postgres won't stop you.
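
One way to protect yourself: run risky updates inside a transaction, inspect what came back, then commit or roll back. A sketch:

```sql
BEGIN;

UPDATE users
SET is_active = FALSE
WHERE created_at < NOW() - INTERVAL '1 year'
RETURNING id;

-- Row count looks wrong? Undo it:
ROLLBACK;
-- Looks right? Make it permanent instead:
-- COMMIT;
```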

Delete

DELETE FROM posts WHERE id = 2;

-- Delete with a condition
DELETE FROM users
WHERE is_active = FALSE AND created_at < NOW() - INTERVAL '1 year';

Same warning: DELETE without WHERE deletes everything.

Data Types Worth Knowing

Postgres has a rich type system. Here are the ones you'll actually use:

| Type | When to Use |
|------|-------------|
| INTEGER / BIGINT | Whole numbers. Use BIGINT for IDs in large tables. |
| NUMERIC(p,s) | Exact decimals (money, measurements). NUMERIC(10,2) for currency. |
| TEXT | Unbounded text. Use this instead of VARCHAR when there's no natural limit. |
| BOOLEAN | True/false. |
| TIMESTAMPTZ | Date + time + timezone. Always use this over TIMESTAMP. |
| DATE | Date only (no time). |
| UUID | Universally unique identifiers. Great for public-facing IDs. |
| JSONB | Binary JSON. Queryable, indexable. See the JSON section below. |
| ARRAY | INTEGER[], TEXT[]. Postgres supports native arrays. |
| INET | IP addresses. Supports both IPv4 and IPv6 with network operations. |

-- UUID primary key
CREATE TABLE sessions (
    id          UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    user_id     INTEGER REFERENCES users(id),
    expires_at  TIMESTAMPTZ NOT NULL
);

-- Array column
CREATE TABLE articles (
    id     SERIAL PRIMARY KEY,
    title  TEXT NOT NULL,
    tags   TEXT[] DEFAULT '{}'
);

INSERT INTO articles (title, tags)
VALUES ('Postgres Guide', ARRAY['database', 'sql', 'tutorial']);

-- Query array
SELECT * FROM articles WHERE 'sql' = ANY(tags);
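
If you filter on arrays often, a GIN index makes containment queries fast. A sketch against the articles table above (the index name is just for illustration):

```sql
-- GIN index on the array column
CREATE INDEX idx_articles_tags ON articles USING GIN (tags);

-- Containment: articles tagged with BOTH 'sql' AND 'tutorial'
SELECT title FROM articles WHERE tags @> ARRAY['sql', 'tutorial'];

-- Overlap: articles tagged with EITHER
SELECT title FROM articles WHERE tags && ARRAY['sql', 'tutorial'];
```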

Indexes: Making Queries Fast

Without indexes, Postgres scans every row in a table to answer a query (sequential scan). Indexes create a shortcut -- like a book index lets you find a topic without reading every page.

-- B-tree index (default, most common)
CREATE INDEX idx_users_email ON users(email);

-- Unique index (also enforces uniqueness)
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);

-- Multi-column index
CREATE INDEX idx_posts_user_published ON posts(user_id, published);

-- Partial index (only index a subset of rows)
CREATE INDEX idx_posts_published ON posts(created_at)
WHERE published = TRUE;

-- Expression index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

When to Create Indexes

  • Columns in WHERE clauses that filter large tables
  • Columns in JOIN conditions
  • Columns in ORDER BY on frequently-run queries
  • Foreign key columns (Postgres doesn't auto-index these)

When Not to Create Indexes

  • Tables with fewer than ~1,000 rows (sequential scan is fine)
  • Columns with very low cardinality (e.g., a boolean with 50/50 distribution)
  • Tables with heavy write traffic and infrequent reads (indexes slow down writes)

Checking if Your Index is Used

EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'alice@example.com';

EXPLAIN ANALYZE shows the actual query plan. Look for "Index Scan" (good) vs "Seq Scan" (might need an index).

Joins

Joins combine data from multiple tables. If you've used SQL before, this is review -- but Postgres handles all standard join types correctly.

-- INNER JOIN: only matching rows from both tables
SELECT u.name, p.title, p.created_at
FROM users u
INNER JOIN posts p ON u.id = p.user_id
WHERE p.published = TRUE
ORDER BY p.created_at DESC;

-- LEFT JOIN: all users, even those without posts
SELECT u.name, COUNT(p.id) AS post_count
FROM users u
LEFT JOIN posts p ON u.id = p.user_id
GROUP BY u.name
ORDER BY post_count DESC;

-- Multiple joins
SELECT u.name, p.title, c.body AS comment
FROM users u
JOIN posts p ON u.id = p.user_id
JOIN comments c ON p.id = c.post_id
WHERE p.published = TRUE;

Quick reference:

  • INNER JOIN: rows that match in both tables
  • LEFT JOIN: all rows from the left table, matched rows from the right (NULL if no match)
  • RIGHT JOIN: opposite of LEFT JOIN (rarely used; just swap table order)
  • FULL OUTER JOIN: all rows from both tables (NULLs where no match)
  • CROSS JOIN: every combination of rows (cartesian product, usually a mistake)
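
One pattern that falls out of LEFT JOIN is worth knowing: finding rows with no match (an "anti-join"). A sketch using the tables above:

```sql
-- Users who have never written a post:
-- the LEFT JOIN produces NULLs where no post matched
SELECT u.name
FROM users u
LEFT JOIN posts p ON u.id = p.user_id
WHERE p.id IS NULL;
```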


JSON Support

Postgres has two JSON types: JSON (stores raw text) and JSONB (stores binary, parsed JSON). Always use JSONB -- it's faster for queries and supports indexing.

CREATE TABLE events (
    id          SERIAL PRIMARY KEY,
    event_type  VARCHAR(50) NOT NULL,
    payload     JSONB NOT NULL,
    created_at  TIMESTAMPTZ DEFAULT NOW()
);

INSERT INTO events (event_type, payload) VALUES
('user_signup', '{"user_id": 1, "email": "alice@example.com", "source": "google"}'),
('purchase', '{"user_id": 1, "product": "Pro Plan", "amount": 29.99}'),
('page_view', '{"user_id": 2, "page": "/pricing", "referrer": "twitter.com"}');

Querying JSON

-- Extract a field (returns JSON)
SELECT payload->'email' FROM events;

-- Extract as text (returns text, usually what you want)
SELECT payload->>'email' FROM events WHERE event_type = 'user_signup';

-- Nested access
SELECT payload->'address'->>'city' FROM events;

-- Filter by JSON field
SELECT * FROM events
WHERE payload->>'user_id' = '1';

-- Check if a key exists
SELECT * FROM events
WHERE payload ? 'referrer';

-- Containment (does the JSON contain this sub-object?)
SELECT * FROM events
WHERE payload @> '{"source": "google"}';

Indexing JSONB

-- GIN index for general JSONB queries
CREATE INDEX idx_events_payload ON events USING GIN (payload);

-- This makes containment queries (@>) and key-exists (?) fast
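
JSONB columns can also be updated in place: the || operator merges objects, and jsonb_set targets a path. A sketch against the events table:

```sql
-- Merge in a new top-level key
UPDATE events
SET payload = payload || '{"processed": true}'::jsonb
WHERE event_type = 'purchase';

-- Set a nested value by path (note: the parent object must
-- already exist; only the final key is created if missing)
UPDATE events
SET payload = jsonb_set(payload, '{meta,retries}', '3'::jsonb, true)
WHERE id = 1;
```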

JSONB gives you 80% of the flexibility of MongoDB while keeping all the benefits of a relational database. Use it for semi-structured data: user preferences, event metadata, API responses.

Full-Text Search

Postgres includes a full-text search engine. For many applications, this is all you need -- no Elasticsearch required.

-- Add a search column
ALTER TABLE posts ADD COLUMN search_vector TSVECTOR;

-- Populate it
UPDATE posts
SET search_vector = to_tsvector('english', title || ' ' || body);

-- Create an index
CREATE INDEX idx_posts_search ON posts USING GIN (search_vector);

-- Search
SELECT title, ts_rank(search_vector, query) AS rank
FROM posts, to_tsquery('english', 'postgresql & tips') AS query
WHERE search_vector @@ query
ORDER BY rank DESC;
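
Writing to_tsquery syntax (&, |) by hand is awkward for user-supplied input. Postgres 11+ ships websearch_to_tsquery, which accepts search-engine-style queries; a sketch:

```sql
-- Accepts quoted phrases, OR, and -exclusions, like a web search box
SELECT title
FROM posts
WHERE search_vector @@ websearch_to_tsquery('english', '"postgresql tips" -draft');
```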

Auto-Update with a Trigger

CREATE FUNCTION update_search_vector() RETURNS TRIGGER AS $$
BEGIN
    NEW.search_vector :=
        to_tsvector('english', COALESCE(NEW.title, '') || ' ' ||
                                COALESCE(NEW.body, ''));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER posts_search_update
BEFORE INSERT OR UPDATE ON posts
FOR EACH ROW EXECUTE FUNCTION update_search_vector();

Now the search vector updates automatically whenever a post is created or modified.

What Postgres FTS Handles

  • Stemming ("running" matches "run")
  • Stop words (ignores "the", "is", "and")
  • Ranking by relevance
  • Phrase matching
  • Boolean operators (AND, OR, NOT)
  • Language-aware processing (English, Spanish, German, etc.)

What it doesn't handle as well: typo tolerance, fuzzy matching (use the pg_trgm extension for this), and relevance tuning at scale. If you need those, consider Elasticsearch or Meilisearch.
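
For typo tolerance, the pg_trgm extension mentioned above compares strings by trigram overlap. A minimal sketch (the index name is just for illustration):

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- similarity() returns 0..1; % is the "similar enough" operator
SELECT title, similarity(title, 'postgre tps') AS score
FROM posts
WHERE title % 'postgre tps'
ORDER BY score DESC;

-- A trigram GIN index makes this fast on large tables
CREATE INDEX idx_posts_title_trgm ON posts USING GIN (title gin_trgm_ops);
```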

Performance Tips

1. Use EXPLAIN ANALYZE

EXPLAIN ANALYZE
SELECT u.name, COUNT(p.id)
FROM users u
LEFT JOIN posts p ON u.id = p.user_id
GROUP BY u.name;

Read the output bottom-up. Look for:

  • Sequential scans on large tables (might need an index)
  • A large gap between estimated and actual row counts (outdated statistics; run ANALYZE)
  • Nested loops on large result sets (might need a different join strategy)


2. Connection Pooling

Postgres connections are expensive (each one forks a process). Don't open a new connection per request. Use a connection pooler:

  • PgBouncer: lightweight, standalone pooler
  • Built into your ORM: SQLAlchemy, Prisma, and most ORMs handle pooling

# SQLAlchemy with connection pooling
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://user:pass@localhost/myapp",
    pool_size=20,
    max_overflow=10,
)

3. Batch Operations

-- Instead of 1000 individual INSERTs:
INSERT INTO logs (level, message)
VALUES
    ('INFO', 'Server started'),
    ('WARNING', 'High memory usage'),
    ('ERROR', 'Connection timeout');
-- Batch inserts are 10-100x faster
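
For truly large loads (millions of rows), COPY beats even batched INSERTs. A sketch loading a CSV into the logs table (the file paths are hypothetical):

```sql
-- Server-side COPY: the file must be readable by the Postgres server
COPY logs (level, message) FROM '/tmp/logs.csv' WITH (FORMAT csv, HEADER true);

-- Client-side \copy in psql: reads the file from your machine instead
\copy logs (level, message) FROM 'logs.csv' WITH (FORMAT csv, HEADER true)
```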

4. Vacuum and Analyze

Postgres uses MVCC (Multi-Version Concurrency Control), which means deleted and updated rows leave behind "dead tuples." VACUUM cleans them up, and ANALYZE updates statistics for the query planner.

-- Usually handled automatically by autovacuum, but you can run manually:
VACUUM ANALYZE;

If your queries suddenly get slow, outdated statistics are a common cause. Run ANALYZE tablename.

5. Use Appropriate Types

  • Store IPs as INET, not TEXT -- Postgres can do network operations on them
  • Store UUIDs as UUID, not VARCHAR(36) -- 16 bytes vs 37 bytes, and proper comparison operators
  • Store money as NUMERIC(10,2), not FLOAT -- floating point has rounding errors
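
The floating-point pitfall is easy to demonstrate in psql; a quick sketch:

```sql
-- FLOAT: binary rounding means 0.1 + 0.2 is not exactly 0.3
SELECT (0.1::float8 + 0.2::float8) = 0.3::float8 AS floats_equal;     -- false

-- NUMERIC: exact decimal arithmetic
SELECT (0.1::numeric + 0.2::numeric) = 0.3::numeric AS numerics_equal; -- true
```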

Essential psql Tricks

# Connect to a database
psql -U postgres -d myapp

# Run a SQL file
psql -U postgres -d myapp -f schema.sql

# Run a single command
psql -U postgres -d myapp -c "SELECT COUNT(*) FROM users;"

# Pretty output
\x auto          -- Expanded display for wide tables
\timing on       -- Show query execution time

Backup and Restore

# Backup a database
pg_dump -U postgres myapp > backup.sql

# Backup in compressed format
pg_dump -U postgres -Fc myapp > backup.dump

# Restore from SQL
psql -U postgres myapp < backup.sql

# Restore from compressed dump
pg_restore -U postgres -d myapp backup.dump

Always test your backups. An untested backup is not a backup.

Where to Go Next

  1. Transactions: BEGIN, COMMIT, ROLLBACK -- essential for data consistency
  2. CTEs and Window Functions: powerful query features for analytics
  3. Materialized Views: precomputed query results for expensive aggregations
  4. Row-Level Security: control data access at the database level
  5. Logical Replication: replicate data to read replicas for scaling reads
  6. Extensions: pg_trgm (fuzzy matching), PostGIS (geospatial), pgvector (AI embeddings)
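
As a taste of item 2, here is a sketch combining a CTE with a window function to find each user's most recent post (the CTE name ranked is just for illustration):

```sql
-- CTE + window function: rank each user's posts, newest first
WITH ranked AS (
    SELECT user_id, title, created_at,
           ROW_NUMBER() OVER (
               PARTITION BY user_id
               ORDER BY created_at DESC
           ) AS rn
    FROM posts
)
SELECT * FROM ranked WHERE rn = 1;  -- the latest post per user
```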

Postgres is deep. You can spend years learning its features and still find new things. But the fundamentals in this guide will handle 90% of what you need to build production applications.

For more database tutorials, backend guides, and programming content, check out CodeUp.
