March 27, 20268 min read

SQL Complete Guide: From SELECT to Window Functions

Everything you need to know about SQL in one place — basics, joins, subqueries, CTEs, window functions, indexing, and performance. With PostgreSQL examples throughout.

sql postgresql database guide backend

SQL has been around since the 1970s. It'll be around long after whatever JavaScript framework you're using this week is forgotten. Every serious application stores data in a relational database at some point, and SQL is how you talk to it. This guide covers everything from basic queries to the advanced stuff that separates junior developers from senior ones.

All examples use PostgreSQL, but 95% of what's here works in MySQL, SQLite, and SQL Server with minor syntax differences.

The Basics — SELECT, WHERE, ORDER BY

-- Get everything from a table
SELECT * FROM users;

-- Get specific columns
SELECT name, email FROM users;

-- Filter rows
SELECT name, email FROM users WHERE active = true;

-- Sort results
SELECT name, email FROM users ORDER BY name ASC;

-- Limit results
SELECT name, email FROM users ORDER BY created_at DESC LIMIT 10;

Nothing surprising here. But let's be honest — most developers never get far beyond this. That's a problem because SQL can do so much more.

Filtering — Beyond Basic WHERE

-- Multiple conditions
SELECT * FROM orders WHERE status = 'shipped' AND total > 100;

-- IN — match any value in a list
SELECT * FROM users WHERE role IN ('admin', 'moderator');

-- BETWEEN — range queries (inclusive)
SELECT * FROM orders WHERE created_at BETWEEN '2026-01-01' AND '2026-03-31';

-- LIKE — pattern matching
SELECT * FROM users WHERE email LIKE '%@gmail.com';

-- IS NULL — check for missing values (never use = NULL)
SELECT * FROM users WHERE deleted_at IS NULL;

The IS NULL point trips people up more than you'd expect. WHERE status = NULL returns nothing — always. NULL isn't a value; it's the absence of a value. Use IS NULL and IS NOT NULL.

Aggregation — GROUP BY and HAVING

-- Count users by role
SELECT role, COUNT(*) as user_count
FROM users
GROUP BY role;

-- Average order total by month
SELECT
  DATE_TRUNC('month', created_at) as month,
  AVG(total) as avg_total,
  COUNT(*) as order_count
FROM orders
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month;

-- HAVING — filter groups (WHERE filters rows, HAVING filters groups)
SELECT role, COUNT(*) as user_count
FROM users
GROUP BY role
HAVING COUNT(*) > 10;

The difference between WHERE and HAVING confuses a lot of people. Think of it this way: WHERE filters before grouping, HAVING filters after. You can't use aggregate functions in WHERE because the groups don't exist yet.

Joins — How Tables Talk to Each Other

This is where SQL gets powerful. Most real queries involve multiple tables.

-- INNER JOIN — only matching rows from both tables
SELECT users.name, orders.total
FROM users
INNER JOIN orders ON users.id = orders.user_id;

-- LEFT JOIN — all rows from left table, matching rows from right (or NULL)
SELECT users.name, COUNT(orders.id) as order_count
FROM users
LEFT JOIN orders ON users.id = orders.user_id
GROUP BY users.name;

-- RIGHT JOIN — opposite of LEFT JOIN (rarely used — just swap the table order)

-- FULL OUTER JOIN — all rows from both tables
SELECT u.name, o.total
FROM users u
FULL OUTER JOIN orders o ON u.id = o.user_id;

-- Self-join — join a table to itself (e.g., employees and their managers)
SELECT e.name as employee, m.name as manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;

In my experience, INNER JOIN and LEFT JOIN handle 95% of real-world cases. If you're reaching for FULL OUTER JOIN often, your schema might need rethinking.

Subqueries — Queries Inside Queries

-- Scalar subquery — returns a single value
SELECT name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

-- Table subquery — returns rows
SELECT * FROM users
WHERE id IN (
  SELECT user_id FROM orders WHERE total > 500
);

-- Correlated subquery — references the outer query (slower, but sometimes necessary)
SELECT name, (
  SELECT COUNT(*) FROM orders WHERE orders.user_id = users.id
) as order_count
FROM users;

Subqueries work, but they can be hard to read. For complex queries, CTEs are almost always better.

CTEs — Making Complex Queries Readable

Common Table Expressions (WITH clauses) let you name intermediate results. They're the single best tool for writing SQL that other humans can understand.

-- Without CTE — nested and hard to follow
SELECT * FROM (
  SELECT user_id, SUM(total) as lifetime_value
  FROM orders GROUP BY user_id
) sub WHERE lifetime_value > 1000;

-- With CTE — clear and readable
WITH customer_value AS (
  SELECT user_id, SUM(total) as lifetime_value
  FROM orders
  GROUP BY user_id
)
SELECT u.name, cv.lifetime_value
FROM customer_value cv
JOIN users u ON u.id = cv.user_id
WHERE cv.lifetime_value > 1000
ORDER BY cv.lifetime_value DESC;

You can chain multiple CTEs:

WITH
  monthly_revenue AS (
    SELECT DATE_TRUNC('month', created_at) as month, SUM(total) as revenue
    FROM orders GROUP BY 1
  ),
  revenue_growth AS (
    SELECT month, revenue,
      LAG(revenue) OVER (ORDER BY month) as prev_revenue
    FROM monthly_revenue
  )
SELECT month, revenue, prev_revenue,
  ROUND((revenue - prev_revenue) / prev_revenue * 100, 1) as growth_pct
FROM revenue_growth
WHERE prev_revenue IS NOT NULL;

Window Functions — The Most Underused SQL Feature

Window functions perform calculations across a set of rows related to the current row. They're incredibly powerful and most developers don't know they exist.

-- ROW_NUMBER — assign sequential numbers
SELECT name, department, salary,
  ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as rank
FROM employees;

-- RANK and DENSE_RANK — handle ties differently
-- RANK: 1, 2, 2, 4 (skips 3)
-- DENSE_RANK: 1, 2, 2, 3 (no gaps)

-- LAG and LEAD — access previous/next rows
SELECT month, revenue,
  LAG(revenue, 1) OVER (ORDER BY month) as prev_month,
  revenue - LAG(revenue, 1) OVER (ORDER BY month) as change
FROM monthly_revenue;

-- Running total
SELECT date, amount,
  SUM(amount) OVER (ORDER BY date) as running_total
FROM transactions;

-- Moving average (last 7 days)
SELECT date, amount,
  AVG(amount) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as avg_7d
FROM daily_sales;

Here's the thing — any time you catch yourself writing application code to loop through query results and compute running totals, rankings, or comparisons to previous rows, you should probably be using a window function instead. The database does it faster and the code is cleaner.

Indexing — Why Your Queries Are Slow

Indexes are how databases avoid scanning every row in a table. Without an index, WHERE user_id = 42 checks every single row. With an index, it jumps straight to the matching rows.

-- Create an index
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- Composite index — covers queries filtering on both columns
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- Unique index — enforces uniqueness
CREATE UNIQUE INDEX idx_users_email ON users(email);

When to index: columns in WHERE clauses, JOIN conditions, and ORDER BY clauses that appear in frequent queries. When not to index: small tables (full scan is fast anyway), columns with very low cardinality (boolean columns), tables with heavy write workloads (every INSERT/UPDATE must also update the index). Column order in composite indexes matters. An index on (user_id, status) helps queries filtering on user_id alone, but NOT queries filtering on status alone. Think of it like a phone book — sorted by last name, then first name. You can look up all Smiths quickly, but finding everyone named "John" across all last names requires a full scan.

Performance — EXPLAIN Is Your Best Friend

-- See the query execution plan
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.active = true
GROUP BY u.name;

Look for: Seq Scan on large tables (probably needs an index), high actual time values, and mismatches between rows estimates and actual rows (stale statistics — run ANALYZE).

The N+1 problem: Your application fetches 100 users, then runs 100 individual queries to get each user's orders. Instead, fetch all orders for those users in one query with a JOIN or WHERE IN.

SQL for Interviews — Top Patterns

If you're preparing for technical interviews, these patterns come up constantly:

Second highest salary: SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees) — or use DENSE_RANK().
Duplicate detection: SELECT email, COUNT() FROM users GROUP BY email HAVING COUNT() > 1
Running totals: Window function with SUM() OVER (ORDER BY ...)
Top N per group: ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) then filter where rank <= N.
Date gaps: Use LAG() to compare consecutive dates.
Pivot/unpivot: Use CASE WHEN with aggregation or CROSSTAB.

Where to Go From Here

SQL rewards depth. The basics take an afternoon. Joins and aggregation take a week. Window functions and performance tuning take months of practice. But every bit of SQL knowledge pays dividends forever — the syntax barely changes across decades and database engines.

Practice writing real queries on CodeUp — interactive SQL exercises where you write queries against real datasets and see results immediately. The patterns stick much faster when you're solving actual problems instead of reading about them.