SQL Interview Questions

TL;DR

30+ SQL interview questions with hidden answers, organized by topic. Click "Show Answer" to reveal. Perfect for a quick 15-minute revision before an interview.

Short on time? Focus on JOINs, Window Functions, and Scenario Questions — these come up in almost every interview.

SQL Basics

Q: What is the difference between SQL and NoSQL?

SQL databases are relational (tables, strict schema, ACID transactions). NoSQL databases are non-relational (documents, key-value, graphs — flexible schema, eventual consistency). SQL for structured data with relationships. NoSQL for unstructured data and horizontal scaling.

Q: What does "declarative" mean in SQL?

You describe what you want, not how to get it. The database optimizer figures out the best execution plan. This contrasts with imperative languages where you write step-by-step instructions.

Q: What is a primary key? What is a foreign key?

A primary key uniquely identifies each row (no duplicates, no NULLs). A foreign key references the primary key of another table, creating a relationship and enforcing referential integrity.

Q: What is the difference between UNIQUE and PRIMARY KEY?

Both enforce uniqueness. Differences: (1) a table can have only one PRIMARY KEY but many UNIQUE constraints, (2) PRIMARY KEY doesn't allow NULLs, UNIQUE allows one NULL (in most databases), (3) PRIMARY KEY creates a clustered index by default.

Q: What is normalization? Name the first 3 normal forms.

Normalization organizes data to reduce redundancy. 1NF: each cell holds a single value (no arrays). 2NF: 1NF + every non-key column depends on the entire primary key (no partial dependencies). 3NF: 2NF + no column depends on another non-key column (no transitive dependencies).

Q: What is the order of execution of a SQL query?

FROM/JOIN → WHERE → GROUP BY → HAVING → SELECT → DISTINCT → ORDER BY → LIMIT. This is why you can't use a SELECT alias in WHERE — WHERE runs first.

Want deeper coverage? See SQL Overview and Core Concepts.

SELECT, WHERE, GROUP BY

Q: What is the difference between WHERE and HAVING?

WHERE filters rows before grouping. HAVING filters groups after GROUP BY. You can't use aggregate functions in WHERE — use HAVING. Example: HAVING COUNT(*) > 5 works, WHERE COUNT(*) > 5 is an error.

Q: What is the difference between DISTINCT and GROUP BY?

Both can remove duplicates. DISTINCT filters the final result set. GROUP BY groups rows for aggregation. SELECT DISTINCT department FROM employees equals SELECT department FROM employees GROUP BY department. But GROUP BY lets you add aggregates: GROUP BY department HAVING COUNT(*) > 5.

Q: Write a query to find the 3rd highest salary without using LIMIT or window functions.

SELECT DISTINCT salary FROM employees e1
WHERE 3 = (
  SELECT COUNT(DISTINCT salary) FROM employees e2
  WHERE e2.salary >= e1.salary
);

This correlated subquery counts how many distinct salaries are ≥ each salary. The one with exactly 3 is the 3rd highest.

Q: What is the difference between IN and EXISTS? When is each faster?

Both check if a value is in a set. IN runs the subquery once and compares against the result list. EXISTS runs the correlated subquery for each row and stops at the first match. EXISTS is faster when the subquery returns many rows (short-circuits). IN is faster when the subquery result is small.

Q: How do NULL values behave in SQL comparisons?

NULL is not equal to anything — not even another NULL. NULL = NULL is false. NULL != 5 is also not true (it's unknown). Use IS NULL / IS NOT NULL to check. NULLs are ignored by aggregate functions (COUNT, SUM, AVG skip NULLs).

Deeper coverage: Core Concepts

JOINs

Q: What is the difference between INNER JOIN and LEFT JOIN?

INNER JOIN returns only rows with matches in both tables. LEFT JOIN returns all rows from the left table, filling NULLs for unmatched right-side columns.

Q: Write a query to find all departments with no employees.

-- Anti-join pattern
SELECT d.dept_name
FROM departments d
LEFT JOIN employees e ON d.id = e.dept_id
WHERE e.id IS NULL;

-- Alternative
SELECT dept_name FROM departments d
WHERE NOT EXISTS (
  SELECT 1 FROM employees e WHERE e.dept_id = d.id
);

Q: Can you JOIN a table to itself? Give an example.

Yes — a self-join. Common for hierarchies:
SELECT e.name AS employee, m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;

Q: What is a cross join? When would you use one?

A CROSS JOIN returns every combination of rows (Cartesian product). 3 rows × 4 rows = 12 rows. Use for: generating all size×color combinations, date scaffolds, or test data. Rarely used in production queries.

Q: You have a slow query with multiple JOINs. How do you optimize it?

(1) Run EXPLAIN to see the query plan. (2) Index all JOIN columns. (3) Filter early with WHERE. (4) No SELECT *. (5) Check for implicit type conversions in JOIN conditions. (6) Consider if smaller subqueries can reduce intermediate result sets.

Deeper coverage: SQL JOINs Deep Dive

Window Functions

Q: What is the difference between ROW_NUMBER, RANK, and DENSE_RANK?

All assign a number per row. ROW_NUMBER: always unique (1, 2, 3, 4). RANK: ties share a rank, then skips (1, 2, 2, 4). DENSE_RANK: ties share, no gaps (1, 2, 2, 3).

Q: Write a query to find the 2nd highest salary in each department.

SELECT * FROM (
  SELECT name, department, salary,
    DENSE_RANK() OVER (
      PARTITION BY department ORDER BY salary DESC
    ) AS dr
  FROM employees
) t WHERE dr = 2;

Use DENSE_RANK so ties don't skip the 2nd position.

Q: How would you calculate month-over-month revenue growth?

SELECT month, revenue,
  LAG(revenue) OVER (ORDER BY month) AS prev,
  ROUND(100.0 * (revenue - LAG(revenue) OVER (ORDER BY month))
    / LAG(revenue) OVER (ORDER BY month), 1) AS growth_pct
FROM monthly_revenue;

Q: What is the difference between GROUP BY and a window function?

GROUP BY collapses rows — one output row per group, individual rows gone. Window functions keep all rows and add the calculation as a new column. Use GROUP BY for summary tables. Use window functions when you need detail AND aggregate side by side.

Q: Can you use a window function in a WHERE clause?

No. Window functions run at the SELECT stage, after WHERE. To filter by a window result, wrap in a subquery or CTE:
WITH ranked AS (
  SELECT *, ROW_NUMBER() OVER (...) AS rn
  FROM t
)
SELECT * FROM ranked WHERE rn = 1;

Q: How do you remove duplicate rows, keeping only the latest per user?

WITH ranked AS (
  SELECT *, ROW_NUMBER() OVER (
    PARTITION BY user_id ORDER BY created_at DESC
  ) AS rn
  FROM events
)
SELECT * FROM ranked WHERE rn = 1;

ROW_NUMBER guarantees exactly one row per user, even if timestamps tie.

Deeper coverage: Window Functions Deep Dive

DDL, DML, DCL, TCL

Q: Classify: CREATE, SELECT, GRANT, COMMIT, ALTER, DELETE, REVOKE, ROLLBACK.

DDL: CREATE, ALTER • DML: SELECT, DELETE • DCL: GRANT, REVOKE • TCL: COMMIT, ROLLBACK

Q: What is the difference between DROP, TRUNCATE, and DELETE?

DROP (DDL): removes the entire table — structure, data, indexes, everything. TRUNCATE (DDL): removes all rows, keeps structure. Auto-commits. No WHERE. DELETE (DML): removes specific rows via WHERE. Can rollback. Fires triggers. Slower on big tables.

Q: You accidentally deleted all rows from production. The transaction is still open. What do you do?

ROLLBACK; — DELETE is DML, so it's transactional. If you had used TRUNCATE (DDL), it would have auto-committed and you'd need a backup restore.

Q: What does ACID stand for? Explain each property.

Atomicity: all or nothing. Consistency: data stays valid. Isolation: concurrent transactions don't interfere. Durability: committed data survives crashes.

Q: What happens if the database crashes mid-transaction?

Uncommitted transactions are rolled back on restart (Atomicity). Committed transactions survive via the write-ahead log (Durability).

Deeper coverage: DDL, DML, DCL, TCL Deep Dive

Indexes & Performance

Q: What is an index? Why not index every column?

An index is a data structure (usually B-tree) that speeds up lookups. The trade-off: indexes speed up reads but slow down writes (INSERT/UPDATE/DELETE must also update the index). Index columns you filter, join, or sort on frequently.

Q: Explain the difference between clustered and non-clustered indexes.

Clustered: determines physical row order on disk. One per table (usually the primary key). Non-clustered: separate structure with pointers to data rows. Many per table. Think: clustered = dictionary page order, non-clustered = index at the back of a textbook.

Q: What is a composite index? When is column order important?

A composite index covers multiple columns: CREATE INDEX idx ON t(a, b, c). Column order matters because the index follows the "leftmost prefix" rule. This index helps queries filtering on (a), (a,b), or (a,b,c) — but NOT (b) or (c) alone.

Q: What is EXPLAIN and how do you use it to optimize queries?

EXPLAIN SELECT ... shows the query plan: which indexes are used, join order, estimated row counts. Look for "Seq Scan" (full table scan) on large tables — that usually means a missing index. Also watch for high estimated row counts at early stages.

Q: A query with an indexed column in WHERE is still slow. What could be wrong?

Common causes: (1) function on the column defeats the index (WHERE YEAR(date) = 2024 — rewrite as range). (2) Implicit type conversion. (3) Low selectivity (index returns most of the table, so the optimizer chooses a scan). (4) Stale statistics — run ANALYZE. (5) The index doesn't match the query's filter order (composite index leftmost prefix rule).

Deeper coverage: Indexes in Core Concepts

Scenario & Coding Questions

Q: Write a query to find employees who earn more than their manager.

SELECT e.name AS employee, e.salary,
       m.name AS manager, m.salary AS manager_salary
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;

Q: Write a query to find duplicate emails in a users table.

SELECT email, COUNT(*) AS cnt
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

Q: Write a query to find the top 3 earners in each department.

SELECT * FROM (
  SELECT name, department, salary,
    ROW_NUMBER() OVER (
      PARTITION BY department ORDER BY salary DESC
    ) AS rn
  FROM employees
) t WHERE rn <= 3;

Q: Write a query to find customers who placed orders in both 2024 and 2025.

SELECT customer_id
FROM orders
WHERE EXTRACT(YEAR FROM order_date) IN (2024, 2025)
GROUP BY customer_id
HAVING COUNT(DISTINCT EXTRACT(YEAR FROM order_date)) = 2;

Alternative with INTERSECT:

SELECT customer_id FROM orders WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'
INTERSECT
SELECT customer_id FROM orders WHERE order_date >= '2025-01-01' AND order_date < '2026-01-01';

Q: Write a query to find the cumulative sum of daily revenue.

SELECT date, revenue,
  SUM(revenue) OVER (ORDER BY date) AS cumulative_revenue
FROM daily_revenue;

Q: How would you swap values of two columns for all rows without a temporary table?

UPDATE my_table
SET col_a = col_b, col_b = col_a;

In standard SQL, UPDATE evaluates all SET expressions using the old values before applying changes, so this swaps correctly in a single statement.

Q: Write a query to find consecutive login days for each user (streak detection).

WITH grouped AS (
  SELECT user_id, login_date,
    login_date - INTERVAL '1 day' *
      ROW_NUMBER() OVER (
        PARTITION BY user_id ORDER BY login_date
      ) AS grp
  FROM logins
)
SELECT user_id,
  MIN(login_date) AS streak_start,
  MAX(login_date) AS streak_end,
  COUNT(*) AS streak_days
FROM grouped
GROUP BY user_id, grp
HAVING COUNT(*) >= 3
ORDER BY streak_days DESC;

The trick: subtracting a row number from the date produces the same value for consecutive days, creating a grouping key.