SQL Window Functions

TL;DR

Window functions calculate across a set of rows related to the current row — without collapsing them into one like GROUP BY does. Use OVER() to define the window. Add PARTITION BY to group and ORDER BY to sort within the window.

Explain Like I'm 12

Imagine your class got test scores back. GROUP BY is like: "What's the average score per subject?" — you get one number per subject, and all the individual scores disappear.

Window functions are like: "Show me everyone's score AND their rank in the class, AND how they compare to the average." Every row stays — you just add extra columns with calculations based on the group.

How Window Functions Work

Window function flow: All Rows, PARTITION BY splits into groups, ORDER BY sorts within partition, Window Function calculates per row keeping all rows, vs GROUP BY which collapses rows

The Syntax

function_name() OVER (
  PARTITION BY column    -- optional: split into groups
  ORDER BY column        -- optional: sort within group
)
Key insight: OVER() is what makes a function a window function. Without it, SUM() and COUNT() are just aggregate functions.

Sample Data

We'll use this sales table for every example:

idnamedepartmentsalary
1AliceEngineering95000
2BobEngineering90000
3CharlieEngineering90000
4DianaMarketing85000
5EveMarketing80000
6FrankSales70000

Ranking Functions

The three ranking functions differ only in how they handle ties:

SELECT name, department, salary,
  ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,
  RANK()       OVER (ORDER BY salary DESC) AS rank,
  DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
namesalaryrow_numrankdense_rank
Alice95000111
Bob90000222
Charlie90000322
Diana85000443
Eve80000554
Frank70000665
How to remember:
  • ROW_NUMBER — always unique, like a ticket number (1, 2, 3, 4...)
  • RANK — ties share a rank, then skips (1, 2, 2, 4...)
  • DENSE_RANK — ties share a rank, no gaps (1, 2, 2, 3...)

PARTITION BY — Rank Within Groups

PARTITION BY is like GROUP BY for window functions. It splits the rows into groups, and the function runs independently within each group.

-- Rank employees within their department
SELECT name, department, salary,
  RANK() OVER (
    PARTITION BY department
    ORDER BY salary DESC
  ) AS dept_rank
FROM employees;
namedepartmentsalarydept_rank
AliceEngineering950001
BobEngineering900002
CharlieEngineering900002
DianaMarketing850001
EveMarketing800002
FrankSales700001

Notice how the rank resets to 1 for each department. That's PARTITION BY at work.

LAG and LEAD — Look at Neighboring Rows

LAG looks at the previous row. LEAD looks at the next row. Perfect for comparisons over time.

-- Compare each employee's salary to the next lower one
SELECT name, salary,
  LAG(salary, 1)  OVER (ORDER BY salary DESC) AS prev_higher,
  LEAD(salary, 1) OVER (ORDER BY salary DESC) AS next_lower,
  salary - LEAD(salary, 1) OVER (ORDER BY salary DESC) AS gap
FROM employees;
namesalaryprev_highernext_lowergap
Alice95000NULL900005000
Bob9000095000900000
Charlie9000090000850005000
Diana8500090000800005000
Eve80000850007000010000
Frank7000080000NULLNULL
Real-world use: LAG/LEAD are commonly used for month-over-month comparisons, calculating growth rates, and finding streaks in time-series data.

Aggregate Window Functions

Any aggregate function (SUM, AVG, COUNT, MIN, MAX) becomes a window function when you add OVER().

SELECT name, department, salary,
  AVG(salary) OVER (PARTITION BY department) AS dept_avg,
  salary - AVG(salary) OVER (PARTITION BY department) AS diff_from_avg,
  SUM(salary) OVER () AS company_total
FROM employees;
Empty OVER(): OVER() with no PARTITION BY treats the entire result set as one window. Useful for totals and percentages.

Running Totals

-- Running total of salary ordered by hire date
SELECT name, salary,
  SUM(salary) OVER (ORDER BY hire_date) AS running_total
FROM employees;

Common Patterns

Top-N per Group

One of the most common interview questions: "Get the top 3 earners per department."

SELECT * FROM (
  SELECT name, department, salary,
    ROW_NUMBER() OVER (
      PARTITION BY department ORDER BY salary DESC
    ) AS rn
  FROM employees
) ranked
WHERE rn <= 3;

Percentage of Total

SELECT name, department, salary,
  ROUND(100.0 * salary / SUM(salary) OVER (), 1) AS pct_of_total,
  ROUND(100.0 * salary / SUM(salary) OVER (PARTITION BY department), 1) AS pct_of_dept
FROM employees;

Deduplication

-- Keep only the latest record per user
DELETE FROM events
WHERE id NOT IN (
  SELECT id FROM (
    SELECT id,
      ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC) AS rn
    FROM events
  ) t WHERE rn = 1
);

GROUP BY vs Window Functions

FeatureGROUP BYWindow Function
Rows in resultOne per groupAll original rows kept
Access to individual rowsNo — collapsedYes — each row visible
Multiple aggregationsAll same groupingEach can have different PARTITION
RankingNot possibleROW_NUMBER, RANK, etc.
Neighbor accessNot possibleLAG, LEAD
PerformanceGenerally fasterCan be slower on large sets

Test Yourself

Q: What's the difference between RANK and DENSE_RANK?

Both assign the same rank to ties. RANK skips numbers after ties (1, 2, 2, 4). DENSE_RANK doesn't skip (1, 2, 2, 3).

Q: What does PARTITION BY do?

Splits the result set into groups. The window function runs independently within each group, then the results are combined back together.

Q: How do LAG and LEAD differ?

LAG looks at the previous row (before the current row in the window order). LEAD looks at the next row. Both return NULL at the boundary.

Q: How do you make SUM() a window function instead of an aggregate?

Add OVER() after it: SUM(salary) OVER (PARTITION BY department). Without OVER, it collapses rows. With OVER, every row stays.

Interview Questions

Q: Write a query to find the 2nd highest salary in each department.

SELECT * FROM (
  SELECT name, department, salary,
    DENSE_RANK() OVER (
      PARTITION BY department ORDER BY salary DESC
    ) AS dr
  FROM employees
) t WHERE dr = 2;

Use DENSE_RANK (not RANK) so ties don't cause you to skip the 2nd position.

Q: How would you calculate month-over-month revenue growth as a percentage?

SELECT month, revenue,
  LAG(revenue) OVER (ORDER BY month) AS prev_month,
  ROUND(100.0 * (revenue - LAG(revenue) OVER (ORDER BY month))
    / LAG(revenue) OVER (ORDER BY month), 1) AS growth_pct
FROM monthly_revenue;

Q: What's the difference between a window function and a GROUP BY? When would you choose one over the other?

GROUP BY collapses rows into one per group — you lose individual row data. Window functions keep all rows and add the calculation as a new column. Choose GROUP BY when you want summary rows. Choose window functions when you need both the detail AND the aggregate (e.g., showing each employee's salary alongside their department average).

Q: How do you remove duplicate rows keeping only the most recent one per user?

WITH ranked AS (
  SELECT *, ROW_NUMBER() OVER (
    PARTITION BY user_id ORDER BY created_at DESC
  ) AS rn
  FROM events
)
SELECT * FROM ranked WHERE rn = 1;

ROW_NUMBER (not RANK) guarantees exactly one row per user, even if timestamps tie.

Q: Can you use a window function in a WHERE clause? Why or why not?

No. Window functions execute after WHERE in the SQL execution order (they run at the SELECT stage). To filter by a window function result, wrap the query in a subquery or CTE and filter in the outer WHERE.