Introduction
Window functions are a powerful feature in SQL used to perform calculations across a set of rows related to the current row. Unlike aggregate functions, window functions do not group rows into a single output; they return a result for each row while maintaining the context of the dataset.
In this article, we’ll explore some commonly used SQL window functions (ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), LEAD(), and LAG()) with examples.
Sample Table: Sales Data
We’ll use the following Sales table to demonstrate window functions:
| SalesID | CustomerID | Product | Region | Amount | SaleDate | 
|---|---|---|---|---|---|
| 1 | 101 | Laptop | North | 1200 | 2023-01-05 | 
| 2 | 102 | Tablet | North | 800 | 2023-02-15 | 
| 3 | 103 | Phone | North | 800 | 2023-03-10 | 
| 4 | 104 | Tablet | North | 500 | 2023-04-01 | 
| 5 | 105 | Laptop | South | 1300 | 2023-05-05 | 
| 6 | 106 | Tablet | South | 700 | 2023-06-20 | 
| 7 | 107 | Phone | West | 900 | 2023-07-15 | 
| 8 | 108 | Laptop | East | 1300 | 2023-08-10 | 
1. ROW_NUMBER()
The ROW_NUMBER() function assigns a unique number to each row within a partition, ordered by a specified column.
Task: Assign a unique row number to each sale within a region based on the sale amount (highest to lowest).
SELECT SalesID, Region, Amount,
       ROW_NUMBER() OVER (PARTITION BY Region ORDER BY Amount DESC) AS RowNum
FROM Sales;
Result:
| SalesID | Region | Amount | RowNum | 
|---|---|---|---|
| 1 | North | 1200 | 1 | 
| 2 | North | 800 | 2 | 
| 3 | North | 800 | 3 | 
| 4 | North | 500 | 4 | 
| 5 | South | 1300 | 1 | 
| 6 | South | 700 | 2 | 
| 7 | West | 900 | 1 | 
| 8 | East | 1300 | 1 | 
2. RANK()
The RANK() function assigns a rank to each row within a partition. Rows with the same values receive the same rank, and the next rank is skipped.
Task: Rank sales within each region by amount (highest to lowest).
SELECT SalesID, Region, Amount,
       RANK() OVER (PARTITION BY Region ORDER BY Amount DESC) AS Rank
FROM Sales;
Result:
| SalesID | Region | Amount | Rank | 
|---|---|---|---|
| 1 | North | 1200 | 1 | 
| 2 | North | 800 | 2 | 
| 3 | North | 800 | 2 | 
| 4 | North | 500 | 4 | 
| 5 | South | 1300 | 1 | 
| 6 | South | 700 | 2 | 
| 7 | West | 900 | 1 | 
| 8 | East | 1300 | 1 | 
Key Feature:
- For the North region, both Amount = 800 rows share rank 2.
- The next rank is skipped (i.e., rank 3 is missing) and jumps to 4.
3. DENSE_RANK()
The DENSE_RANK() function assigns ranks like RANK(), but it doesn’t skip ranks after ties.
Task: Assign dense ranks to sales within each region by amount (highest to lowest).
SELECT SalesID, Region, Amount,
       DENSE_RANK() OVER (PARTITION BY Region ORDER BY Amount DESC) AS DenseRank
FROM Sales;
Result:
| SalesID | Region | Amount | DenseRank | 
|---|---|---|---|
| 1 | North | 1200 | 1 | 
| 2 | North | 800 | 2 | 
| 3 | North | 800 | 2 | 
| 4 | North | 500 | 3 | 
| 5 | South | 1300 | 1 | 
| 6 | South | 700 | 2 | 
| 7 | West | 900 | 1 | 
| 8 | East | 1300 | 1 | 
Key Feature:
- For the North region, both Amount = 800 rows share rank 2.
- The next rank is 3, with no skipping of ranks.
4. NTILE()
NTILE() divides rows into a specified number of approximately equal groups.
Task: Divide all sales into 4 groups based on Amount in descending order.
SELECT SalesID, Amount,
       NTILE(4) OVER (ORDER BY Amount DESC) AS Quartile
FROM Sales;
Result:
| SalesID | Amount | Quartile | 
|---|---|---|
| 5 | 1300 | 1 | 
| 8 | 1300 | 1 | 
| 1 | 1200 | 2 | 
| 7 | 900 | 2 | 
| 2 | 800 | 3 | 
| 3 | 800 | 3 | 
| 4 | 500 | 4 | 
| 6 | 700 | 4 | 
5. LEAD()
LEAD() retrieves the value from the next row within the same partition.
Task: Compare each sale amount to the next sale amount, ordered by SaleDate.
SELECT SalesID, Amount, 
       LEAD(Amount) OVER (ORDER BY SaleDate) AS NextAmount
FROM Sales;
Result:
| SalesID | Amount | NextAmount | 
|---|---|---|
| 1 | 1200 | 800 | 
| 2 | 800 | 800 | 
| 3 | 800 | 500 | 
| 4 | 500 | 1300 | 
| 5 | 1300 | 700 | 
| 6 | 700 | 900 | 
| 7 | 900 | 1300 | 
| 8 | 1300 | NULL | 
6. LAG()
LAG() retrieves the value from the previous row within the same partition.
Task: Compare each sale amount to the previous sale amount, ordered by SaleDate.
SELECT SalesID, Amount, 
       LAG(Amount) OVER (ORDER BY SaleDate) AS PrevAmount
FROM Sales;
Result:
| SalesID | Amount | PrevAmount | 
|---|---|---|
| 1 | 1200 | NULL | 
| 2 | 800 | 1200 | 
| 3 | 800 | 800 | 
| 4 | 500 | 800 | 
| 5 | 1300 | 500 | 
| 6 | 700 | 1300 | 
| 7 | 900 | 700 | 
| 8 | 1300 | 900 | 
Conclusion
SQL window functions like ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), LEAD(), and LAG() provide powerful ways to analyze data within partitions.
Key Takeaways:
- ROW_NUMBER()assigns a unique identifier for each row.
- RANK()and- DENSE_RANK()differ in how they handle ties (skipping vs. no skipping).
- NTILE()is useful for dividing rows into statistic groups.
- LEAD()and- LAG()allow comparisons with adjacent rows.
By mastering these functions, you can handle complex analytics and ranking tasks effectively!
Thank you for taking the time to explore data-related insights with me. I appreciate your engagement. If you find this information helpful, I invite you to follow me or connect with me on LinkedIn. Happy exploring!👋
