paint-brush
SQL Window Functions: A Quick Review With Examplesby@luca1iu
277 reads New Story

SQL Window Functions: A Quick Review With Examples

by Luca LiuJanuary 5th, 2025
Read on Terminal Reader
Read this story w/o Javascript

Too Long; Didn't Read

Window functions are a powerful feature in SQL used to perform calculations across a set of rows related to the current row. Unlike aggregate functions, window functions do not group rows into a single output. They return a result for each row while maintaining the context of the dataset.
featured image - SQL Window Functions: A Quick Review With Examples
Luca Liu HackerNoon profile picture


Introduction

Window functions are a powerful feature in SQL used to perform calculations across a set of rows related to the current row. Unlike aggregate functions, window functions do not group rows into a single output; they return a result for each row while maintaining the context of the dataset.


In this article, we’ll explore some commonly used SQL window functions (ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), LEAD(), and LAG()) with examples.

Sample Table: Sales Data

We’ll use the following Sales table to demonstrate window functions:

SalesID

CustomerID

Product

Region

Amount

SaleDate

1

101

Laptop

North

1200

2023-01-05

2

102

Tablet

North

800

2023-02-15

3

103

Phone

North

800

2023-03-10

4

104

Tablet

North

500

2023-04-01

5

105

Laptop

South

1300

2023-05-05

6

106

Tablet

South

700

2023-06-20

7

107

Phone

West

900

2023-07-15

8

108

Laptop

East

1300

2023-08-10

1. ROW_NUMBER()

The ROW_NUMBER() function assigns a unique number to each row within a partition, ordered by a specified column.


Task: Assign a unique row number to each sale within a region based on the sale amount (highest to lowest).

SELECT SalesID, Region, Amount,
       ROW_NUMBER() OVER (PARTITION BY Region ORDER BY Amount DESC) AS RowNum
FROM Sales;

Result:

SalesID

Region

Amount

RowNum

1

North

1200

1

2

North

800

2

3

North

800

3

4

North

500

4

5

South

1300

1

6

South

700

2

7

West

900

1

8

East

1300

1

2. RANK()

The RANK() function assigns a rank to each row within a partition. Rows with the same values receive the same rank, and the next rank is skipped.


Task: Rank sales within each region by amount (highest to lowest).

SELECT SalesID, Region, Amount,
       RANK() OVER (PARTITION BY Region ORDER BY Amount DESC) AS Rank
FROM Sales;

Result:

SalesID

Region

Amount

Rank

1

North

1200

1

2

North

800

2

3

North

800

2

4

North

500

4

5

South

1300

1

6

South

700

2

7

West

900

1

8

East

1300

1

Key Feature:

  • For the North region, both Amount = 800 rows share rank 2.


  • The next rank is skipped (i.e., rank 3 is missing) and jumps to 4.

3. DENSE_RANK()

The DENSE_RANK() function assigns ranks like RANK(), but it doesn’t skip ranks after ties.


Task: Assign dense ranks to sales within each region by amount (highest to lowest).

SELECT SalesID, Region, Amount,
       DENSE_RANK() OVER (PARTITION BY Region ORDER BY Amount DESC) AS DenseRank
FROM Sales;

Result:

SalesID

Region

Amount

DenseRank

1

North

1200

1

2

North

800

2

3

North

800

2

4

North

500

3

5

South

1300

1

6

South

700

2

7

West

900

1

8

East

1300

1

Key Feature:

  • For the North region, both Amount = 800 rows share rank 2.


  • The next rank is 3, with no skipping of ranks.

4. NTILE()

NTILE() divides rows into a specified number of approximately equal groups.


Task: Divide all sales into 4 groups based on Amount in descending order.

SELECT SalesID, Amount,
       NTILE(4) OVER (ORDER BY Amount DESC) AS Quartile
FROM Sales;

Result:

SalesID

Amount

Quartile

5

1300

1

8

1300

1

1

1200

2

7

900

2

2

800

3

3

800

3

4

500

4

6

700

4

5. LEAD()

LEAD() retrieves the value from the next row within the same partition.


Task: Compare each sale amount to the next sale amount, ordered by SaleDate.

SELECT SalesID, Amount, 
       LEAD(Amount) OVER (ORDER BY SaleDate) AS NextAmount
FROM Sales;

Result:

SalesID

Amount

NextAmount

1

1200

800

2

800

800

3

800

500

4

500

1300

5

1300

700

6

700

900

7

900

1300

8

1300

NULL

6. LAG()

LAG() retrieves the value from the previous row within the same partition.


Task: Compare each sale amount to the previous sale amount, ordered by SaleDate.

SELECT SalesID, Amount, 
       LAG(Amount) OVER (ORDER BY SaleDate) AS PrevAmount
FROM Sales;

Result:

SalesID

Amount

PrevAmount

1

1200

NULL

2

800

1200

3

800

800

4

500

800

5

1300

500

6

700

1300

7

900

700

8

1300

900

Conclusion

SQL window functions like ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), LEAD(), and LAG() provide powerful ways to analyze data within partitions.


Key Takeaways:

  • ROW_NUMBER() assigns a unique identifier for each row.
  • RANK() and DENSE_RANK() differ in how they handle ties (skipping vs. no skipping).
  • NTILE() is useful for dividing rows into statistic groups.
  • LEAD() and LAG() allow comparisons with adjacent rows.


By mastering these functions, you can handle complex analytics and ranking tasks effectively!


Thank you for taking the time to explore data-related insights with me. I appreciate your engagement. If you find this information helpful, I invite you to follow me or connect with me on LinkedIn. Happy exploring!👋