How Online Stores Know What You’ll Buy Next: The Math Behind “Frequently Bought Together”

Written by rajachakraborti25 | Published 2025/10/31

TL;DR: Association Rule Mining helps computers find patterns automatically in huge amounts of data. Those patterns support better decisions, such as showing related products online or organizing store shelves more sensibly.

Amazon and other online retailers seem to know exactly what you’ll add to your cart next. How? In physical stores, products are organized into sections, aisles, and shelves, making it easy to find related items together. This real-world experience was missing from early digital stores, where breadcrumbs were the only way to navigate categories. E-commerce sites have only recently begun to replicate the convenience of finding similar products grouped together, just like in a physical store.

You're back-to-school shopping and add a notebook to your cart. Suddenly, you're shown pens, glue sticks, backpacks, printer ink, and scissors: all the essentials, right there in one swipeable carousel. So convenient, and way better than hunting them down aisle by aisle at Walmart.

But how do eCommerce sites know exactly what to show you?

🎉 Drumroll... Enter Association Rule Mining, the behind-the-scenes magic of data mining that finds patterns, relationships, and "frequently bought together" combos across massive datasets.

Even though these products are made, shipped, and sold by totally different suppliers, this tech connects the dots — making your shopping experience smarter and smoother.

Association Rule Mining helps computers find these patterns automatically from huge amounts of data, so businesses can use them to make better decisions like showing related products online or organizing store shelves smarter.

Think of it like this: "If X, then Y." That's the heart of association rule mining.

The Apriori Algorithm is a classic method used in Association Rule Mining to find items that frequently appear together in large datasets, such as shopping carts, web clicks, or user behavior logs. Past transactions in which products were bought together are the evidence used to build these associations.

The Apriori Algorithm

Explanation

  • Purpose: Identify frequent itemsets and derive association rules.
  • Steps:
      • Find the single items that meet a minimum support threshold.
      • Generate larger candidate itemsets from the smaller frequent ones, keeping only those that still meet the threshold.
      • Derive association rules from the frequent itemsets, keeping rules that meet a minimum confidence.
  • Mathematical Intuition:
      • Support: Fraction of transactions in the dataset that contain the itemset.
      • Confidence: Likelihood of the consequent given the antecedent.
      • Lift: How much more likely the consequent is, given the antecedent, compared to random chance.
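The first two steps above can be sketched in plain Python. This is a minimal illustration of the level-wise idea, not the code used for the analysis in this article; the function name, the toy baskets, and the 0.4 threshold are all assumptions made for the example.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset (frozenset): support} for every frequent itemset."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item in the itemset.
        return sum(1 for b in baskets if itemset <= b) / n

    # Level 1: frequent single items.
    items = {item for b in baskets for item in b}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        # Join: merge pairs of frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(current), 2) if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count: keep the candidates that meet the support threshold.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy baskets, not the dataset behind this article.
toy_baskets = [
    {"notebook", "pen", "glue"},
    {"notebook", "pen"},
    {"notebook", "backpack"},
    {"pen", "glue"},
    {"notebook", "pen", "backpack"},
]
frequent = apriori_frequent_itemsets(toy_baskets, min_support=0.4)
```

The prune step is what keeps Apriori tractable: an itemset can only be frequent if all of its subsets are, so most large candidates are discarded before any counting happens.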

Mental map

| Term | Meaning | In Plain English |
| --- | --- | --- |
| antecedents | The "if" part of the rule | What the customer has bought already |
| consequents | The "then" part of the rule | What the customer is likely to buy next |
| support | Frequency of this item combination in the whole dataset | How common this rule is overall |
| confidence | How often the rule has been true | If people buy A, how often they also buy B |
| lift | How much more likely B is given A (vs. just randomly buying B) | Measures strength of the rule (Lift > 1 is good) |
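To make the three metrics concrete, here is a small sketch that computes them for a single rule by counting baskets. The function name and the toy baskets are hypothetical and exist only for illustration.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Compute support, confidence, and lift for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    n = len(baskets)
    count_a = sum(1 for b in baskets if a <= b)            # baskets with the "if" part
    count_c = sum(1 for b in baskets if c <= b)            # baskets with the "then" part
    count_both = sum(1 for b in baskets if (a | c) <= b)   # baskets with both parts

    support = count_both / n
    # Assumes the antecedent appears at least once; otherwise this divides by zero.
    confidence = count_both / count_a
    lift = confidence / (count_c / n)
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical toy baskets.
toy_baskets = [
    {"notebook", "pen"},
    {"notebook", "pen", "glue"},
    {"notebook"},
    {"pen"},
    {"glue"},
]
metrics = rule_metrics(toy_baskets, {"notebook"}, {"pen"})
```

In this toy data, notebooks and pens each appear in 3 of 5 baskets and together in 2, so the rule is only slightly stronger than chance.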

Data Preprocessing & Visualization

The table below shows the likelihood of items being purchased together. For example, if a customer buys a SPACEBOY LUNCH BOX, they are also likely to buy a DOLLY GIRL LUNCH BOX. This pattern suggests that parents may be purchasing lunch boxes for both their son and daughter.

Similarly, if someone buys both the ROSES REGENCY TEACUP AND SAUCER and the PINK REGENCY TEACUP AND SAUCER, they are likely to buy the green one as well, completing the color set.

| antecedents | consequents | support | confidence | lift |
| --- | --- | --- | --- | --- |
| ALARM CLOCK BAKELIKE RED | ALARM CLOCK BAKELIKE GREEN | 0.029 | 0.604 | 14.198 |
| ALARM CLOCK BAKELIKE GREEN | ALARM CLOCK BAKELIKE RED | 0.029 | 0.672 | 14.198 |
| ALARM CLOCK BAKELIKE RED | ALARM CLOCK BAKELIKE PINK | 0.021 | 0.452 | 13.654 |
| ALARM CLOCK BAKELIKE PINK | ALARM CLOCK BAKELIKE RED | 0.021 | 0.646 | 13.654 |
| SPACEBOY LUNCH BOX | DOLLY GIRL LUNCH BOX | 0.023 | 0.602 | 18.123 |
| DOLLY GIRL LUNCH BOX | SPACEBOY LUNCH BOX | 0.023 | 0.688 | 18.123 |
| GARDENERS KNEELING PAD KEEP CALM | GARDENERS KNEELING PAD CUP OF TEA | 0.025 | 0.612 | 17.877 |
| GARDENERS KNEELING PAD CUP OF TEA | GARDENERS KNEELING PAD KEEP CALM | 0.025 | 0.729 | 17.877 |
| LUNCH BAG SPACEBOY DESIGN | LUNCH BAG BLACK SKULL. | 0.023 | 0.423 | 7.455 |
| LUNCH BAG SUKI DESIGN | LUNCH BAG BLACK SKULL. | 0.022 | 0.455 | 8.016 |
| LUNCH BAG BLACK SKULL. | LUNCH BAG SUKI DESIGN | 0.022 | 0.389 | 8.016 |
| LUNCH BAG RED RETROSPOT | LUNCH BAG APPLE DESIGN | 0.021 | 0.302 | 6.464 |
| LUNCH BAG APPLE DESIGN | LUNCH BAG RED RETROSPOT | 0.021 | 0.449 | 6.464 |
| LUNCH BAG CARS BLUE | LUNCH BAG PINK POLKADOT | 0.023 | 0.442 | 8.801 |
| LUNCH BAG PINK POLKADOT | LUNCH BAG CARS BLUE | 0.023 | 0.459 | 8.801 |
| LUNCH BAG CARS BLUE | LUNCH BAG RED RETROSPOT | 0.025 | 0.474 | 6.823 |
| LUNCH BAG RED RETROSPOT | LUNCH BAG CARS BLUE | 0.025 | 0.356 | 6.823 |
| LUNCH BAG CARS BLUE | LUNCH BAG SPACEBOY DESIGN | 0.021 | 0.405 | 7.594 |
| LUNCH BAG SPACEBOY DESIGN | LUNCH BAG CARS BLUE | 0.021 | 0.396 | 7.594 |
| LUNCH BAG CARS BLUE | LUNCH BAG SUKI DESIGN | 0.021 | 0.404 | 8.324 |
| LUNCH BAG SUKI DESIGN | LUNCH BAG CARS BLUE | 0.021 | 0.434 | 8.324 |
| LUNCH BAG PINK POLKADOT | LUNCH BAG RED RETROSPOT | 0.028 | 0.562 | 8.084 |
| LUNCH BAG RED RETROSPOT | LUNCH BAG PINK POLKADOT | 0.028 | 0.406 | 8.084 |
| LUNCH BAG RED RETROSPOT | LUNCH BAG SPACEBOY DESIGN | 0.025 | 0.363 | 6.802 |
| LUNCH BAG SPACEBOY DESIGN | LUNCH BAG RED RETROSPOT | 0.025 | 0.473 | 6.802 |
| LUNCH BAG SUKI DESIGN | LUNCH BAG RED RETROSPOT | 0.024 | 0.501 | 7.204 |
| LUNCH BAG RED RETROSPOT | LUNCH BAG SUKI DESIGN | 0.024 | 0.349 | 7.204 |
| LUNCH BAG RED RETROSPOT | LUNCH BAG WOODLAND | 0.023 | 0.336 | 7.599 |
| LUNCH BAG WOODLAND | LUNCH BAG RED RETROSPOT | 0.023 | 0.528 | 7.599 |
| LUNCH BAG SUKI DESIGN | LUNCH BAG SPACEBOY DESIGN | 0.020 | 0.420 | 7.888 |
| LUNCH BAG SPACEBOY DESIGN | LUNCH BAG SUKI DESIGN | 0.020 | 0.383 | 7.888 |
| LUNCH BAG WOODLAND | LUNCH BAG SPACEBOY DESIGN | 0.022 | 0.494 | 9.266 |
| LUNCH BAG SPACEBOY DESIGN | LUNCH BAG WOODLAND | 0.022 | 0.410 | 9.266 |
| PAPER CHAIN KIT 50'S CHRISTMAS | PAPER CHAIN KIT VINTAGE CHRISTMAS | 0.024 | 0.460 | 12.239 |
| PAPER CHAIN KIT VINTAGE CHRISTMAS | PAPER CHAIN KIT 50'S CHRISTMAS | 0.024 | 0.647 | 12.239 |
| PARTY BUNTING | SPOTTY BUNTING | 0.021 | 0.282 | 5.209 |
| SPOTTY BUNTING | PARTY BUNTING | 0.021 | 0.388 | 5.209 |
| ROSES REGENCY TEACUP AND SAUCER | PINK REGENCY TEACUP AND SAUCER | 0.024 | 0.557 | 18.564 |
| PINK REGENCY TEACUP AND SAUCER | ROSES REGENCY TEACUP AND SAUCER | 0.024 | 0.784 | 18.564 |
| WHITE HANGING HEART T-LIGHT HOLDER | RED HANGING HEART T-LIGHT HOLDER | 0.025 | 0.231 | 6.302 |
| RED HANGING HEART T-LIGHT HOLDER | WHITE HANGING HEART T-LIGHT HOLDER | 0.025 | 0.670 | 6.302 |
| REGENCY CAKESTAND 3 TIER | ROSES REGENCY TEACUP AND SAUCER | 0.023 | 0.246 | 5.835 |
| ROSES REGENCY TEACUP AND SAUCER | REGENCY CAKESTAND 3 TIER | 0.023 | 0.536 | 5.835 |
| WOODEN PICTURE FRAME WHITE FINISH | WOODEN FRAME ANTIQUE WHITE | 0.025 | 0.534 | 12.211 |
| WOODEN FRAME ANTIQUE WHITE | WOODEN PICTURE FRAME WHITE FINISH | 0.025 | 0.577 | 12.211 |
| ROSES REGENCY TEACUP AND SAUCER, PINK REGENCY TEACUP AND SAUCER | GREEN REGENCY TEACUP AND SAUCER | 0.021 | 0.894 | 23.995 |
| GREEN REGENCY TEACUP AND SAUCER, PINK REGENCY TEACUP AND SAUCER | ROSES REGENCY TEACUP AND SAUCER | 0.021 | 0.848 | 20.071 |
| ROSES REGENCY TEACUP AND SAUCER, GREEN REGENCY TEACUP AND SAUCER | PINK REGENCY TEACUP AND SAUCER | 0.021 | 0.721 | 24.033 |
| PINK REGENCY TEACUP AND SAUCER | ROSES REGENCY TEACUP AND SAUCER, GREEN REGENCY TEACUP AND SAUCER | 0.021 | 0.701 | 24.033 |
| ROSES REGENCY TEACUP AND SAUCER | GREEN REGENCY TEACUP AND SAUCER, PINK REGENCY TEACUP AND SAUCER | 0.021 | 0.498 | 20.071 |
| GREEN REGENCY TEACUP AND SAUCER | ROSES REGENCY TEACUP AND SAUCER, PINK REGENCY TEACUP AND SAUCER | 0.021 | 0.564 | 23.995 |
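Rules like these are mined from one basket per invoice, so the raw line-item rows have to be grouped first. The article does not show its preprocessing, so the following is only a sketch under assumptions: it uses the `InvoiceNo` and `Description` column names from the UCI Online Retail dataset, treats invoice numbers starting with "C" as cancellations to skip, and the sample rows and invoice numbers are made up.

```python
from collections import defaultdict


def rows_to_baskets(rows):
    """Group line-item rows into {invoice number: set of product descriptions}."""
    baskets = defaultdict(set)
    for row in rows:
        invoice = str(row["InvoiceNo"])
        description = (row["Description"] or "").strip()
        # Assumed cleaning rules: drop cancelled invoices and blank descriptions.
        if invoice.startswith("C") or not description:
            continue
        baskets[invoice].add(description)
    return dict(baskets)


# Hypothetical sample rows; invoice numbers are invented for the example.
sample_rows = [
    {"InvoiceNo": "536365", "Description": "SPACEBOY LUNCH BOX "},
    {"InvoiceNo": "536365", "Description": "DOLLY GIRL LUNCH BOX"},
    {"InvoiceNo": "536366", "Description": "PARTY BUNTING"},
    {"InvoiceNo": "C536367", "Description": "PARTY BUNTING"},
    {"InvoiceNo": "536368", "Description": None},
]
baskets = rows_to_baskets(sample_rows)
```

Using a set per invoice means quantity is ignored: a basket either contains a product or it does not, which is the input that association rule mining expects.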

Interpretation

  • antecedents: The item(s) on the left side of the rule (the "if" part). For example, (ALARM CLOCK BAKELIKE RED) means the rule is considering transactions that include this item.
  • consequents: The item(s) on the right side of the rule (the "then" part). For example, (ALARM CLOCK BAKELIKE GREEN) means the rule is predicting that this item is also likely to be present.
  • support: The proportion of transactions that contain both the antecedent and the consequent. For example, 0.028593 (rounded to 0.029 in the table) means about 2.86% of all transactions contain both items.
  • confidence: The probability that the consequent is present given that the antecedent is present. For example, 0.604333 means that if a transaction contains the antecedent, there is a 60.4% chance it also contains the consequent.
  • lift: How much more likely the consequent is to appear with the antecedent than it would be by chance. A lift greater than 1 means the items appear together more often than chance would predict; for example, 14.197612 means the items appear together about 14 times more often than if they were independent.
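The definitions above can be written compactly in symbols. The notation is an assumption introduced here for illustration: T is the set of all transactions, and A and B are the antecedent and consequent itemsets.

```latex
\begin{align*}
\mathrm{support}(A \Rightarrow B) &= \frac{\left|\{\, t \in T : A \cup B \subseteq t \,\}\right|}{|T|} \\[4pt]
\mathrm{confidence}(A \Rightarrow B) &= \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)} \\[4pt]
\mathrm{lift}(A \Rightarrow B) &= \frac{\mathrm{confidence}(A \Rightarrow B)}{\mathrm{support}(B)}
  = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)\,\mathrm{support}(B)}
\end{align*}
```

The last form shows that lift is symmetric in A and B, which is why each pair of mirrored rules in the table shares one lift value (14.198 for the red and green alarm clocks) while their confidences differ (0.604 vs. 0.672).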

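  • Support: Proportion of transactions containing the itemset.
  • Confidence: Probability of the consequent given the antecedent.
  • Lift: Ratio of observed support to the support expected if the items were independent.
  • Conviction: How much more often the rule would fail if the antecedent and consequent were independent than it actually fails; higher values indicate a stronger implication.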
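Conviction does not appear in the results table, but it can be recovered from confidence and lift, because the consequent's own support equals confidence divided by lift. The sketch below applies the standard formula, conviction = (1 - support(B)) / (1 - confidence), to the first rule's unrounded figures quoted above; the function name is hypothetical.

```python
def conviction(confidence, lift):
    """Conviction of a rule, derived from its confidence and lift."""
    # lift = confidence / support(consequent), so:
    consequent_support = confidence / lift
    if confidence >= 1.0:
        # A rule that never fails has infinite conviction.
        return float("inf")
    return (1.0 - consequent_support) / (1.0 - confidence)


# ALARM CLOCK BAKELIKE RED -> ALARM CLOCK BAKELIKE GREEN, figures from this article.
value = conviction(confidence=0.604333, lift=14.197612)
```

For this rule the result is roughly 2.4, meaning the rule would be wrong about 2.4 times as often if the two alarm clocks were bought independently of each other.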

Retailers use association rule mining to optimize product placement and promotions. For example, if analysis shows that customers who buy a SPACEBOY LUNCH BOX often also buy a DOLLY GIRL LUNCH BOX, the store can place these items together on shelves or offer bundle discounts. This increases the likelihood of customers purchasing both items, boosting sales and, online, increasing clicks on the related product pages.

Try Next:

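We used the Apriori algorithm to discover associations, but there are other effective options such as FP-Growth and Eclat. One idea is a pipeline that runs the data through all of these algorithms, applies a scoring system, and keeps only the rules that are identified by at least two methods and have strong metrics. Exact implementations return the same frequent itemsets for the same support threshold, so cross-method agreement is most informative when the methods are run with different thresholds or on different samples of the data. Under those conditions, agreement increases confidence in the reliability of the discovered associations.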
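The selection step of such a pipeline might look like the sketch below. Everything here is an assumption for illustration: the function name, the thresholds, and especially which method "found" which rule. Only the confidence and lift figures are taken from the table in this article.

```python
from collections import Counter


def consensus_rules(rule_sets, min_votes=2, min_confidence=0.5, min_lift=1.0):
    """Keep rules reported by at least min_votes methods that also pass the metric thresholds."""
    votes = Counter()
    metrics = {}
    for rules in rule_sets.values():
        for rule, m in rules.items():
            votes[rule] += 1
            metrics.setdefault(rule, m)  # keep the first metrics seen for each rule
    selected = [
        rule for rule, count in votes.items()
        if count >= min_votes
        and metrics[rule]["confidence"] >= min_confidence
        and metrics[rule]["lift"] > min_lift
    ]
    # Strongest associations first.
    return sorted(selected, key=lambda r: metrics[r]["lift"], reverse=True)


# Hypothetical per-method outputs, keyed by (antecedent, consequent).
lunch_box = ("SPACEBOY LUNCH BOX", "DOLLY GIRL LUNCH BOX")
bunting = ("PARTY BUNTING", "SPOTTY BUNTING")
rule_sets = {
    "apriori": {
        lunch_box: {"confidence": 0.602, "lift": 18.123},
        bunting: {"confidence": 0.282, "lift": 5.209},
    },
    "fp_growth": {lunch_box: {"confidence": 0.602, "lift": 18.123}},
    "eclat": {bunting: {"confidence": 0.282, "lift": 5.209}},
}
selected = consensus_rules(rule_sets)
```

Here both rules get two votes, but only the lunch box rule clears the 0.5 confidence threshold, so it is the only one kept.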

Practical Retail Application: Digital Shelf Optimization

This approach can be used to pre-populate related products in an online system, allowing users to quickly find complementary items when searching for a product. In a physical store, this is similar to placing frequently bought-together items side by side on shelves. By replicating this model digitally, we streamline the shopping experience, saving users time and effort while increasing the likelihood of additional purchases.
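One way to pre-populate related products is to turn the mined rules into a lookup table keyed by the item being viewed. The sketch below uses a handful of single-item rules from the table above; the function name, the `top_n` cutoff, and the choice to rank by lift are assumptions, not the design of any particular store.

```python
from collections import defaultdict


def build_related_index(rules, top_n=3):
    """Map each antecedent to its top_n consequents, ranked by lift then confidence."""
    index = defaultdict(list)
    for antecedent, consequent, confidence, lift in rules:
        index[antecedent].append((lift, confidence, consequent))
    return {
        item: [consequent for _, _, consequent in sorted(entries, reverse=True)[:top_n]]
        for item, entries in index.items()
    }


# (antecedent, consequent, confidence, lift) rows taken from the results table.
rules = [
    ("LUNCH BAG CARS BLUE", "LUNCH BAG PINK POLKADOT", 0.442, 8.801),
    ("LUNCH BAG CARS BLUE", "LUNCH BAG RED RETROSPOT", 0.474, 6.823),
    ("LUNCH BAG CARS BLUE", "LUNCH BAG SPACEBOY DESIGN", 0.405, 7.594),
    ("LUNCH BAG CARS BLUE", "LUNCH BAG SUKI DESIGN", 0.404, 8.324),
    ("SPACEBOY LUNCH BOX", "DOLLY GIRL LUNCH BOX", 0.602, 18.123),
]
related = build_related_index(rules, top_n=3)
```

Because the index is built ahead of time, serving a "frequently bought together" carousel becomes a single dictionary lookup. Ranking by confidence instead of lift would favor popular items such as LUNCH BAG RED RETROSPOT, so the ranking key is worth testing against real click data.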

References

  • UCI Machine Learning Repository: Online Retail
  • mlxtend documentation: http://rasbt.github.io/mlxtend/
  • Han, J., Kamber, M., & Pei, J. (2012). Data Mining: Concepts and Techniques.

Written by rajachakraborti25 | I am a seasoned Full-Stack and Platform Senior Engineer at Ticketmaster
Published by HackerNoon on 2025/10/31