The world we know today thrives on data. From the simplest filing system on a corporate assistant’s desk to underwater data centers storing petabytes of information, data now powers a huge portion of the world’s technology. Given how central it is to our lives, let’s take a couple of minutes to explore something interesting.
To all the data-crunchers, product hackers, and machine learning tinkerers out there—
What if I told you two groups can each show a trend, but when you combine them, that trend reverses?
It’s not a data bug. It’s a feature.
Welcome to Simpson’s Paradox—where conditional probabilities and marginal probabilities live very, very different lives.
A Tale of Two Departments
Let’s use the admission stats of a fictional university to set the stage.
Here’s the breakdown:
| Department | Gender | Applicants | Admitted | Admission Rate |
|---|---|---|---|---|
| A | Women | 20 | 19 | 95% |
| A | Men | 100 | 90 | 90% |
| B | Women | 180 | 9 | 5% |
| B | Men | 100 | 10 | 10% |
At a glance, women outperform men in Department A (95% vs. 90%) and fall behind in Department B (5% vs. 10%). So far, so good. But what happens when we zoom out and look at the university’s overall admission rates?
- Men: 100 admitted out of 200 applicants → 50% admission rate
- Women: 28 admitted out of 200 applicants → 14% admission rate
This data isn’t inaccurate, but it is misleading.
Wait—how?
Because conditional probabilities don’t always play nicely with marginal totals.
In statistical terms:
The conditional probability P(Admit | Female, Dept) is higher than P(Admit | Male, Dept) in Department A, but that advantage gets wiped out when we aggregate everything—because of differing group sizes.
This is Simpson’s Paradox in action.
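If you’d rather let code do the double-checking, here’s a minimal Python sketch (plain dictionaries, no libraries) that recomputes both views from the table above:

```python
# Admissions numbers copied from the table above.
data = {
    ("A", "Women"): {"applicants": 20,  "admitted": 19},
    ("A", "Men"):   {"applicants": 100, "admitted": 90},
    ("B", "Women"): {"applicants": 180, "admitted": 9},
    ("B", "Men"):   {"applicants": 100, "admitted": 10},
}

# Zoomed-in view: admission rate per department and gender.
for (dept, gender), cell in data.items():
    print(f"Dept {dept}, {gender}: {cell['admitted'] / cell['applicants']:.0%}")

# Zoomed-out view: overall admission rate per gender, departments pooled.
for gender in ("Women", "Men"):
    admitted = sum(cell["admitted"] for (_, g), cell in data.items() if g == gender)
    applicants = sum(cell["applicants"] for (_, g), cell in data.items() if g == gender)
    print(f"Overall {gender}: {admitted}/{applicants} = {admitted / applicants:.0%}")
```

The per-department rates match the table, and the pooled figures collapse to 14% for women and 50% for men.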
Let’s (Casually) Math This Out
Here’s the intuition:
- Women mostly applied to Department B, where everyone had a low chance of admission.
- Men mostly applied to Department A, where acceptance rates were high.
So when you aggregate the numbers, the group-level performance gets drowned out by a lurking variable—in this case, department choice, which acts as a confounder.
This flips the marginal probabilities:
- P(Admit | Female) < P(Admit | Male)
…even though:
- P(Admit | Female, A) > P(Admit | Male, A)
- P(Admit | Female, B) < P(Admit | Male, B)
In plain English: Aggregated data hides subgroup truths.
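If you want to see the flip mechanically, the law of total probability spells it out: each overall rate is just the per-department rates weighted by where that group actually applied.

$$P(\text{Admit}\mid\text{Female}) = 0.95\cdot\tfrac{20}{200} + 0.05\cdot\tfrac{180}{200} = 0.095 + 0.045 = 0.14$$

$$P(\text{Admit}\mid\text{Male}) = 0.90\cdot\tfrac{100}{200} + 0.10\cdot\tfrac{100}{200} = 0.45 + 0.05 = 0.50$$

The 95% that women earn in Department A carries a weight of only 20/200, while the 5% from Department B carries a weight of 180/200. The department mix, not the per-department performance, drives the aggregate.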
Why Engineers Should Care
“Cool read, but I’m building APIs—not admission systems. Why should I care?”
Because this exact paradox can sneak into every part of your pipeline.
- An A/B test shows a conversion lift? That lift could disappear when broken down by device or region.
- Your model performs better for Group A? Maybe only because Group B had fewer samples or noisier data.
- Your AI recommends treatments, loans, or jobs? It might optimize for misleading averages and miss group-level fairness issues entirely.
In other words, you might be shipping features optimized for the wrong metrics.
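One cheap defense is to never read a pooled metric without also reading it grouped by the variables you suspect could confound it. Here’s a minimal pandas sketch with made-up numbers (the variant, device, and converted columns are purely illustrative) showing how the pooled view and the segmented view of the same experiment can point in opposite directions:

```python
import pandas as pd

# Toy experiment log built from made-up aggregate counts, purely for illustration.
def rows(variant, device, converted_count, total):
    """Expand 'converted_count out of total' into one row per user."""
    return [
        {"variant": variant, "device": device, "converted": int(i < converted_count)}
        for i in range(total)
    ]

df = pd.DataFrame(
    rows("A", "mobile", 5, 50) + rows("A", "desktop", 100, 200)
    + rows("B", "mobile", 24, 200) + rows("B", "desktop", 30, 50)
)

# Headline view: pooled conversion rate per variant (A looks like the clear winner).
print(df.groupby("variant")["converted"].mean())

# Segmented view: the same metric within each device (B actually wins both segments).
print(df.groupby(["device", "variant"])["converted"].mean())
```

In this toy log, variant B wins inside every device segment but loses on the pooled number, simply because most of its traffic landed on the low-converting segment. The same groupby habit applies to model metrics sliced by cohort.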
Beyond Statistics—A Cognitive Trap
Step out of the statistician’s shoes for a moment, and Simpson’s Paradox becomes more than just a math trick. It’s a cognitive trap.
Humans, and often the machine learning models we build, tend to assume that if something is true for the parts, it must be true for the whole. Simpson’s Paradox tells a different story, sometimes the exact opposite one. It reminds us that we don’t just need more data; we need the right lens to interpret it.
Implications in 2025
We now live in a world where AI systems:
- Make hiring recommendations,
- Rank students for scholarships,
- Detect fraud and evaluate creditworthiness.
These systems are often trained on massive aggregated datasets, and that can be dangerous. If a model isn’t sensitive to hidden confounders, it can end up reinforcing discriminatory patterns, mistaking correlation for causation, and, most importantly, making statistically ‘right’ but ethically wrong decisions.
Let Simpson’s Paradox serve as a gentle nudge: when working with data, never settle for the surface view. Look deeper. Question the aggregates. Find the context.
Because sometimes, the truth is hiding in the split.