## The Fight Over Ad Spend

Marketing budgets are often treated like mysterious black boxes: money goes in, results (sometimes) come out, and the CFO raises an eyebrow either way. For years, we've tried to peek inside that box with attribution models, click tracking, and dashboards promising to connect every dollar to every user. The truth? Those tools were fragile even before privacy rules tightened and cookies crumbled. Today, they're almost useless. That's where three letters step in: MMM, or Marketing Mix Modeling. It's not new; in fact, big brands were using it decades ago. But in 2025, it feels like MMM is having a second life. It doesn't need user-level tracking, it works with the data you already have, and when done right, it gives you a clearer picture of how each channel pulls its weight.

## What Is Marketing Mix Modeling?

Media Mix Modelling is a statistical analysis method used in marketing to determine the optimal allocation of resources across advertising or media channels in order to maximise the effectiveness of a campaign. The goal is to understand the impact of each channel on overall campaign effectiveness and to allocate the budget efficiently based on each channel's contribution. Join me to discover how to optimise a marketing budget by implementing Robyn MMM.

## Adstock and Saturation

In marketing, there are two main effects: adstock and saturation. Let's briefly discuss each of them.

### Adstock

Not all marketing actions have an immediate effect. A user may install an application or register on an online store, but it may take some time before that user becomes a customer. This delayed, carry-over effect is called adstock.

### Saturation, or Diminishing Returns

This effect states that each additional dollar invested brings less return than the previous one. In practice, it can also work the other way: the next dollar spent may, on the contrary, bring a greater effect.
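To make the two effects concrete, here is a minimal sketch of how they are usually expressed as transformations of daily spend. The function names and parameter values are mine for illustration; Robyn's own implementations (geometric or Weibull adstock, Hill saturation) are more elaborate.

```python
import numpy as np

def geometric_adstock(spend, theta=0.5):
    """Carry a fraction `theta` of each day's effect into the next day."""
    out = np.zeros_like(spend, dtype=float)
    carry = 0.0
    for i, s in enumerate(spend):
        carry = s + theta * carry
        out[i] = carry
    return out

def hill_saturation(x, alpha=2.0, gamma=100.0):
    """Hill curve: response grows quickly at first, then flattens out."""
    return x**alpha / (x**alpha + gamma**alpha)

spend = np.array([100.0, 0.0, 0.0, 50.0])
adstocked = geometric_adstock(spend, theta=0.5)
# adstocked -> [100.0, 50.0, 25.0, 62.5]: spend keeps working after the day it occurs
response = hill_saturation(adstocked)
```

Chaining the two (adstock first, then saturation) is the standard order: spend is first spread over time, and only then mapped to a diminishing response.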
For example, if your maximum bid for a user in an ad auction is $30 and your competitor's is $31, one more dollar on your bid may win you users from your competitor.

## Before Modelling

### Data Collection

What we want to collect:

- Current marketing channel expenditures. If available down to the campaign level, even better.
- The dependent variable. For example: new users, purchases, or purchases from new users only. In general, the most important KPI in the business.
- Other metrics that affect the business but do not depend on you: exchange rates, weather, inflation.

How we want to collect it: at a daily level, for at least the past year. If data is only available weekly, it can be collected weekly, but then for three years. Here's an example of what the data looks like before modelling:

## Modelling

Now let's dive into modelling using specific libraries, starting with Robyn. I will refer to examples from the demo library.

### How It Works

Modelling consists of several stages:

1. Data preparation
2. Selection of marketing effect parameters (adstock and saturation)
3. Selection of modelling parameters
4. Analysis and selection of the final model

This article covers points 1, 2, and 3, as the 4th heavily depends on the company you work for, communication with the marketing team, and other factors.

### Data Preparation

Remember I mentioned that you might need conversion data for your channels? Let's try to incorporate that data. Further on, I would like to share some tips on using Robyn MMM.
I have encountered these questions myself. Of course, they do not apply to every case, as they require additional data for the analysis. We'll discuss how you can use your last-touch attribution data to better understand how to customise your Robyn MMM model.

### Channel Breakdown

Usually, channel data is divided into two levels: channels and campaigns. For the next analysis, we will need campaign and conversion data. If you have last-touch attribution data available, you can use it to plot saturation and adstock graphs by campaign. The rule here is simple: try to identify channels with similar adstock and saturation behaviour. Also remember that last-touch attribution is not the most reliable attribution method, and you may have channels and campaigns that are not tracked as well as you would expect. Looking at the curves, we can conclude that Channel 0 and Channel 1 can most likely be merged.

## More About Robyn's Hyperparameters

### Tuning Adstock Parameters

After merging channels, we must tune the adstock and saturation parameters that we will feed into the Robyn model.
For this purpose, I used this small piece of code:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import quad
from scipy.optimize import leastsq

## Weibull probability density function
def weib(x, l, k):
    return (k / l) * (x / l)**(k - 1) * np.exp(-(x / l)**k)

def residuals(p, x, y):
    ## penalise parameter sets whose density does not integrate to 1
    integral = quad(weib, 0, 16 * 4096, args=(p[0], p[1]))[0]
    penalization = abs(1. - integral) * 100
    return y - weib(x, p[0], p[1]) + penalization

number_of_days_of_adstock = 14
media_source = 'media_source_name'

## df is a user-level table where one row represents one user
## who is attributed to a media_source_mmm channel and registered
## days_since_add_seen days after seeing an ad from that channel
data = (df[(df['media_source_mmm'] == media_source) &
           (df['days_since_add_seen'] < number_of_days_of_adstock)]
        ['days_since_add_seen'])

x = np.linspace(data.min(), data.max(), number_of_days_of_adstock)
n, bins, patches = plt.hist(data, bins=x, density=False)
n = (n / n.max())[1:]
bins = bins[1:-1]

## finding the parameters that best fit the data
popt, pcov = leastsq(func=residuals, x0=(0.75, 0.8), args=(bins, n), factor=1)

plt.clf()
plt.bar(bins, n)
wb = weib(x, *popt)
wb[wb == np.inf] = 1
plt.plot(x, wb,
         label=f'{media_source}' + ' weib leastsq, scale=%1.3f, shape=%1.3f' % tuple(popt),
         lw=4., color='orange')
plt.legend(loc='upper right')
plt.show()
```

After running this script, all we need to do is plug the obtained scale and shape into the Robyn hyperparameters. However, it's important to remember that last-touch attribution data isn't always what you want to base your modelling on. When choosing parameters, consider the difference in tracking quality across channels: make the scale and shape ranges wider for channels that are tracked poorly, and narrower for those that are tracked well and that you trust more.

### Saturation Parameters Selection

This is perhaps one of the most challenging topics in all of MMM. The problem is that real data tends to look something like this:

In theory, the simplest approach is to replicate the code above, substituting the required saturation function for `weib`. However, we need to take into account that our data already has some history, and we may also want to exclude data that we consider outdated. It is important to strike a balance between the "freshness" of the data and the representativeness of the sample.
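Following that substitution idea, here is a hedged sketch of fitting a Hill saturation curve with the same `leastsq` approach. The `spend` and `conversions` arrays stand in for a hypothetical per-channel daily aggregate; the numbers are illustrative, not from real data.

```python
import numpy as np
from scipy.optimize import leastsq

def hill(x, alpha, gamma):
    ## Hill saturation curve on normalised response
    return x**alpha / (x**alpha + gamma**alpha)

def residuals(p, x, y):
    return y - hill(x, p[0], p[1])

## hypothetical aggregated data: daily spend and conversions for one channel
spend = np.array([10, 50, 100, 200, 400, 800], dtype=float)
conversions = np.array([30, 110, 160, 205, 235, 250], dtype=float)

y = conversions / conversions.max()   # normalise response to [0, 1]
popt, _ = leastsq(residuals, x0=(1.0, 100.0), args=(spend, y))
alpha, gamma = popt                   # gamma is roughly the half-saturation spend level
```

As with the Weibull fit, the fitted `alpha` and `gamma` are point estimates; for Robyn they should be turned into ranges rather than fixed values.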
With a small set of very fresh data, we may obtain a biased estimate of the saturation function, while with a large volume of historical data, we may shift it towards a direction that is no longer relevant. An alternative approach is to fit two curves, an upper and a lower estimate of the transformation function, so that the data lies between them. This is how it will look:

## Model Selection Parameters

The main modelling parameters in Robyn are as follows:

- `iterations`
- `trials`
- `ts_validation`
- `add_penalty_factor`
- `intercept`
- `intercept_sign`
- `rssd_zero_penalty`

Unfortunately, there is no way to set the first two parameters without experimenting with your data; the only way to select them properly is empirically. The rest we can discuss in more detail.

During the modelling stage, `ts_validation = True` will always be needed. If you want to validate the results of your experiment on a holdout set, this is the parameter responsible for that.

The `add_penalty_factor` parameter controls regularisation, helping you avoid overfitting. I always try to use it but, again, cannot provide direct recommendations without experimentation.

`intercept` reflects the constant term in the model, essentially representing the purely organic part of your business. That is, if you remove market factors and all your advertising expenses, you will still receive "intercept" orders. Set `intercept = True`.

`intercept_sign` determines the sign of the intercept. Usually it is "positive."

`rssd_zero_penalty` penalises the model if it doesn't attribute any effect to a channel. If you believe such channels exist in your MMM, why not simply remove them from modelling and set `rssd_zero_penalty = True`?
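Earlier I suggested widening the scale and shape ranges for poorly tracked channels. As a final illustration, here is a sketch of that idea in Python. Robyn itself is an R package and its hyperparameter names follow a per-channel convention (check the demo for your version); the channel names, trust scores, and numbers below are all hypothetical.

```python
## Fitted point estimates from the leastsq step (illustrative numbers)
fitted = {
    "channel_well_tracked":   {"scale": 0.08, "shape": 1.9},
    "channel_poorly_tracked": {"scale": 0.05, "shape": 1.2},
}

## How much we trust each channel's last-touch data (hypothetical scores)
trust = {"channel_well_tracked": 0.9, "channel_poorly_tracked": 0.3}

def to_range(value, trust_score):
    """Widen the search range around a fitted value; less trust -> wider range."""
    spread = (1.0 - trust_score) + 0.1   # trust 0.9 -> +/-20%, trust 0.3 -> +/-80%
    return [max(value * (1 - spread), 0.0), value * (1 + spread)]

hyperparameters = {}
for channel, params in fitted.items():
    hyperparameters[f"{channel}_scales"] = to_range(params["scale"], trust[channel])
    hyperparameters[f"{channel}_shapes"] = to_range(params["shape"], trust[channel])
```

The point is not the exact widening rule but the principle: point estimates from last-touch data become search ranges, and the range width encodes how much you trust the data behind each channel.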
These are just a few examples of how you can fine-tune your analysis. I hope the above tips were helpful; feel free to share your experience of how you optimise your budgets with MMM.