When we have multiple customer acquisition channels and the marketing budget grows rapidly, we want to know as soon as possible which of the new channels are worth the investment. Lifetime Value (LTV) modeling may be a useful tool for such a task.
In this article, I would like to share a case from my practice of an LTV forecast for retail clients in the financial industry that changed a company’s approach to marketing budget allocation (for customer acquisition).
Imagine that you work for a rapidly growing company that acquires new customers through different channels. Of course, many of those channels don't generate new clients for free: you have to pay for advertising, whether it is contextual or display advertising, cross-links with partner websites, mentions in blogs, social network posts, etc.
When our marketing budget is way below our profits, we may not even care about acquisition costs. But when our business is in the growth phase, getting clients is costly and it is a matter of life or death to extract as much as possible from every dollar invested in customer acquisition.
You may say: "OK, if we spend a lot, we should compare the so-called CAC (customer acquisition cost) to the average profit generated by a client coming from that specific channel."
And you are absolutely right:
All the mess happens when it comes to calculating the average profit generated by clients coming from specific channels:
These issues arose in a financial company where I worked, and I would like to share how I approached solving them by creating a system that could evaluate LTV vs CAC for new clients within just a few days after a new acquisition channel launch.
Technically speaking, we wanted to create a model that could forecast customer lifetime value for new clients coming from acquisition channels (or subchannels such as specific marketing campaigns). As a rule of thumb, we look for situations where:
LTV > 3 x CAC
What is CAC? The calculation of this metric is straightforward:
CAC = Channel cost / Number of clients from a channel
What is LTV?
LTV = Sum of PV(Revenue) over all clients from a channel / Number of clients from a channel
Revenue represents the amount of money a client generates over a lifetime. The discussion on how to calculate Revenue (or whether it should be profit with all the cost allocations) is beyond the scope of this story. The specific methodology mainly depends on the business model under consideration and may even deserve an extra article.
PV stands for present value, which is applied to discount the Revenue at a certain rate (remember the time value of money: client acquisition costs are paid today, but clients generate income over some time frame). You can easily google how to calculate PV based on future cash flows.
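The CAC, PV, and LTV definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: monthly revenue figures received at the end of each month, a 10% annual discount rate, and made-up numbers for a hypothetical channel. The function names are mine, not part of the original system.

```python
def present_value(cash_flows, annual_rate, periods_per_year=12):
    """Discount future cash flows, one per period, each received at period end."""
    # Convert the annual rate to an equivalent per-period rate.
    period_rate = (1 + annual_rate) ** (1 / periods_per_year) - 1
    return sum(cf / (1 + period_rate) ** t
               for t, cf in enumerate(cash_flows, start=1))


def cac(channel_cost, n_clients):
    """CAC = Channel cost / Number of clients from a channel."""
    return channel_cost / n_clients


def ltv(revenue_by_client, annual_rate):
    """Average discounted revenue per client for one channel."""
    total_pv = sum(present_value(flows, annual_rate)
                   for flows in revenue_by_client)
    return total_pv / len(revenue_by_client)


# Hypothetical channel: 3 clients, six monthly revenue figures each.
revenue_by_client = [
    [10, 10, 10, 10, 10, 10],
    [0, 5, 5, 20, 20, 20],
    [30, 0, 0, 0, 0, 0],
]

channel_ltv = ltv(revenue_by_client, annual_rate=0.10)  # assumed discount rate
channel_cac = cac(channel_cost=45.0, n_clients=3)       # assumed channel spend
passes_rule = channel_ltv > 3 * channel_cac             # the LTV > 3 x CAC check
```

Over a six-month horizon the discounting changes the result only slightly, but it matters more as the horizon or the rate grows.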
We always have all the necessary data to calculate CAC. This is not the case with LTV. To evaluate this indicator we need to forecast future Revenue for each client:
Revenue forecast = f(X_i)
Where X_i stands for all the parameters that we have for client i, namely:
Obviously, we have historical data for (Revenue_i, X_i) for all our existing clients, which gives us the opportunity to build machine learning models predicting Revenue_i based on X_i. We need to define some time frame for Revenue forecasting. In our case, it is six months. This parameter can vary significantly for other cases and is subject to discussion.
NOTE: We also need to define a period after a client's registration for which we build our parameter array and make the first forecast. The shorter this period is, the faster we can evaluate the average LTV for a channel to compare it with CAC. But on the other hand, the shorter this period is, the less information there is about a client for training our machine learning model and the less predictive power we have.
In our case, three days was a pretty good trade-off (let's denote it by T = 3). We should also remember that we always need to collect a statistically significant number of clients to calculate an average LTV.
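To make the two time frames concrete, here is a minimal sketch of how one training example could be built for a single client: features come only from the first T days after registration, and the target is the revenue over the full forecast horizon. The event format and the two features are hypothetical placeholders, and the 182-day horizon is my approximation of six months.

```python
from datetime import datetime, timedelta

T_DAYS = 3          # observation window after registration (T = 3)
HORIZON_DAYS = 182  # roughly six months; the exact day count is an assumption


def build_example(registered_at, events, t_days=T_DAYS, horizon_days=HORIZON_DAYS):
    """Split one client's events into early-window features and a revenue target.

    events: list of (timestamp, revenue) tuples for a single client.
    """
    window_end = registered_at + timedelta(days=t_days)
    horizon_end = registered_at + timedelta(days=horizon_days)

    # Only what happened in the first T days may be used as model input.
    early = [rev for ts, rev in events if registered_at <= ts < window_end]
    # The target covers the whole horizon, including the early window.
    in_horizon = [rev for ts, rev in events if registered_at <= ts < horizon_end]

    features = {
        "n_events_early": len(early),   # hypothetical feature
        "revenue_early": sum(early),    # hypothetical feature
    }
    return features, sum(in_horizon)


registered = datetime(2023, 1, 1)
events = [
    (datetime(2023, 1, 1, 12), 5.0),
    (datetime(2023, 1, 3), 2.0),
    (datetime(2023, 1, 10), 10.0),
    (datetime(2023, 8, 1), 100.0),  # outside the horizon, ignored
]
features, target = build_example(registered, events)
```

Keeping the feature window strictly inside the first T days is what prevents information from the future leaking into the model inputs.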
In practice, the task was solved a bit differently than explained here, but that does not change the overall approach: we always have some clients who are “older” than the defined period of T days, so predictions for them may be better simply because we have more information about their behavior.
So, having historical data (Revenue_i, X_i) for thousands of our existing and former clients, we can train a machine learning algorithm like gradient boosting (or a neural network if it fits the data better) to do the forecasting job. Model selection and tuning have nowadays become a pretty simple routine. You can just google xgboost or catboost examples and find end-to-end code on the first search results page.
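Whichever model is chosen, it helps to check it on clients it has not seen during training. Below is a minimal, model-agnostic sketch of a time-based backtest: train on clients registered before a cutoff date and measure the error on those registered after it. The row format, the cutoff, and the mean-revenue baseline are assumptions for illustration; a real gradient boosting model would be plugged in through the `fit` argument.

```python
from datetime import date
from statistics import mean


def time_split(rows, cutoff):
    """Train on clients registered before the cutoff, test on the rest."""
    train = [r for r in rows if r["registered"] < cutoff]
    test = [r for r in rows if r["registered"] >= cutoff]
    return train, test


def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def backtest(rows, cutoff, fit):
    """fit(train_rows) must return a function mapping a feature dict to a forecast."""
    train, test = time_split(rows, cutoff)
    predict = fit(train)
    actual = [r["revenue"] for r in test]
    predicted = [predict(r["features"]) for r in test]
    return mean_absolute_error(actual, predicted)


def fit_mean_baseline(train):
    """Placeholder model: always predicts the average training revenue."""
    avg = mean(r["revenue"] for r in train)
    return lambda features: avg


# Hypothetical history: revenue is the realized six-month figure per client.
rows = [
    {"registered": date(2022, 1, 10), "features": {}, "revenue": 40.0},
    {"registered": date(2022, 2, 5), "features": {}, "revenue": 60.0},
    {"registered": date(2022, 7, 1), "features": {}, "revenue": 45.0},
    {"registered": date(2022, 8, 1), "features": {}, "revenue": 65.0},
]
baseline_mae = backtest(rows, cutoff=date(2022, 6, 1), fit=fit_mean_baseline)
```

A trained model is only worth deploying if its out-of-sample error is clearly below that of a trivial baseline like this one.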
Now we have a model that makes predictions (don't forget to carry out backtesting and evaluate its performance for out-of-sample data). You have several options for deployment:
In our case, option #2 (running a model on a local machine) was chosen.
Having a Revenue forecasting model, we can easily calculate the expected LTV for each client = PV(Revenue forecast) for a 6-month time frame. Once we have enough clients from a channel (or a marketing campaign), we can compare the average LTV to CAC and decide whether to keep that channel (or campaign) or not. In our case, we managed to reduce the evaluation of a channel to 3 days on average (instead of waiting 6 months to understand each client’s real value).
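The final comparison can be sketched as follows, assuming we already have a per-client LTV forecast and the total spend per channel. The channel names, the numbers, and the minimum client count are hypothetical; a fixed count is only a crude stand-in for a proper statistical significance check.

```python
from collections import defaultdict
from statistics import mean

MIN_CLIENTS = 3         # hypothetical threshold, far too small for real use
LTV_TO_CAC_RATIO = 3.0  # the LTV > 3 x CAC rule of thumb


def evaluate_channels(predicted_ltv, channel_cost,
                      min_clients=MIN_CLIENTS, ratio=LTV_TO_CAC_RATIO):
    """predicted_ltv: list of (channel, per-client LTV forecast) pairs.
    channel_cost: dict mapping channel name to total spend."""
    by_channel = defaultdict(list)
    for channel, value in predicted_ltv:
        by_channel[channel].append(value)

    decisions = {}
    for channel, values in by_channel.items():
        if len(values) < min_clients:
            decisions[channel] = "wait"  # not enough clients to judge yet
            continue
        avg_ltv = mean(values)
        cac = channel_cost[channel] / len(values)
        decisions[channel] = "keep" if avg_ltv > ratio * cac else "drop"
    return decisions


predicted = [
    ("search", 90.0), ("search", 120.0), ("search", 60.0),
    ("blog", 20.0), ("blog", 25.0), ("blog", 15.0),
    ("partner", 200.0),
]
costs = {"search": 60.0, "blog": 45.0, "partner": 30.0}
decisions = evaluate_channels(predicted, costs)
```

The explicit "wait" outcome keeps a channel with one or two lucky clients from being judged too early.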
P.S. This may seem like a solved task, but in fact, it is just the beginning of a journey. Next, we need to start monitoring our model’s performance, re-train it on a regular basis, and at some point maybe even question the entire approach.