In this article, I will describe a relatively simple method for launching a cross-sell (products offered to a customer in addition to those in the cart) feature in e-commerce, such as groceries or food delivery services, which we successfully implemented in the "Mnogo Lososya" mobile app. This is a basic collaborative filtering recommender system that combines user-based and item-based approaches and can be used in a variety of e-commerce projects, particularly those with a large number of SKUs, to provide a wide range of recommendations.
Mnogo Lososya, founded in 2018, is a chain of 50+ ghost and 250+ takeaway kitchens as well as an umbrella brand for multiple dish concepts. Our unique selling point is the 30-minute delivery of freshly cooked meals. We have been rapidly expanding and have recently passed 100k app MAU with over 100M RUB in gross monthly revenue.
The majority of our orders are made online, with one-third coming from our own mobile app and the other two-thirds coming from delivery services. The app is an important component of our product because it is one of the first points of contact, which, along with our service and the food itself, contributes to a better customer experience.
This solution was built entirely on Yandex Cloud services, but it could equally be built on AWS, which offers all the necessary services. For the convenience of AWS users, I refer to services by their AWS names, which should be clear to many Yandex Cloud users as well.
The simplified architecture looks like this:
Users place orders through the mobile app. In the ERP system, orders are created and processed.
The orders are then copied to the data warehouse during an ETL process once per day at night. Each order includes information about the ordered products as well as the customer identifier.
SQL procedures calculate user preferences and product similarity. A more detailed description of the computation is provided below. The computation yields two collections in MongoDB with the following structure:
userPref collection
phone: The user identifier (we use the phone number)
relatedDishes
lastUpdateDate: Date and time of the last recalculation
Example of a document:
productSimilarity collection
dishId
relatedDishes
lastUpdateDate: Date and time of the last recalculation
Example of a document
We implemented user preferences based on weighted sales history, with recent sales taking priority. Consider the following arbitrary user's sales history:
| Product | Sales | When | Time coef (1/months) | Weighted sales |
|---|---|---|---|---|
| A | 1 | this month | 1 | 1 |
| B | 1 | this month | 1 | 1 |
| C | 4 | 1 month ago | 0.5 | 2 |
| A | 4 | 1 month ago | 0.5 | 2 |
| B | 3 | 4 months ago | 0.25 | 0.75 |
The user purchased both product B and product C four times. However, because the majority of product B sales took place four months ago, we prioritize more recent product C sales. The products are sorted by total weighted sales, which are the sum of weighted sales for each product.
| Product | Total weighted sales | Rank |
|---|---|---|
| A | 3 | 1 |
| B | 1.75 | 3 |
| C | 2 | 2 |
The above example implies that the user prefers product A over product C and product C over product B.
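The ranking in the example can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not the SQL procedure we run in the warehouse; the function name `rank_products` is made up for illustration, and the time coefficients are taken as given from the table rather than derived from dates.

```python
from collections import defaultdict

# Rows from the example table: (product, units sold, time coefficient).
sales_history = [
    ("A", 1, 1.0),
    ("B", 1, 1.0),
    ("C", 4, 0.5),
    ("A", 4, 0.5),
    ("B", 3, 0.25),
]


def rank_products(history):
    """Sum weighted sales per product and rank products, best first."""
    totals = defaultdict(float)
    for product, sales, time_coef in history:
        totals[product] += sales * time_coef
    # Highest total weighted sales gets rank 1.
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [
        (product, total, rank)
        for rank, (product, total) in enumerate(ordered, start=1)
    ]


user_ranking = rank_products(sales_history)
for product, total, rank in user_ranking:
    print(product, total, rank)
```

The output lists A first (3.0), then C (2.0), then B (1.75), matching the table.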
The number of orders in which pairs of products were present is used to calculate product similarity. The result is calculated separately for each month, with the most recent months taking priority. As a result, we rank similar products for each product and store them in MongoDB, where product ID is an index for the collection.
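As a rough illustration of this idea, the sketch below counts how often pairs of products appear in the same order and weights each order by a time coefficient. The orders are hypothetical, the name `similar_products` is invented for this example, and reusing the same time coefficient as for user preferences is an assumption; the production computation is done in SQL.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical orders: (time coefficient of the order's month, products in it).
orders = [
    (1.0, {"A", "B"}),   # this month
    (0.25, {"A", "C"}),  # older orders count less
    (0.25, {"A", "C"}),
    (0.25, {"A", "C"}),
    (0.5, {"B", "C"}),
]


def similar_products(orders):
    """Rank, for each product, the products it was most often ordered with."""
    scores = defaultdict(lambda: defaultdict(float))
    for time_coef, products in orders:
        # Every unordered pair in the order counts once, weighted by recency.
        for left, right in combinations(sorted(products), 2):
            scores[left][right] += time_coef
            scores[right][left] += time_coef
    ranked = {}
    for product, related in scores.items():
        # Highest weighted co-occurrence first; ties are broken by name.
        ordered = sorted(related.items(), key=lambda item: (-item[1], item[0]))
        ranked[product] = [name for name, _ in ordered]
    return ranked


similarity = similar_products(orders)
print(similarity["A"])
```

Here product A was ordered with C three times and with B only once, yet B ranks first for A because that single order is recent (weight 1.0 versus 3 × 0.25 = 0.75).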
The final recommendation list combines user preferences and similar products and sorts them by rank in ascending order. In other words, we simply merge all the related product lists and sort the result; for products that appear in more than one list, we use the average rank. Here is an example:
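A minimal Python sketch of this merge step, under a few assumptions that are not spelled out above: each input list is already ordered by rank, products already in the cart are excluded, and ties are broken alphabetically. The function name `recommend` and the sample lists are hypothetical.

```python
from collections import defaultdict


def recommend(ranked_lists, exclude=()):
    """Merge ranked lists; repeated products get the average of their ranks."""
    ranks = defaultdict(list)
    for ranked in ranked_lists:
        for rank, product in enumerate(ranked, start=1):
            if product not in exclude:
                ranks[product].append(rank)
    average = {
        product: sum(values) / len(values) for product, values in ranks.items()
    }
    # Ascending by average rank; ties are broken by product name (assumption).
    return sorted(average, key=lambda product: (average[product], product))


user_preferences = ["A", "C", "B"]      # ranked user preferences
similar_to_cart_item = ["C", "D", "B"]  # hypothetical list for a dish in the cart
recommendations = recommend(
    [user_preferences, similar_to_cart_item], exclude={"A"}
)
print(recommendations)
```

With product A in the cart, C gets an average rank of 1.5, D keeps rank 2, and B gets 3, so the sketch recommends C, D, B in that order.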
We chose the following metrics to measure the cross-sell efficiency:
AOV (average order value) of orders that included cross-sell dishes compared with the AOV of orders that did not. The order value is the total sum of all products in the order, that is, how much the customer pays for the order. This metric therefore indicates whether customers pay more for orders that include cross-sold dishes. It is the key metric because an increase in AOV is exactly what we expect from cross-selling.
Percentage of goods added from the cross-sell section in total goods sold. This is a secondary metric that is heavily influenced by the nature of the goods sold as well as the cross-sell strategy. Consider an electronics e-commerce store that cross-sells low-cost accessories such as cases and charging cables alongside more expensive items in the cart, such as smartphones and laptops. In this case many accessories can be cross-sold with one main item, and the metric can exceed 50%. Although our case does not involve such a wide variety of add-ons, this metric shows how cross-selling affects the final cart structure.
Percentage of orders containing cross-sell dishes. This is another secondary metric that displays the "popularity" of cross-sell, or how frequently customers purchase cross-sell recommended products.
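To make the three definitions concrete, here is a small Python sketch that computes them from a list of order records. The records and field names (`value`, `dishes`, `cross_sell_dishes`) are hypothetical and chosen only for illustration; they are not the columns of our dataset.

```python
from statistics import mean

# Hypothetical order records; field names are assumptions for this sketch.
sample_orders = [
    {"order_id": 1, "value": 1200.0, "dishes": 4, "cross_sell_dishes": 1},
    {"order_id": 2, "value": 800.0, "dishes": 2, "cross_sell_dishes": 0},
    {"order_id": 3, "value": 1500.0, "dishes": 5, "cross_sell_dishes": 2},
    {"order_id": 4, "value": 700.0, "dishes": 2, "cross_sell_dishes": 0},
]


def cross_sell_metrics(orders):
    """Compute the key AOV comparison and the two secondary shares."""
    with_cs = [o for o in orders if o["cross_sell_dishes"] > 0]
    without_cs = [o for o in orders if o["cross_sell_dishes"] == 0]
    total_dishes = sum(o["dishes"] for o in orders)
    cross_sell_dishes = sum(o["cross_sell_dishes"] for o in orders)
    return {
        # Key metric: AOV of orders with and without cross-sell dishes.
        "aov_with": mean(o["value"] for o in with_cs),
        "aov_without": mean(o["value"] for o in without_cs),
        # Share of dishes added from the cross-sell section.
        "dish_share": cross_sell_dishes / total_dishes,
        # Share of orders containing at least one cross-sell dish.
        "order_share": len(with_cs) / len(orders),
    }


metrics = cross_sell_metrics(sample_orders)
print(metrics)
```

Note that `mean` raises an error on an empty group, so a real script would need to handle the case where no orders contain cross-sell dishes.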
The dataset below contains anonymized order data collected from December 2022 to January 2023 in one of the cities where Mnogo Lososya operates.
https://github.com/alexchrn/cross-sell/blob/main/orders.csv
The dataset is compiled from a variety of sources, including AppMetrica (add-to-cart events) and the ERP system (order and payment statuses, discount and payment sums).
Dataset structure:
Here are the metric values, derived from the above dataset using Python.
The percentage of dishes added from the cross-sell section in total bought dishes – 3.97%
The percentage of orders containing cross-sell dishes – 10.46%
AOV of orders which included cross-sell dishes compared to AOV of orders which did not:
As can be seen, orders with cross-sell dishes have a higher AOV with a difference of 565 RUB. The average number of dishes in such orders is also higher, which is reasonable considering that the sole goal of cross-selling is to incentivize a customer to add more dishes to their cart.
Is the difference of 565 RUB significant? We can use a t-test to check whether this difference could be due to chance. The Python SciPy library has a function for this. It tests the null hypothesis that two independent samples have identical average (expected) values (1).
Thus, the p-value, the probability of observing a difference at least this large if the null hypothesis were true, is extremely low, and we reject the null hypothesis even at the 1% significance level (99% confidence). In other words, the noticeable difference in mean order value is very unlikely to be coincidental, and orders with cross-sell dishes generate more revenue.
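For readers who want to see what such a test computes, below is a standard-library sketch of Welch's variant of the two-sample t-test. It is not the script used for the numbers above: the order values are synthetic, the function name `welch_t_test` is made up, and the p-value uses a normal approximation that is only reasonable for large samples. In practice, SciPy's `scipy.stats.ttest_ind` (with `equal_var=False` for the Welch variant) does the same job with the exact t distribution.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t_test(sample_a, sample_b):
    """Welch's t statistic with a normal approximation to the two-sided p-value."""
    # Squared standard error of each sample mean.
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # For large samples the t distribution is close to the standard normal.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


# Synthetic order values in RUB, generated only to exercise the function.
rng = random.Random(0)
with_cross_sell = [rng.gauss(2000, 600) for _ in range(500)]
without_cross_sell = [rng.gauss(1400, 600) for _ in range(500)]

t_stat, p_value = welch_t_test(with_cross_sell, without_cross_sell)
print(f"t = {t_stat:.2f}, p = {p_value:.3g}")
```

A p-value below the chosen significance level (for example 0.01) leads to rejecting the null hypothesis of equal means.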
Cross-selling can be an effective tool for increasing average order value even with simple collaborative filtering techniques. It can also be implemented relatively easily from a technical standpoint, thanks to AWS and other cloud providers' serverless services, as shown in this article.
Related materials: