Recommendation systems have become an integral and indispensable part of our lives. These intelligent algorithms are pivotal in shaping our online experiences, influencing the content we consume, the products we buy, and the services we explore. Whether we are streaming content on platforms like Netflix, discovering new music on Spotify, or shopping online, recommendation systems are quietly working behind the scenes to personalize and enhance our interactions.
The unique element of these recommendation systems is their ability to understand and predict our preferences based on historical behavior and user patterns. By analyzing our past choices, these systems curate tailored suggestions, saving us time and effort while introducing us to content/products that align with our interests. This enhances user satisfaction and fosters discovery, introducing us to new and relevant offerings that we might not have encountered otherwise.
At a high level, developers understand that these algorithms are powered by machine learning and deep learning systems (the latter built on neural networks), but what if I told you there is a way to build a recommendation engine without going through the pain of deploying your own neural net or machine learning model?
This question is especially relevant for early and mid-stage startups, because they don't have tons of structured data to train their models. And as we already know, most machine learning models will not give accurate predictions without proper training data.
I recently built and deployed a basic recommendation engine for a
I had an extensive
At a high level, we had the following requirements from an engineering perspective:
The system should be able to capture a user's interests in the form of keywords. The system should also be able to classify the level of interest a user has with specific keywords.
The system should be able to capture a user's interest in other users. It should be able to classify the level of interest a user has in content created by another user.
The system should be able to generate high-quality recommendations based on a user's interests.
The system should ensure that recommendations the user has already viewed or rejected don't reappear for X days.
The system should have logic to ensure that the posts from the same creators aren't grouped on the same page. The system should try its best to ensure that if a user consumes ten posts (our page size), all of those should be from different creators.
The system should be fast: less than 150 milliseconds of P99 latency.
All the other non-functional requirements, such as high availability, scalability, security, reliability, and maintainability, should be fulfilled.
Again, this is a highly oversimplified list of problem statements. In reality, the documents were 3000+ words long as they also covered a lot of edge cases and corner cases that can arise while integrating this recommendation engine into our existing systems. Let's move on to the solution.
I will discuss the solution to each problem one by one and then describe how the entire system works.
For this, we created something called a
As you can see from the above image, we are storing a lot of information, such as the number of interactions (likes, comments, shares) and the recency of these interactions (when they last happened), as relationship data between two users as well as between a user and an interest. We are even storing the relationship between two different interest keywords. I used
These interest keywords are predominantly nouns. There is a system in place that breaks down the contents of a post into these keywords (nouns). It's powered by Amazon Comprehend, a natural-language processing (NLP) service that uses machine learning to break text into entities, key phrases, etc. Again, you can use any managed NLP service (several are available) to accomplish the same. You don't need to learn or deploy your own machine-learning models! If you already understand machine learning, then you can go check
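As a rough illustration of the keyword step, here is a minimal sketch that turns a post into a de-duplicated keyword list using the key-phrase detection call of Amazon Comprehend. The function name `extract_keywords`, the `min_score` threshold, and the choice of key phrases over entities are assumptions made for this example, not details of the production system. The client is passed in so the function can be tried with a stub.

```python
from typing import Any, List


def extract_keywords(comprehend_client: Any, text: str, min_score: float = 0.8) -> List[str]:
    """Return de-duplicated, lower-cased key phrases found in a post.

    `comprehend_client` is expected to behave like boto3.client("comprehend").
    `min_score` is an illustrative confidence cut-off, not a recommended value.
    """
    response = comprehend_client.detect_key_phrases(Text=text, LanguageCode="en")
    seen = set()
    keywords = []
    for phrase in response["KeyPhrases"]:
        keyword = phrase["Text"].strip().lower()
        # Keep only confident phrases, and keep each phrase once.
        if phrase["Score"] >= min_score and keyword not in seen:
            seen.add(keyword)
            keywords.append(keyword)
    return keywords
```

With real credentials, this could be called as `extract_keywords(boto3.client("comprehend"), post_text)`. Each returned keyword would then become (or update) an interest node in the graph.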
The following diagram is a simplified high-level representation of how the system works.
While the above looks easy, there is a lot more going on at each step, and those things have to be carefully thought through and then programmed to ensure that the system is performing optimally.
Let me explain step by step:
To generate these recommendations, first, we have to convert the contents of a post into something called -
To generate the vector embeddings, you can use any prominent embedding model, such as the OpenAI embedding model, Amazon Titan, or any open-source text embedding model, depending on your use case. We went with Amazon Titan because of its friendly pricing, performance, and operational ease.
Now, this is where things get interesting. You will want to design the queries around your specific business needs. For example, when querying interests, we give more weight to the recency of engagement than to the number of engagements with a specific keyword or user. We also run multiple queries in parallel to find the different types of interest a user has: keywords and other users. Since we generate multiple feeds for a single user, we also run some queries that promote a specific trending topic (for example, you will see many Christmas-related posts near Christmas, or earthquake-related posts after an earthquake). Needless to say, such a topic only comes up in the query results if the user has expressed some interest in it during their journey.
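For reference, a minimal sketch of requesting an embedding from a Titan text embedding model through the Amazon Bedrock runtime could look like the following. The model ID `amazon.titan-embed-text-v1` and the helper name `embed_text` are assumptions for this example; the article does not say which Titan model or wrapper was used.

```python
import json
from typing import Any, List

# Assumed model ID for illustration; swap in whichever embedding model you use.
TITAN_MODEL_ID = "amazon.titan-embed-text-v1"


def embed_text(bedrock_runtime: Any, text: str) -> List[float]:
    """Return the embedding vector for `text`.

    `bedrock_runtime` is expected to behave like boto3.client("bedrock-runtime").
    """
    response = bedrock_runtime.invoke_model(
        modelId=TITAN_MODEL_ID,
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": text}),
    )
    # The response body is a stream containing a JSON document.
    payload = json.loads(response["body"].read())
    return payload["embedding"]
```

The returned list of floats is what gets stored in the vector database alongside the post ID.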
So, choose the logic that suits your business use case and the behavior that you want to drive and run multiple queries to get a big enough list of all the user's interests.
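To make the idea of weighting recency above volume concrete, here is a small sketch of one possible scoring rule. The half-life decay, the saturation constant, the 0.7 weight, and all names (`InterestEdge`, `interest_score`, `top_interests`) are assumptions chosen for illustration; the actual formula should come from your own business logic.

```python
from dataclasses import dataclass
from typing import List

SECONDS_PER_DAY = 86400.0


@dataclass
class InterestEdge:
    target: str                 # keyword or user ID the edge points to
    interactions: int           # total likes, comments, shares, etc.
    last_interaction_ts: float  # Unix timestamp of the latest interaction


def interest_score(edge: InterestEdge, now: float,
                   half_life_days: float = 7.0, recency_weight: float = 0.7) -> float:
    """Blend recency and volume into a 0..1 score, favouring recency."""
    age_days = max(0.0, (now - edge.last_interaction_ts) / SECONDS_PER_DAY)
    recency = 0.5 ** (age_days / half_life_days)           # 1.0 now, 0.5 after one half-life
    volume = edge.interactions / (edge.interactions + 10)  # saturates towards 1.0
    return recency_weight * recency + (1 - recency_weight) * volume


def top_interests(edges: List[InterestEdge], now: float, limit: int = 20) -> List[str]:
    """Return the targets of the strongest edges, strongest first."""
    ranked = sorted(edges, key=lambda e: interest_score(e, now), reverse=True)
    return [e.target for e in ranked[:limit]]
```

Under these assumed weights, a keyword the user touched three times today outranks one they touched a hundred times a month ago, which is the behavior described above.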
Vector databases are predominantly used for performing a particular type of search called
Cache database because one of the problems that we need to solve is speed. We used Redis sorted sets to store the unique IDs of the posts for a specific user, because the order of posts in a user's feed is critical. Another problem that you have to solve is that "the system should have logic to ensure that the posts from the same creators aren't grouped on the same page". To avoid repetition of content from the same creator, we wrote a simple algorithm: if a specific creator's post is inserted at any position in a particular user's feed (sorted set), we don't insert another post from the same creator for the next ten positions (we serve the feed with a page size of 10, so we kept this value static to avoid complexity).
To decide the order of the recommendations in a user's feed, we factored in the following:
The strength of the relationship with a specific interest (or another user) for this user: It's determined by an arithmetic formula that takes various data points from the social graph. All of this is engagement data, such as the timestamp of the last like, the number of likes, the last comment, etc. A user's engagement behavior is the indicator of their interest in something.
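The creator-spacing rule can be sketched as a greedy pass over the ranked candidates. This is an illustrative version, not the production algorithm: the function name `space_out_creators`, the tuple layout, and the decision to hold back posts that cannot be placed are all assumptions.

```python
from typing import Dict, List, Tuple


def space_out_creators(candidates: List[Tuple[str, str]],
                       gap: int = 10) -> Tuple[List[str], List[str]]:
    """Order ranked (post_id, creator_id) pairs so that a creator never
    reappears within the next `gap` positions.

    Returns (feed, leftover): `leftover` holds posts that could not be placed
    without breaking the rule and can be retried on the next refresh.
    """
    remaining = list(candidates)
    feed: List[str] = []
    last_position: Dict[str, int] = {}  # creator_id -> index of their latest post

    while remaining:
        position = len(feed)
        for i, (post_id, creator_id) in enumerate(remaining):
            last = last_position.get(creator_id)
            if last is None or position - last > gap:
                # Highest-ranked eligible post takes this slot.
                feed.append(post_id)
                last_position[creator_id] = position
                del remaining[i]
                break
        else:
            break  # no creator is eligible for this slot

    return feed, [post_id for post_id, _ in remaining]
```

Each post's index in `feed` could then be used as its score when writing to the user's sorted set, so that reading the set in score order reproduces the spaced-out feed.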
The popularity of the post on the platform: To determine this, we have created an algorithm that takes various factors such as engagement, engagement-to-impression ratios, number of unique users who engaged, etc., to generate an engagement score of that post at a platform level.
In some feeds, we prioritize popularity; in others, we prioritize the social graph. But mostly, all of them are a healthy mix of the two.
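The mix between the two signals can be pictured as a weighted blend with one tunable knob per feed. The sketch below is only an illustration: the `Candidate` fields, the equal split inside `popularity_score`, the saturation constant of 50, and the `social_weight` parameter are assumptions, not the formulas used in production.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    post_id: str
    relationship_strength: float  # 0..1, derived from the social graph
    engagements: int
    impressions: int
    unique_engagers: int


def popularity_score(c: Candidate) -> float:
    """Platform-level popularity in 0..1 from engagement rate and reach."""
    rate = min(c.engagements / c.impressions, 1.0) if c.impressions else 0.0
    reach = c.unique_engagers / (c.unique_engagers + 50.0)  # saturates towards 1.0
    return 0.5 * rate + 0.5 * reach


def final_score(c: Candidate, social_weight: float) -> float:
    """Blend the social-graph signal with popularity; social_weight is in 0..1."""
    return social_weight * c.relationship_strength + (1 - social_weight) * popularity_score(c)


def rank(candidates: List[Candidate], social_weight: float = 0.5) -> List[str]:
    """Return post IDs ordered by blended score, best first."""
    ordered = sorted(candidates, key=lambda c: final_score(c, social_weight), reverse=True)
    return [c.post_id for c in ordered]
```

A feed that prioritizes the social graph would use a high `social_weight` (say 0.8), while a trending-style feed would use a low one.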
As you can see from the diagram above, the system has been intentionally kept very simple. Here is how it works:
When user A creates a post, the post service, after saving that post, triggers a pub/sub event to a queue, which is received by a background service meant for candidate generation. We use
This background service receives the event asynchronously and performs the steps discussed earlier (privacy checks, moderation checks, and keyword generation), then generates the vector embeddings and stores them in the vector database. We are using
Whenever a user engages with a post (like, comment, share, etc.), the post service updates our main NoSQL database and then triggers a pub/sub event to the recommendation engine service.
This recommendation engine service updates the graph database and then updates the user's recommended feed in near real-time by performing the ANN search and updating the Redis database. So, the more users interact, the better the feed keeps getting. There are checks to ensure that the recommendations are not biased towards a specific list of keywords. Those checks are performed while we query the graph database. This service also updates the engagement score asynchronously. Engagement scores are also recalculated when users view a post.
Since all of the above steps are performed asynchronously behind the scenes, these computations have no impact on the end-user experience.
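The engagement flow can be sketched with in-memory stand-ins for the queue and the graph database. Everything here is hypothetical scaffolding for illustration: the event fields, `InMemoryGraph`, `handle_engagement`, and `drain` are invented names, and a real deployment would use a managed queue and a graph database client instead.

```python
import queue
from collections import defaultdict
from typing import Any, Callable, Dict


class InMemoryGraph:
    """Stand-in for the graph database: one edge per (user, target) pair."""

    def __init__(self) -> None:
        self.edges: Dict[Any, Dict[str, float]] = defaultdict(
            lambda: {"interactions": 0, "last_ts": 0.0}
        )

    def record(self, user_id: str, target: Any, ts: float) -> None:
        edge = self.edges[(user_id, target)]
        edge["interactions"] += 1
        edge["last_ts"] = max(edge["last_ts"], ts)


def handle_engagement(event: Dict[str, Any], graph: InMemoryGraph,
                      rebuild_feed: Callable[[str], None]) -> None:
    """Update the graph for one engagement, then refresh that user's feed."""
    user_id = event["user_id"]
    for keyword in event["keywords"]:
        graph.record(user_id, ("keyword", keyword), event["ts"])
    graph.record(user_id, ("user", event["creator_id"]), event["ts"])
    rebuild_feed(user_id)  # e.g. query interests, run the ANN search, write to the cache


def drain(events: "queue.Queue", graph: InMemoryGraph,
          rebuild_feed: Callable[[str], None]) -> None:
    """Process every queued event; a real worker would block and loop forever."""
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return
        handle_engagement(event, graph, rebuild_feed)
```

Because the request path only publishes the event, all of this work happens off the user's critical path.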
The feed is finally served to the end user through a feed service. Since this service just performs a lookup on Redis and our main NoSQL database (
Some services have been written in
We are using
We use
As you can imagine, this same setup can be tweaked to build a basic recommendation engine for any use case. But, since ours is a social network, we will require some tweaks down the line to make this system more efficient.
Machine learning and deep learning algorithms will be needed at the social graph level to predict the keywords and users most relevant to a given user. Currently, the data set is too small to predict anything accurately, as it is a very new product. However, as the data grows, we will need to replace the current simple queries and formulas with the output of machine learning algorithms.
Relationships between various keywords and users must be fine-tuned and made more granular. They are at a very high level right now and will need to go deeper. To refine the recommendations, we will first need to explore the second- and third-degree relationships in our graph.
We are not doing any fine-tuning of our embedding models right now. We will need to do that in the near future.
I hope you found this blog helpful. If you have any questions, doubts or suggestions, please feel free to contact me on