
Code Your Own Popularity Based Recommendation System WITHOUT a Library in Python

by Hemang Vyas, August 29th, 2018

Too Long; Didn't Read

Recommendation systems are everywhere right now: Amazon, Netflix, and Airbnb all rely on them. That might make you wonder how these engines work, so in this article I will try to explain the popularity based recommendation system.


Types of recommendation systems are as follows:

Popularity based recommendation system
Collaborative recommendation system
Content-based recommendation system
Demographic-based recommendation system
Utility-based recommendation system
Knowledge-based recommendation system
Hybrid recommendation system

Popularity based recommendation system

As the name suggests, a popularity based recommendation system works with the trend: it recommends the items that are most popular right now. For example, if a product is bought by almost every new user, the system is likely to suggest that product to a user who has just signed up.

This approach has some drawbacks, but it also avoids some problems. Because it needs no history for the individual user, it can make suggestions even to someone who has just signed up.

The main problem with a popularity based recommendation system is that it offers no personalization, i.e. even if you know the behaviour of a user, you cannot tailor the recommendations to that user.
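To make the idea concrete, here is a minimal sketch in plain Python. The interaction log, the item names and the two helper functions are all made up for illustration and are not part of the notebook. It only shows the principle: count how often each item appears and hand the same top items to everybody.

```python
from collections import Counter

# Hypothetical toy interaction log: (user_id, item) pairs.
interactions = [
    ("u1", "song_a"), ("u2", "song_a"), ("u3", "song_a"),
    ("u1", "song_b"), ("u2", "song_b"),
    ("u3", "song_c"),
]

def most_popular(interactions, n=2):
    """Return the n items that appear in the most interactions."""
    counts = Counter(item for _user, item in interactions)
    return [item for item, _count in counts.most_common(n)]

def recommend(user_id, interactions, n=2):
    """user_id is accepted but never used: every user gets the same list."""
    return most_popular(interactions, n)

print(recommend("u1", interactions))              # ['song_a', 'song_b']
print(recommend("brand_new_user", interactions))  # ['song_a', 'song_b']
```

The second call shows both sides of the trade-off: a user with no history still gets recommendations, but they are exactly the ones everyone else gets.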

I hope you now have a good idea of what a popularity based recommendation system is, so let’s get our hands dirty with the code. The link to my notebook and data is here.

Let’s begin the coding part

import pandas
import numpy as np
import Recommender

First of all we import pandas and numpy, which we are going to use a lot, along with Recommender, the module we wrote ourselves. Its Popularity_Recommender class has two methods: create, which builds the recommendations, and recommend, which returns them for a given user.
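The Recommender module itself lives in the notebook linked above, so it is not reproduced here. As a rough idea of what such a class could look like, here is a sketch that matches the create and recommend calls used later in this article. The internals, the top_n parameter, and the score and rank column names are my assumptions, not necessarily what the real module does.

```python
import pandas as pd

class Popularity_Recommender:
    """Sketch of a popularity recommender (assumed internals, for illustration)."""

    def __init__(self):
        self.user_id = None
        self.item_id = None
        self.popularity_recommendations = None

    def create(self, train_data, user_id, item_id, top_n=10):
        """Count the rows per item and keep the top_n most frequent items."""
        self.user_id = user_id
        self.item_id = item_id
        grouped = (
            train_data.groupby(item_id)[user_id]
            .count()
            .reset_index(name="score")
        )
        # Highest score first; ties are broken by item name so output is stable.
        grouped = grouped.sort_values(["score", item_id], ascending=[False, True])
        grouped["rank"] = list(range(1, len(grouped) + 1))
        self.popularity_recommendations = grouped.head(top_n).reset_index(drop=True)

    def recommend(self, user_id):
        """Return the same top items for any user, labelled with that user's id."""
        recs = self.popularity_recommendations.copy()
        recs.insert(0, self.user_id, user_id)
        return recs

# Toy data, made up for this example.
toy = pd.DataFrame({
    "user_id": ["u1", "u2", "u3", "u1", "u2", "u3"],
    "song": ["A", "A", "A", "B", "B", "C"],
})
pr = Popularity_Recommender()
pr.create(toy, "user_id", "song", top_n=2)
print(pr.recommend("u1"))
```

Notice that recommend only uses its argument to label the output. All the work happens once in create, which is why this kind of recommender is cheap to serve.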

triplets_file = 'https://static.turi.com/datasets/millionsong/10000.txt'

songs_metadata_file = 'song_data.csv'

In this section we define where the data lives: triplets_file and songs_metadata_file. The triplets_file contains user_id, song_id and listen_count. The songs_metadata_file contains song_id, title, release, artist_name and year.

song_df_1 = pandas.read_table(triplets_file, header=None)
song_df_1.columns = ['user_id', 'song_id', 'listen_count']

After that we have to merge the two datasets that we imported.

song_df_2 = pandas.read_csv(songs_metadata_file)
song_df = pandas.merge(song_df_1, song_df_2.drop_duplicates(['song_id']), on="song_id", how="left")

Here we merged the two datasets on song_id, after dropping duplicate song_id rows from the metadata.
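Why drop the duplicates before merging? A small made-up example shows what can go wrong otherwise. The two frames below are hypothetical stand-ins for the triplets and the metadata, with one song deliberately listed twice in the metadata.

```python
import pandas as pd

plays = pd.DataFrame({
    "user_id": ["u1", "u2"],
    "song_id": ["s1", "s1"],
    "listen_count": [3, 1],
})
# Hypothetical metadata in which song s1 appears twice.
metadata = pd.DataFrame({
    "song_id": ["s1", "s1", "s2"],
    "title": ["Song One", "Song One (live)", "Song Two"],
})

# Without de-duplication every play of s1 matches two metadata rows.
naive = pd.merge(plays, metadata, on="song_id", how="left")
# With de-duplication each play keeps exactly one metadata row.
deduped = pd.merge(plays, metadata.drop_duplicates(["song_id"]), on="song_id", how="left")

print(len(plays), len(naive), len(deduped))  # 2 4 2
```

With duplicates left in, the merged frame has more rows than the play log, which would inflate the counts that the popularity score is built on.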

song_df = song_df.head(10000)

Because it is a large dataset, I have only considered the first 10k rows.

After trimming the dataset, I made one additional modification.

song_df['song'] = song_df['title'].map(str) + " - " + song_df['artist_name']

I added a column named song, which concatenates the title and the artist of the song.

song_df_grouped = song_df.groupby(['song']).agg({'listen_count': 'count'}).reset_index()
song_df_grouped.sort_values('listen_count', ascending=False)

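In the step shown above, the dataset is grouped by the song field using the pandas function groupby and aggregated on the listen_count field. Note that the 'count' aggregation counts the rows for each song (the number of user and song records), not the total number of plays. After that, the values are sorted in non-increasing order of that count.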
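The difference between counting rows and summing plays is easy to miss, so here is a small sketch on made-up data. The column names n_rows, total_plays and share_of_rows are my own choices for illustration.

```python
import pandas as pd

# Toy data: song A has three casual listeners, song B has one heavy listener.
song_df = pd.DataFrame({
    "user_id": ["u1", "u2", "u3", "u1"],
    "song": ["A", "A", "A", "B"],
    "listen_count": [1, 1, 1, 50],
})

grouped = (
    song_df.groupby("song")
    .agg(n_rows=("listen_count", "count"), total_plays=("listen_count", "sum"))
    .reset_index()
)
# Share of all rows that belong to each song, in percent.
grouped["share_of_rows"] = grouped["n_rows"] / grouped["n_rows"].sum() * 100

print(grouped.sort_values("n_rows", ascending=False))
```

Ranked by n_rows, song A comes first because more users listened to it. Ranked by total_plays, song B would win thanks to a single heavy listener. Which of the two better reflects "popularity" is a design choice; the code in this article uses the row count.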

users = song_df['user_id'].unique()
len(users)

items = song_df['song'].unique()
len(items)

This step finds out the unique users and items in the dataset.

# sklearn.cross_validation was removed in scikit-learn 0.20; use sklearn.model_selection instead
from sklearn.model_selection import train_test_split
train_data, test_data = train_test_split(song_df, test_size=0.20, random_state=0)

This section of code splits the dataset into training and test sets using an 80–20 ratio.
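If you would rather not depend on scikit-learn for this one step, a similar random split can be sketched with pandas alone. The split_frame helper below is hypothetical, it assumes the index of the frame is unique, and it will not pick the same rows as train_test_split even with the same random_state.

```python
import pandas as pd

def split_frame(df, test_size=0.20, random_state=0):
    """Randomly hold out a test_size fraction of the rows (hypothetical helper)."""
    test = df.sample(frac=test_size, random_state=random_state)
    # Everything that was not sampled into the test set becomes training data.
    train = df.drop(test.index)
    return train, test

# Toy frame with ten rows, made up for this example.
toy = pd.DataFrame({"user_id": range(10), "song": list("abcdefghij")})
train_data, test_data = split_frame(toy)

print(len(train_data), len(test_data))  # 8 2
```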

pr = Recommender.Popularity_Recommender()
pr.create(train_data, 'user_id', 'song')

Here, an instance of the class Popularity_Recommender is created. The create method takes three parameters: the training data, the name of the user ID column, and the name of the item column for which you want to make recommendations, which in our case is song.

pr.recommend(users[5])

As I said before, the recommend method returns the recommendations for the user passed as a parameter. It returns the list of the most popular songs, but since this is a popularity based recommendation system, the list does not depend on which user you ask for.

Result of the recommendation system for 6th user

pr.recommend(users[100])

As you can see, even if we change the user, the result that we get from the system is the same, since it is a popularity based recommendation system.

Result of the recommendation system for 101st user

Don’t forget to clap and put down your thoughts about the article. Thank you :)
