
Code Your Own Popularity Based Recommendation System WITHOUT a Library in Python

by Hemang Vyas, August 29th, 2018

Recommendation systems are everywhere right now: Amazon, Netflix, and Airbnb all rely on them. You may wonder how these engines work, so in this article I will explain the popularity-based recommendation system.

Types of recommendation systems are as follows:

  1. Popularity based recommendation system
  2. Collaborative recommendation system
  3. Content-based recommendation system
  4. Demographic-based recommendation system
  5. Utility-based recommendation system
  6. Knowledge-based recommendation system
  7. Hybrid recommendation system

Popularity based recommendation system

As the name suggests, a popularity-based recommendation system works with the trend: it recommends the items that are most popular right now. For example, if a product is bought by almost every new user, the system is likely to suggest that item to a user who has just signed up.

The popularity-based recommender solves some problems and has some of its own. Because it needs no history for the user, it can make recommendations to someone who has just signed up.

Its main problem is that it offers no personalization: even if you know the behaviour of the user, you cannot recommend items accordingly.
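To make the idea concrete, here is a minimal sketch in plain Python. The purchase log and the helper name most_popular are invented for illustration and are not part of the article's notebook; the point is only that the ranking comes from overall counts and never looks at who is asking.

```python
from collections import Counter

# Hypothetical purchase log: (user, item) pairs invented for illustration.
purchases = [
    ("u1", "phone case"),
    ("u2", "phone case"),
    ("u3", "charger"),
    ("u1", "charger"),
    ("u4", "phone case"),
    ("u2", "headphones"),
]


def most_popular(log, n=2):
    """Return the n items that appear most often in the log."""
    counts = Counter(item for _user, item in log)
    return [item for item, _count in counts.most_common(n)]


# The function takes no user argument at all, so a brand-new user
# and a long-time user would both receive this same list.
top_items = most_popular(purchases)
print(top_items)
```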

I hope you now have a good idea of how a popularity-based recommendation system works, so let’s get our hands dirty with the code. The link to my notebook and data is here.

Let’s begin the coding part



import pandas
import numpy as np
import Recommender

First we import numpy and pandas, which we are going to use a lot, along with the Recommender module that we created. Its class provides two methods: create, which builds the recommendations, and recommend, which returns them for a given user.

triplets_file = 'https://static.turi.com/datasets/millionsong/10000.txt'

songs_metadata_file = 'song_data.csv'

Here we define the locations of our two data files, triplets_file and songs_metadata_file. The triplets_file contains user_id, song_id and listen_count. The songs_metadata_file contains song_id, title, release_by and artist_name.


song_df_1 = pandas.read_table(triplets_file, header=None)
song_df_1.columns = ['user_id', 'song_id', 'listen_count']

After that we have to merge the two datasets that we imported.


song_df_2 = pandas.read_csv(songs_metadata_file)
song_df = pandas.merge(song_df_1, song_df_2.drop_duplicates(['song_id']), on="song_id", how="left")

Here we merged the two datasets on song_id, dropping duplicate song_id rows from the metadata first.
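The reason for dropping duplicates before a left merge can be seen on a toy example. The two small DataFrames below are invented stand-ins for the real files, so treat this as a sketch of the behaviour rather than the actual data.

```python
import pandas as pd

# Toy stand-ins for the two files; all values are invented.
triplets = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "song_id": ["s1", "s1", "s2"],
    "listen_count": [3, 1, 5],
})
metadata = pd.DataFrame({
    "song_id": ["s1", "s1", "s2"],  # s1 is listed twice
    "title": ["Song A", "Song A", "Song B"],
    "artist_name": ["Artist X", "Artist X", "Artist Y"],
})

# Without drop_duplicates, every s1 listen matches two metadata rows,
# so the merged table gains extra rows.
naive = pd.merge(triplets, metadata, on="song_id", how="left")

# Dropping duplicate song_id rows first keeps one row per listen record.
merged = pd.merge(
    triplets,
    metadata.drop_duplicates(["song_id"]),
    on="song_id",
    how="left",
)

print(len(triplets), len(naive), len(merged))
```

With duplicated metadata, the naive merge would inflate the later popularity counts, which is why the deduplication step matters here.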

song_df = song_df.head(10000)

Because it is a large dataset, I have only considered the first 10k rows.

After reducing the dataset, I made one additional modification.

song_df['song'] = song_df['title'].map(str) + " - " + song_df['artist_name']

I added a column named song, which concatenates the title and the artist of the song.


song_df_grouped = song_df.groupby(['song']).agg({'listen_count': 'count'}).reset_index()
song_df_grouped.sort_values('listen_count', ascending=False)

In the step shown above, the dataset is grouped by the song field using the pandas function groupby and aggregated by counting the listen_count entries, so each song's value is the number of rows in which it appears. The values are then sorted in non-increasing order of that count.
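Note that the 'count' aggregation counts rows, not plays. The toy data below is invented to show how that differs from summing listen_count; which measure better reflects "popularity" is a design choice, and the 'sum' variant is shown only for comparison.

```python
import pandas as pd

# Invented listening records: one row per (user, song) pair.
toy = pd.DataFrame({
    "song": ["A - X", "A - X", "B - Y", "B - Y", "B - Y", "C - Z"],
    "listen_count": [10, 1, 1, 1, 1, 2],
})

# 'count' gives the number of rows (listen records) per song.
by_rows = toy.groupby("song").agg({"listen_count": "count"}).reset_index()
by_rows = by_rows.sort_values("listen_count", ascending=False)

# 'sum' gives the total number of plays per song.
by_plays = toy.groupby("song").agg({"listen_count": "sum"}).reset_index()
by_plays = by_plays.sort_values("listen_count", ascending=False)

print(by_rows)
print(by_plays)
```

Here "B - Y" leads when counting rows because three listeners played it, while "A - X" leads when summing plays because one listener played it ten times.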


users = song_df['user_id'].unique()
len(users)


items = song_df['song'].unique()
len(items)

This step finds out the unique users and items in the dataset.


from sklearn.model_selection import train_test_split
train_data, test_data = train_test_split(song_df, test_size=0.20, random_state=0)

This section of code splits the dataset into training and test sets using an 80–20 ratio.
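To see what an 80–20 split does to the row counts, here is a rough pandas-only sketch on invented data. It is an alternative for illustration, not the scikit-learn call used in the article, and it will not select the same rows as train_test_split.

```python
import pandas as pd

# Ten invented rows standing in for song_df.
toy = pd.DataFrame({
    "user_id": [f"u{i}" for i in range(10)],
    "song": [f"song {i % 3}" for i in range(10)],
})

# Randomly pick 80% of the rows for training (fixed seed for repeatability).
train = toy.sample(frac=0.8, random_state=0)

# Everything that was not sampled becomes the test set.
test = toy.drop(train.index)

print(len(train), len(test))
```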


pr = Recommender.Popularity_Recommender()
pr.create(train_data, 'user_id', 'song')

Here we create an instance of the class Popularity_Recommender. The create method takes three parameters: the training data, the name of the user-id column, and the name of the item column to recommend from, which in our case is song.
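The Recommender module itself is not shown in this article, so the class below is only a sketch of what such a recommender might look like. The class name PopularityRecommenderSketch, the top_n parameter, and the score and rank column names are assumptions made for illustration; the real Popularity_Recommender may differ.

```python
import pandas as pd


class PopularityRecommenderSketch:
    """Hypothetical stand-in for Recommender.Popularity_Recommender."""

    def __init__(self):
        self.user_col = None
        self.item_col = None
        self.popular_items = None

    def create(self, train_data, user_col, item_col, top_n=10):
        """Score each item by how many training rows mention it."""
        self.user_col = user_col
        self.item_col = item_col
        scores = (
            train_data.groupby(item_col)[user_col]
            .count()
            .reset_index(name="score")
        )
        # Highest score first; ties are broken alphabetically by item name.
        scores = scores.sort_values(["score", item_col], ascending=[False, True])
        scores["rank"] = range(1, len(scores) + 1)
        self.popular_items = scores.head(top_n).reset_index(drop=True)

    def recommend(self, user_id):
        """Return the same top-N table for any user, tagged with user_id."""
        result = self.popular_items.copy()
        result.insert(0, self.user_col, user_id)
        return result


# Invented training data for a quick check.
train = pd.DataFrame({
    "user_id": ["u1", "u2", "u3", "u1", "u2", "u4"],
    "song": ["B - Y", "B - Y", "B - Y", "A - X", "A - X", "C - Z"],
})

sketch = PopularityRecommenderSketch()
sketch.create(train, "user_id", "song")
first = sketch.recommend("u1")
second = sketch.recommend("brand-new-user")
print(first)
```

The ranking is computed once in create, and recommend only stamps the requested user id onto a copy of it, which is why every user receives identical songs.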

pr.recommend(users[5])

As I said before, the recommend method returns recommendations for the user passed as a parameter. It returns the list of popular songs for that user, but since this is a popularity-based recommendation system, the list does not depend on which user is passed in.

Result of the recommendation system for 6th user

pr.recommend(users[100])

As you can see, even if we change the user, the result we get from the system is the same, since it is a popularity-based recommendation system.

Result of the recommendation system for the 101st user

Don’t forget to clap and put down your thoughts about the article. Thank you :)

Source:

  1. Divya Sardana | Building Recommender Systems Using Python
  2. https://github.com/llSourcell/recommender_live