The science behind personalized music recommendations

This Monday, just like every Monday, over 100 million Spotify users found a fresh new playlist waiting for them. It’s a custom mixtape of 30 songs they’ve never listened to before but will probably love. It’s called Discover Weekly, and it’s pretty much magic.

I’m a huge fan of Spotify, and particularly Discover Weekly. Why? It makes me feel seen. It knows my musical tastes better than any person in my life ever has, and I am consistently delighted by how it satisfies me just right every week, with tracks I myself would never have found or known I would like.

For those of you who live under a musically soundproof rock, let me introduce you to my virtual best friend:

A Spotify Discover Weekly playlist (specifically, mine).

As it turns out, I’m not alone in my obsession with Discover Weekly. The user base went crazy for it, which has driven Spotify to completely rethink its focus, investing more resources into algorithm-based playlists.

“It's scary how well @Spotify Discover Weekly playlists know me. Like former-lover-who-lived-through-a-near-death experience-with-me well.” (@dave_horwitz)

“At this point @Spotify's discover weekly knows me so well that if it proposed I'd say yes” (@amandawhitbred)

Ever since Discover Weekly debuted in 2015, I’ve been dying to know how it worked (plus I’m a fangirl of the company, so sometimes I like to pretend I work there and research their products). After three weeks of mad googling and a great conversation on Spotify’s roof deck with data engineer Nikhil Tibrewal, I feel grateful to have finally gotten a glimpse behind the curtain.

So how does Spotify do such an amazing job of choosing those 30 songs for each person each week? Let’s zoom out for a second to look at how other music services have done music recommendations, and how Spotify’s doing it better.

A brief history of online music curation

Back in the 2000s, Songza kicked off the online music curation scene using manual curation to create playlists for users. “Manual curation” meant that some team of “music experts” or other curators would put together playlists by hand that they thought sounded good, and then listeners would just listen to their playlists. (Later, Beats Music would employ this same strategy.)
Manual curation worked okay, but it was manual and simple, and therefore it couldn’t take into account the nuance of each listener’s individual music taste.

Like Songza, Pandora was also one of the original players in the music curation scene. It employed a slightly more advanced approach, manually tagging attributes of songs instead. This meant a group of people listened to music, chose a bunch of descriptive words for each track, and tagged the tracks with those words. Then, Pandora’s code could simply filter for certain tags to make playlists of similar-sounding music.

Around that same time, a music intelligence agency from the MIT Media Lab called The Echo Nest was born, which took a radically more advanced approach to personalized music. The Echo Nest used algorithms to analyze the audio and textual content of music, allowing it to perform music identification, personalized recommendation, playlist creation, and analysis.

Finally, taking yet another approach is Last.fm, which still exists today and uses a process called collaborative filtering to identify music its users might like. More on that in a moment.

So if that’s how other music curation services have done recommendations, how does Spotify come up with their magic engine, which seems to nail individual users’ tastes so much more accurately than any of the other services?

Spotify’s 3 Types of Recommendation Models

Spotify actually doesn’t use a single revolutionary recommendation model. Instead, it mixes together some of the best strategies used by other services to create its own uniquely powerful Discovery engine. In 2014, Spotify actually bought The Echo Nest to gain access to its data and algorithms surrounding audio and text analysis, and it also uses collaborative filtering algorithms similar to those used at Last.fm.

Therefore, to create Discover Weekly, there are three main types of recommendation models that Spotify employs:

1. Collaborative Filtering models (i.e. the ones that Last.fm originally used), which work by analyzing your behavior and others’ behavior.
2. Natural Language Processing (NLP) models, which work by analyzing text.
3. Audio models, which work by analyzing the raw audio tracks themselves.

Image credit: Chris Johnson, Spotify

Let’s take a dive into how each of these recommendation models works!

Recommendation Model #1: Collaborative Filtering

First, some background: When many people hear the words “collaborative filtering”, they think of Netflix, as they were one of the first companies to use collaborative filtering to power a recommendation model, using users’ star-based movie ratings to inform their understanding of what movies to recommend to other “similar” users.

After Netflix used it successfully, its use spread quickly, and now it’s often considered the starting point for anyone trying to make a recommendation model.

Unlike Netflix, though, Spotify doesn’t have those stars with which users rate their music. Instead, Spotify’s data is implicit feedback: specifically, the stream counts of the tracks we listen to, as well as additional streaming data, including whether a user saved the track to their own playlist, or visited the Artist page after listening.

But what is collaborative filtering, and how does it work? Here’s a high-level rundown, as encapsulated in a quick conversation:

Image by Erik Bernhardsson

What’s going on here? Each of these two guys has some track preferences: the guy on the left likes tracks P, Q, R, and S; the guy on the right likes tracks Q, R, S, and T.

Collaborative filtering then uses that data to say, “Hmmm. You both like three of the same tracks (Q, R, and S), so you are probably similar users. Therefore, you’re each likely to enjoy other tracks that the other person has listened to, that you haven’t heard yet.”

It therefore suggests that the guy on the right check out track P, and the guy on the left check out track T. Simple, right?

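To make the two-listener conversation concrete, here is a minimal sketch in Python. It is a toy illustration, not Spotify’s actual code: the names `listening_history`, `jaccard`, and `recommend` are made up for this example, and Jaccard overlap is just one simple way to score how similar two listeners are.

```python
# Toy version of the two-listener example. All names here are hypothetical.
listening_history = {
    "left_guy": {"P", "Q", "R", "S"},
    "right_guy": {"Q", "R", "S", "T"},
}


def jaccard(a, b):
    """Share of tracks two listeners have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b)


def recommend(user, history):
    """Suggest tracks the user hasn't heard, weighted by listener similarity."""
    mine = history[user]
    scores = {}
    for other, theirs in history.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        # Every track the other listener knows and this user doesn't
        # earns a score equal to how similar the two listeners are.
        for track in theirs - mine:
            scores[track] = scores.get(track, 0.0) + similarity
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda track: (-scores[track], track))


print(recommend("left_guy", listening_history))   # ['T']
print(recommend("right_guy", listening_history))  # ['P']
```

With only two listeners the answer is obvious, but the same loop works for any number of users: the more similar another listener is to you, the more weight their unheard tracks get.
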
But how does Spotify actually use that concept in practice to calculate millions of users’ suggested tracks based on millions of other users’ preferences?

…matrix math, done with Python libraries!

In actuality, this matrix you see here is gigantic. Each row represents one of Spotify’s 140 million users (if you use Spotify, you yourself are a row in this matrix) and each column represents one of the 30 million songs in Spotify’s database.

At the matrix’s intersections, where each user meets each song, there is a 1 if the user has listened to that song, and a 0 if the user hasn’t. So, if I listened to the song “Thriller”, the place where my row meets the column representing “Thriller” is going to be a 1. (Note: Spotify has experimented with using the actual number of streams, rather than a simple 1 or 0.)

Of course, this makes for a very sparse matrix. There are way more songs a given user hasn’t listened to than ones they have, so the majority of the entries in the matrix are just ‘0’. But the placement of those few ‘1’s holds critical information.

Then, the Python library runs this long, complicated matrix factorization formula:

Some complicated math…

When it finishes, we end up with two types of vectors, represented here by X and Y. X is a user vector, representing one single user’s taste, and Y is a song vector, representing one single song’s profile.

The User/Song matrix produces two types of vectors: User vectors and Song vectors.

Now we’ve got 140 million user vectors, one for each user, and 30 million song vectors. The actual content of these vectors is just a bunch of numbers that are essentially meaningless on their own, but they are hugely useful for comparison.

To find which users have taste most similar to mine, collaborative filtering compares my vector with all of the other users’ vectors using a mathematical dot product. Whichever produces the highest product is the most similar user to me.

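Here is a small sketch of the factorize-then-compare idea using NumPy. The listening matrix, the rank `k=2`, and the helper names `factorize` and `most_similar` are all assumptions made up for illustration. A truncated SVD is swapped in as the simplest stand-in for the factorization step; it is not the formula Spotify runs, which is designed for implicit feedback at a vastly larger scale.

```python
import numpy as np

# Toy listening matrix: rows are users, columns are songs (1 = has listened).
# The data is invented for this example.
R = np.array([
    [1, 1, 1, 0, 0, 0],  # user 0
    [1, 1, 0, 0, 0, 0],  # user 1
    [0, 0, 0, 1, 1, 0],  # user 2
    [0, 0, 0, 1, 1, 1],  # user 3
], dtype=float)


def factorize(matrix, k):
    """Return (X, Y) such that X @ Y.T approximates `matrix` at rank k.

    Uses a truncated SVD as a stand-in for the real factorization.
    """
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    root = np.sqrt(s[:k])
    X = U[:, :k] * root      # one k-dimensional vector per user
    Y = Vt[:k, :].T * root   # one k-dimensional vector per song
    return X, Y


def most_similar(vectors, index):
    """Index of the row with the highest dot product against vectors[index]."""
    scores = vectors @ vectors[index]
    scores[index] = -np.inf  # ignore the trivial self-match
    return int(np.argmax(scores))


X, Y = factorize(R, k=2)
predicted = X @ Y.T  # estimated affinity of every user for every song

print("User most similar to user 1:", most_similar(X, 1))
print("Song most similar to song 0:", most_similar(Y, 0))
print("User 1's predicted scores:", predicted[1].round(2))
```

In this toy data, user 1 has never played song 2, but the predicted score for it comes out well above the scores for songs 3 to 5, because user 0 (who shares user 1’s other tracks) has played it. That gap is what turns a table of 1s and 0s into a recommendation.
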
The same goes for the Y vector, songs: you can compare a song’s vector with all the other song vectors, and find which songs are most similar to the one you’re looking at.

Collaborative filtering does a pretty good job, but Spotify knew they could do even better by adding another engine. Enter NLP.

Recommendation Model #2: Natural Language Processing (NLP)

The second type of recommendation model that Spotify employs is Natural Language Processing (NLP) models. These models’ source data, as the name suggests, are regular ol’ words: track metadata, news articles, blogs, and other text around the internet.

Natural Language Processing, the ability of a computer to understand human speech as it is spoken, is a whole vast field unto itself, often harnessed through sentiment analysis APIs.

The exact mechanisms behind NLP are beyond the scope of this article, but here’s what happens on a very high level: Spotify crawls the web constantly looking for blog posts and other written texts about music, and figures out what people are saying about specific artists and songs: what adjectives and language are frequently used about those songs, and which other artists and songs are also discussed alongside them.

The most-used terms bucket up into what Spotify calls “cultural vectors” or “top terms.” Each artist and song has thousands of daily-changing top terms. Each term has a weight associated, which reveals how important the description is (roughly, the probability that someone will describe music as that term).

“Cultural vectors”, or “top terms”. Table from Brian Whitman

Then, much like in collaborative filtering, the NLP model uses these terms and weights to create a vector representation of the song that can be used to determine if two pieces of music are similar. Cool, right?

Recommendation Model #3: Raw Audio Models

First, a question. You might be thinking: But, Sophia, we already have so much data from the first two models!
Why do we need to analyze the audio itself, too?

Well, first of all, including a third model further improves the accuracy of this amazing recommendation service. But actually, this model serves a secondary purpose, too: Unlike the first two model types, raw audio models take into account new songs.

Take, for example, the song your singer-songwriter friend put up on Spotify. Maybe it only has 50 listens, so there are few other listeners to collaboratively filter it against. It also isn’t mentioned anywhere on the internet yet, so NLP models won’t pick up on it. Luckily, raw audio models don’t discriminate between new tracks and popular tracks, so with their help, your friend’s song can end up in a Discover Weekly playlist alongside popular songs!

Ok, so now for the “how”: How can we analyze raw audio data, which seems so abstract?

…with convolutional neural networks!

Convolutional neural networks are the same technology behind facial recognition. In Spotify’s case, they’ve been modified for use on audio data instead of pixels. Here’s an example of a neural network architecture:

Image credit: Sander Dieleman

This particular neural network has four convolutional layers, seen as the thick bars on the left, and three dense layers, seen as the more narrow bars on the right. The inputs are time-frequency representations of audio frames, which are then concatenated to form the spectrogram.

The audio frames go through these convolutional layers, and after the last convolutional layer, you can see a “global temporal pooling” layer, which pools across the entire time axis, effectively computing statistics of the learned features across the time of the song.

All of this information then arrives at the output layer, which predicts an understanding of the song’s personality: Does it have a high tempo? Is it acoustic? Does it have high danceability? All of these characteristics can be found with pretty high accuracy just from letting these neural networks loose on the audio file.

That covers the basics of the three major types of recommendation models feeding the Recommendations pipeline, and ultimately powering the Discover Weekly playlist!

Of course, these recommendation models are all connected to Spotify’s much larger ecosystem, which includes giant amounts of data storage and uses lots of Hadoop clusters to scale recommendations and make these engines work on giant matrices, endless internet music articles, and huge numbers of audio files.

I hope this was informative and tickled your curiosity like it did mine. For now, I’ll be working my way through my own Discover Weekly, finding my new favorite music, knowing and appreciating all the machine learning that’s going on behind the scenes. 🎶

👏 If you enjoyed this piece, I’d love it if you hit the clap button so others might stumble upon it. You can find my own code on GitHub, and more of my writing and projects at http://www.sophiaciocca.com. 😊

Also, if you work at Spotify or know someone who does, I’d love to connect! I’m putting my dream to work at Spotify out into the world.

Thanks also to ladycollective for reading this article over and suggesting edits.


Spotify’s Discover Weekly: How machine learning finds your new music
