How Spotify Uses ML to Create the Future of Personalization

Machine learning is what drives personalization on Spotify. We may have a single platform with 381 million different users, but it may actually be more accurate to say there are 381 million individual versions of Spotify, each one filled with different Home pages, playlists, and recommendations. But with a library of over 70 million tracks to thumb through, how do our ML models actually go about making these decisions?

Well, Spotify’s VP of Personalization Oskar Stål recently gave a talk at TransformX, a summit for leaders in ML and AI, to discuss just that. Read on to get a glimpse of how ML and reinforcement learning help inform our music and podcast recommendations, and don’t forget to check out Oskar’s presentation to hear even more about the future of ML at Spotify.

How do we use ML?

It all starts with data. At the most fundamental level, all sorts of user info — playlists, listening history, interactions with Spotify’s UI, etc. — are fed into our ML models, while keeping trust and responsibility top of mind. Every day, nearly half a trillion events are processed, and the more info our models gather, the smarter they become about making associations between different artists, songs, podcasts, and playlists.

But our ML models go beyond this, incorporating other factors in their decision-making processes. What time of day is it? Is this playlist for working out or chilling out? Are you on mobile or desktop? By incorporating several of these ML models throughout Spotify’s infrastructure, we’re able to offer increasingly intelligent, specialized recommendations that can, as Oskar puts it, “serve even the narrowest of tastes.”

We aren’t just looking for our users’ instant gratification, though. We want to provide listeners with a lifetime of great audio experiences and be with them on every step of that journey. And that brings us to what we’re working on now.

The future is reinforcement learning

Reinforcement learning, or RL, is a type of ML in which a model responds to its current environment and learns to choose the actions that maximize the ultimate, long-term reward, whatever that may be. In our case, that reward is our users’ long-term satisfaction with Spotify. RL isn’t about short-term solutions. It’s always playing the long game.
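
To make "the long game" a little more concrete, here is a minimal Python sketch of the discounted return, a standard way RL adds up rewards over time. It is an illustration only, not Spotify's actual reward definition: the satisfaction scores, the strategy names, and the discount factor gamma are all made-up assumptions.

```python
def discounted_return(rewards, gamma=0.9):
    """Add up rewards, shrinking each one by gamma for every step it lies in the future."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (gamma ** step) * reward
    return total


# Hypothetical per-session satisfaction scores (invented for illustration).
instant_hit = [1.0, 0.2, 0.1, 0.0, 0.0]  # great right now, fades quickly
slow_burn = [0.4, 0.6, 0.8, 0.9, 0.9]    # modest start, keeps improving

instant_value = discounted_return(instant_hit)
slow_value = discounted_return(slow_burn)
print(instant_value, slow_value)
```

With gamma set to 0.9, the slow-burn sequence scores higher even though its first session is weaker. Setting gamma to 0 makes the function look only at the first reward, which is the short-term view RL is meant to avoid.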

In a general sense, our RL model tries to predict how satisfied our users are with their current experience, and attempts to nudge them toward consuming more fulfilling content in their audio diet to make them happier with the service. In other words, rather than handing users the “empty calories” of a content diet that will only satisfy them in the moment, RL aims to push them to a more sustainable, diverse, and fulfilling content diet that will last a lifetime.

This could mean playing a new dance track we think might fit a user’s current mood, or it could mean suggesting a calming, ambient piece to help them study. Predicting what a user will want 10 minutes from now, a day from now, a week from now, means creating a ton of simulations and running the RL model against those simulations to make it smarter, like a computer playing against itself in chess to get better at the game.
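
The idea of training against simulations can be sketched with a toy example. The Python below uses tabular Q-learning, a textbook RL algorithm chosen here purely for illustration; the article does not say which algorithm Spotify uses, and it is far simpler than chess-style self-play. The simulated listener, its three "diet" states, the two actions, and every reward value are hypothetical.

```python
import random

N_STATES = 3  # hypothetical listening diet: 0 = repetitive, 1 = mixed, 2 = diverse
FAMILIAR, DISCOVERY = 0, 1  # hypothetical actions: replay a favorite or suggest something new


def simulate_listener(state, action):
    """Toy deterministic listener model: returns (reward, next_state)."""
    if action == FAMILIAR:
        # Favorites are more rewarding when the recent diet has been varied,
        # but replaying them makes the diet more repetitive.
        reward = (0.2, 0.6, 1.0)[state]
        next_state = max(state - 1, 0)
    else:
        # New music is a modest win now and makes the diet more diverse.
        reward = 0.3
        next_state = min(state + 1, N_STATES - 1)
    return reward, next_state


def train_q_table(steps=20000, alpha=0.1, gamma=0.9, seed=0):
    """Run Q-learning against the simulator using random exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    state = 0
    for _ in range(steps):
        # Q-learning is off-policy, so it can learn from purely random actions.
        action = rng.choice((FAMILIAR, DISCOVERY))
        reward, next_state = simulate_listener(state, action)
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max((FAMILIAR, DISCOVERY), key=lambda a: row[a]) for row in q]


q_table = train_q_table()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setup, a recommender that only looked at the immediate reward would play a familiar track in the "mixed" state (0.6 beats 0.3). The learned policy suggests a discovery there instead, because reaching the "diverse" state pays off more over time.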

With ML and RL, we’re trying to create a more holistic audio experience, focused on recommendations that ensure long-term satisfaction and enjoyment. Our approach to personalization doesn’t just benefit listeners: better and more satisfying recommendations help out artists, exposing their work to a larger audience more likely to enjoy it. After all, there’s a reason there are 16 billion artist discoveries every month on our platform. And the best is yet to come.