A Product Story: The Lessons of Backstage and Spotify’s Autonomous Culture

May 18, 2021 Published by Spotify Engineering

TLDR; In episode 08 of our podcast series “Spotify: A Product Story”, we share stories and lessons from building and open sourcing Backstage, our homegrown developer portal. Hear why a developer-friendly, market-based platform like Backstage could only have been developed at Spotify (where autonomy is prized, not top-down mandates) and why that ends up making Backstage such a flexible fit for other companies, too. Listen to the episode now and get all our hard-earned lessons in entertaining podcast form — or read on for episode highlights and to learn more about this critical time in Spotify’s growth.

How it started: “Like a cold shower”

The story begins five years ago when Spotify had a problem: we were growing fast. Really, really fast. This should be a great problem to have, except that instead of speeding us up, adding new hires was actually slowing us down. 

As Director of Engineering Pia Nilsson explains in the podcast, one of the metrics Spotify’s Platform team used to measure productivity was onboarding time: how long did it take for a new engineer to merge their tenth pull request at Spotify? 

The answer was not good — over 60 days. That is, from the day an engineer walked through Spotify’s doors, it would be two more months before they were able to contribute code in the form of their tenth pull request. 
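As a rough illustration of how a metric like this could be computed, here is a minimal sketch. It is not Spotify's actual tooling: the function name, the start date, and the merge dates below are all hypothetical.

```python
from datetime import date, timedelta


def days_to_nth_merge(start_date, merge_dates, n=10):
    """Return the number of days from start_date to the n-th merged
    pull request, or None if fewer than n pull requests were merged."""
    ordered = sorted(merge_dates)
    if len(ordered) < n:
        return None
    return (ordered[n - 1] - start_date).days


# Hypothetical engineer who merges one pull request per week after joining.
start = date(2021, 1, 4)
merges = [start + timedelta(days=7 * i) for i in range(1, 11)]
print(days_to_nth_merge(start, merges))  # 70
```

Averaging this value over a cohort of new hires would give a single onboarding number comparable to the "60 days" figure, assuming merge timestamps are available per engineer.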

But the number alone doesn’t capture the whole feeling. Gustav Söderström, Spotify’s Chief R&D Officer and the podcast’s host, asks Pia if she remembers what it was like seeing that “60 days” metric for the first time:

Gustav: Was it like, “Maybe that’s OK”? Or was it like, “That seems super long”?

Pia: Having spent 15 years as an engineer at other companies, it was like a cold shower.

Brrr. So the first thing Pia’s team had to do was figure out what was putting the chill on new hires. Why did productivity keep dropping as the headcount kept rising?

Engineers are users, too

When it comes to their own employees, companies will often skip doing user research — after all, why ask when you can just dictate? 

But the Platform team at Spotify sees Spotify’s developers as their customers. Their priorities are our priorities. Their pain points are our problems to solve. So, to find out what was holding back our engineers, the first thing to do was ask our engineers. 

According to Pia, two issues emerged as common causes for declining productivity:

  1. Context switching: “People are interrupted constantly … New joiners had to tap someone on the shoulder because very seldom was there any documentation.” 
  2. Discoverability: “People couldn’t find things. It was simple as that. It took forever to just find the right service. There were so many almost duplications, not pure duplications, because people are very smart and they would recognize that.” 

There would be 15 different versions of the same service, each speaking to the slightly different needs of different teams. And if a new team needed a similar service? Instead of sorting through all those versions … they would just build yet another version of the same service for themselves. 

In a way, this is what worked for Spotify before: small, autonomous teams building fast. But that basic agile approach was reaching its limits. More teams meant more confusion, as evidenced by our onboarding metric. New hires didn’t even know where to begin — let alone how to decipher our “spaghetti” codebase — without tapping another engineer on the shoulder. It was a way of working that was becoming so common, we gave it a name — “rumour-driven development”.

And as Spotify continued to grow, the problem only got worse.

Speed, scale, autonomy… pick two?

Now that the problem was clear, the solution was also obvious: centralization. But just as obvious was the fact that a centralized team will always be much slower than many small teams. Would Spotify have to trade speed for scale?

Turns out, the question was moot. Tasked with restoring productivity, the Platform team realized that a top-down, centralized approach wouldn’t work at Spotify for another, much more fundamental reason: it just wasn’t part of Spotify’s DNA. As Pia explains in the podcast:

“So we basically knew we couldn’t build a centralized solution. It would never work. No one would use it. And no one really believed in it even among ourselves. We had joined Spotify for the reason that we all loved autonomy. We thought that was brilliant to set people free. So the culture really spoke to us there: ‘Well, you don’t have the option of building something central and mandating everyone.’”

What made Spotify engineering great was now slowing it down: too much autonomy. But that culture of autonomy would also lead to an even better solution than a simplistic tech requirements list or top-down mandates. As Spotify’s VP of Engineering, Tyson Singer, says, for Backstage to succeed with our engineers, it had to be the better solution, not the only solution:

“For the most part, if we go out and we tell people to do X, they just shrug, and they do whatever they want. So we really do have to sell to them. We have to basically make their lives better with everything that we do. And so [our culture] really did inform our approach, if we wanted to take control of this fragmentation problem in our tech ecosystem.”

Spotify wanted something that could give us everything: speed, scale, and aligned autonomy (a new idea at Spotify). And that’s how Backstage was conceived and born.

How it’s going: Not just adopted, but embraced

So if we can’t make anyone use it, how do we know it’s working? Every day, we see the 280 engineering teams inside Spotify use Backstage to manage over 2,000 backend services, 300 websites, 4,000 data pipelines, and 200 mobile features. 

Even more impressive are the contribution numbers. More than 200 engineers inside Spotify have contributed features to Backstage. We now have 120+ plugins developed by 50+ teams. And 80% of contributions came from Spotifiers outside the Backstage core team.

People can find what they need — without constantly interrupting their fellow developers. Any Spotifier — not just engineers, but also compliance and security team members — can easily discover all the software in our ecosystem, see who owns it, and access technical documentation in a centralized location. In an environment optimized for speed and as decentralized as Spotify, having this information so easily accessible makes all the difference. 

For a company growing as fast as ours, this is a game-changing improvement to both productivity and developer happiness — which we believe go hand in hand. And we know the open source version will be able to transform other tech organizations as well. As a product, Backstage is what happens when you treat your developers with the same thoughtfulness as your users. According to our company-wide surveys, 80% of our internal users are satisfied with Backstage.

Want to know what happens next? How much were we able to lower that bone-chilling “60 days to tenth pull request” onboarding metric? How did our homegrown developer portal go on to become Spotify’s biggest open source project? And the significance of this humble GIF?

Listen to episode 08 — “When to build vs buy — and when to open source” — to get the whole story. You’ll hear from Gustav, Tyson, and Pia, as well as Jeremiah Lowin, CEO of Prefect.io, a company that runs on what is called an “open core” model. Now streaming on Spotify — or wherever you listen to podcasts!

Want to hear more about how Spotify was built, straight from the people who built it? The podcast series “Spotify: A Product Story” shares the stories behind the most important product strategy lessons we’ve learned at Spotify, all told in the words of the people who were actually there. 

In each episode, Spotify’s Chief R&D Officer, Gustav Söderström, is joined by Spotify insiders and special guests, from Metallica’s Lars Ulrich and Napster’s Sean Parker, to ML legend Andrew Ng.

How did P2P networking and local caching create a feeling of magic in the very first Spotify app? How did we go from stashing servers in a cupboard to running Google Cloud’s largest Dataflow jobs ever? What does it mean to build truly ML-first products? And what’s the next frontier for creators and audio formats? You can find all the podcast episodes here.
