Big Data Processing at Spotify: The Road to Scio (Part 1)
This is the first part of a two-part blog series. In this series we will talk about Scio, a Scala API for Apache Beam and Google Cloud Dataflow, and [...]
Published by Neville Li

Securing our Cloud infrastructure is incredibly important. We are now taking another step forward by leveraging open source tools we developed in [...]
Published by Gianluca Brindisi

Spotify began using Docker with a few prototype services in 2014. We have upgraded and configured it many times since, and almost every time we have come across issues that were often hard to detect and fix.
Published by David Xia

We are happy to announce a new open source project from Spotify called Spydra. Spydra makes it easy to run data [...]
Published by Hannu Varjoranta

Every day, Spotify users generate more than 100 billion events. Every event is generated in response to [...]
Published by Igor Maravić

Foreword: This blog post accompanies our presentation given at SRECon 2017 in San Francisco. The recording of the talk can be viewed here, [...]
Published by Lynn Root

Five years ago, music personalization at Spotify was a tiny team. The team read papers, developed models, wrote data pipelines [...]
Published by Spotify Engineering

Introduction: When you log into Spotify, browse through your Discover Weekly playlist, and play a track, you’re interacting with some of our [...]
Published by Nic Cope

Whenever a user performs an action in the Spotify client, such as listening to a song or searching for an artist, a [...]
Published by Igor Maravić

Whenever a user performs an action in the Spotify client, such as listening to a song or searching for an artist, a [...]
Published by Igor Maravić

Whenever a user performs an action in the Spotify client, such as listening to a song or searching for an artist, a [...]
Published by Igor Maravić

Introduction: In the previous post we talked about how the Internet finds its way to reach content and users; how Internet relations [...]
Published by dbarrosop

Introduction: This is the first part of a series of posts about a project we have been working with for [...]
Published by dbarrosop

What to Measure? In part 1, we already mentioned a few metrics that should be considered by the load balancer. Success [...]
Published by Lukáš Poláček

Load Balancing: Most Spotify clients connect to our back-end via accesspoint, which forwards client requests to other servers. In the picture below, the accesspoint has [...]
Published by Lukáš Poláček

This is the second part in a series about Monitoring at Spotify. In the previous post I discussed our history of operational [...]
Published by John-John Tedro

This is the first in a two-part series about Monitoring at Spotify. In this, I’ll be discussing our history, the [...]
Published by John-John Tedro

Story: Lots of the UI of our iOS application is rendered through an internal framework called Ceramic. It’s a tool [...]
Published by Hector Zarate

Illustrations by Jonas Ekman. Since the dawn of time, man has used 32-bit addressing. When the first Homo Erectus crawled [...]
Published by Per Knytt

In this blog post we focus on the web load balancers and various proxy systems across the Spotify perimeter. We [...]
Published by Alexey Lapitsky

In my spare time, I have developed a C++ framework for property-based testing called RapidCheck. As my Hack Week project, I wanted [...]
Published by shadewind

Introduction: All Spotify users are now stored in a Cassandra database instead of Postgres. The final switch was made on [...]
Published by Marcus Vesterlund

Sometimes the answer to a sluggish data pipeline isn’t more power in the Hadoop cluster, but a shift in technique. [...]
Published by Noel Cody

For my master’s thesis, I developed and benchmarked an Apache Cassandra compaction strategy optimized for time series. The result, the [...]
Published by Björn Hegerfors