Listening Together, Miles Apart

Every second, more than 30,000 people around the world press play on the same song on Spotify. Imagine if you could see these musical connections as they happen, linking us with people we’ve never met, miles apart.

Listening Together, a campaign we launched in April, has made that possibility a reality. Since then, the site has been visited over a million times, and visitors have been able to see which song is being played at the exact same time in two different places.

Behind the beats

At Spotify, we use an asynchronous messaging service, Google Cloud Pub/Sub, that enables our teams to tap into events generated by our different systems. Every time a user plays a song on Spotify, a message is published. Assuming you have the necessary permissions and authorized access, you can tap into that stream of events for real-time processing.

To process such a large number of events, we needed a scalable stream processing framework, in this case the open source Apache Beam project using Google’s Dataflow processing backend.

Preferring a functional programming approach to data processing, we developed Scio, a Scala API on top of Apache Beam. We open sourced it, and it is now the default batch and stream data processing framework at Spotify.

Scio and Dataflow give us the scalability, along with a simple and succinct programming model, required to process the incoming plays.

Mapping the music

For Listening Together, we monitor all the incoming plays by listening to the Pub/Sub events mentioned above. The plays come in one by one and we use Dataflow’s windowing features to collate all the plays that come in over a short period of time into one group. We consider all the plays within that period to have happened within the same time frame.
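The windowing idea can be illustrated with a small, framework-free sketch. This is not the Scio or Dataflow code used in production. It only shows the concept of fixed windows, and the window length, the field names, and the helper functions are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed window length in seconds; the article only says "a short period of time".
WINDOW_SECONDS = 1


def assign_window(timestamp, size=WINDOW_SECONDS):
    """Return the start time of the fixed window that contains `timestamp`."""
    return int(timestamp // size) * size


def group_into_windows(plays, size=WINDOW_SECONDS):
    """Collect plays into buckets keyed by the start of their window."""
    windows = defaultdict(list)
    for play in plays:
        windows[assign_window(play["timestamp"], size)].append(play)
    return dict(windows)


# Hypothetical play events; real events carry more fields.
plays = [
    {"track_id": "track-a", "timestamp": 10.2},
    {"track_id": "track-b", "timestamp": 10.9},
    {"track_id": "track-a", "timestamp": 11.1},
]
windows = group_into_windows(plays)
```

Every play whose timestamp falls inside the same window lands in the same bucket, and that bucket is what the later steps treat as "the same time frame".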

With this set of plays, we can extract the identifier of the track that was played together with the IP address it came from. We group by the track, giving us a dataset of unique tracks together with a list of IP addresses associated with the users that played the given track.
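As a rough sketch of the grouping step, the snippet below turns one window of plays into a mapping from track to IP addresses. The field names `track_id` and `ip` and the function name are assumptions, and the addresses come from the reserved documentation ranges.

```python
from collections import defaultdict


def ips_by_track(window_plays):
    """Group the plays of one window by track, keeping each play's IP address."""
    grouped = defaultdict(list)
    for play in window_plays:
        grouped[play["track_id"]].append(play["ip"])
    return dict(grouped)


# Hypothetical plays from a single window.
window_plays = [
    {"track_id": "track-a", "ip": "192.0.2.1"},
    {"track_id": "track-b", "ip": "198.51.100.7"},
    {"track_id": "track-a", "ip": "203.0.113.9"},
]
grouped = ips_by_track(window_plays)
```

In a Beam-style pipeline the same result would typically come from keying each play by track and applying a group-by-key transform within the window.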

We filter out plays from individual users, as well as IPs that come from VPNs and other sources used to anonymize connections, since their locations don’t correspond to the listener’s physical location. We also don’t need 30,000 plays to create the visual experience, so we take a sample from the plays and process a fraction of them.
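The filtering and sampling could look something like the sketch below. It rests on several assumptions: that "plays from individual users" means tracks with fewer than two usable listeners in the window, that anonymizing IPs are available as a simple set (`ANONYMIZER_IPS` is a hypothetical placeholder), and that a uniform random sample is good enough.

```python
import random

# Hypothetical set of IPs known to belong to VPNs or other anonymizers.
ANONYMIZER_IPS = {"203.0.113.9"}


def filter_tracks(grouped, anonymizer_ips=ANONYMIZER_IPS):
    """Drop anonymized IPs, then drop tracks left with fewer than two listeners."""
    kept = {}
    for track_id, ips in grouped.items():
        usable = sorted(set(ips) - anonymizer_ips)
        if len(usable) >= 2:
            kept[track_id] = usable
    return kept


def sample_tracks(tracks, k, seed=None):
    """Keep at most `k` randomly chosen tracks from the filtered set."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(tracks), min(k, len(tracks)))
    return {track_id: tracks[track_id] for track_id in chosen}


grouped = {
    "track-a": ["192.0.2.1", "192.0.2.2", "203.0.113.9"],
    "track-b": ["198.51.100.7"],
    "track-c": ["192.0.2.3", "203.0.113.9"],
}
kept = filter_tracks(grouped)
sampled = sample_tracks(kept, k=1, seed=42)
```

Here only `track-a` survives: `track-b` has a single listener, and `track-c` has only one listener left once the anonymized address is removed.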

The next step is to convert the IP addresses, using databases from MaxMind, to cities and countries. MaxMind also provides approximate locations, which we can use for projecting locations on the Listening Together website. The pipeline also handles localization of city names. Our streaming pipeline ends with the Listening Together sessions being put into a new Pub/Sub topic.
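To make the enrichment step concrete, here is a minimal sketch that uses a small in-memory dictionary as a stand-in for the MaxMind databases. The dictionary layout, the coordinates, and the function names are invented for illustration, and a real pipeline would query the actual database instead.

```python
# Hypothetical stand-in for a geolocation database, keyed by IP address.
FAKE_GEO_DB = {
    "192.0.2.1": {
        "city": {"en": "Stockholm"},
        "country": "SE",
        "lat": 59.33,
        "lon": 18.07,
    },
    "198.51.100.7": {
        "city": {"en": "Munich", "de": "München"},
        "country": "DE",
        "lat": 48.14,
        "lon": 11.58,
    },
}


def locate(ip, locale="en"):
    """Return a localized city, country, and approximate coordinates, or None."""
    record = FAKE_GEO_DB.get(ip)
    if record is None:
        return None
    names = record["city"]
    return {
        "city": names.get(locale, names["en"]),  # fall back to English
        "country": record["country"],
        "lat": record["lat"],
        "lon": record["lon"],
    }


def build_session(track_id, ips, locale="en"):
    """Combine a track with the locations of its listeners, skipping unknown IPs."""
    located = (locate(ip, locale) for ip in ips)
    return {"track_id": track_id, "locations": [loc for loc in located if loc]}


session = build_session("track-a", ["192.0.2.1", "198.51.100.7", "192.0.2.99"], locale="de")
```

The resulting dictionary is the kind of record that could be serialized and published to the outgoing topic.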

To make the experience as global as possible, we developed a backend service, using our Apollo framework, that listens to the Pub/Sub messages and pulls down new Listening Together sessions, as they become available. This allows site visitors to see the same play at the same time.

Our service is autoscaled in our Kubernetes clusters in geographically separate data centers, which leads to slight differences in timing and therefore in which plays each instance serves. As a result, we have deviated just a touch from our goal of serving up one unique play to visitors globally: a handful of different plays are shown each second. Global synchronization can be a challenge.

We expose an endpoint that the website calls to fetch the latest sessions. Since all calculations have already been done, the endpoint only has to return data already in memory, giving us the freedom to serve a large number of requests with minimal resources.
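The serving pattern described here, where precomputed sessions are kept in memory and returned as-is, can be sketched as follows. This is a language-agnostic illustration written in Python rather than the Apollo service itself. The class name, the method names, and the cache size are assumptions.

```python
import json
from collections import deque
from threading import Lock


class SessionCache:
    """Holds the most recent sessions in memory so reads need no computation."""

    def __init__(self, max_sessions=10):
        # A bounded deque discards the oldest session once it is full.
        self._sessions = deque(maxlen=max_sessions)
        self._lock = Lock()

    def on_message(self, payload):
        """Called for each incoming message; `payload` is a JSON-encoded session."""
        session = json.loads(payload)
        with self._lock:
            self._sessions.append(session)

    def latest_json(self):
        """What the endpoint would return: the cached sessions, serialized."""
        with self._lock:
            return json.dumps(list(self._sessions))


cache = SessionCache(max_sessions=2)
for track in ("track-a", "track-b", "track-c"):
    cache.on_message(json.dumps({"track_id": track}).encode("utf-8"))
response = cache.latest_json()
```

Because each request only serializes data that is already in memory, the cost per request stays small regardless of how much work the streaming pipeline did upstream.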

It typically takes just a handful of seconds from the time a listener presses play to the play appearing on the site. The website gives you a great view of what people are listening to in real time, sending the visitor on a virtual tour as the animated globe spins from city to city where the same track is being played.

Take a tour

Explore Listening Together for yourself and see how people all over the planet are connected through their shared love of music. Perhaps you’ll find a new favorite song, inspired by the unifying power of music, and maybe, just maybe, you’ll feel a bit more connected with someone in another part of our world.

If you are interested in joining us and helping to connect people through music, we are hiring!
