TC4D: Data Quality By Engineers, For Engineers

Changing an engineering culture is one of the biggest challenges for any organization. It requires challenging an existing way of working and introducing compelling improvements that are adopted by individuals as well as departments. That can mean tackling tech debt instead of amassing it, using new tooling and infrastructure, or, in this case, increasing test coverage. After leading numerous failed attempts to influence the quality of Spotify’s big data, I finally saw one succeed.
TC4D, Test Certified for Data-Focused teams, is a gamified, multi-step program for getting teams to improve their testing practices and test coverage. There is great swag and bragging rights, and participating teams “level up” as they go through the process. It shamelessly borrows from Google’s TC initiative.
When we started, Spotify had no standardization for big data quality (BDQ) or any official test tools. The maturity and test coverage of each pipeline depended heavily on individual engineers. In this inconsistent landscape pipelines often broke, or worse, bad data propagated downstream, resulting in erroneous metrics or stale recommendations. New engineers had no training on the topic, and veteran engineers had habits that nobody was challenging. The need was obvious, but how to reach BDQ Nirvana was a lot less clear.
A couple of things were working for us: Spotify’s senior data engineering community (known as the Data guild in Spotify-ish) was engaged in discussing the topic, and our technical leaders were on board to sponsor the effort. Great! Now what?
We gathered a group of self-selected engineers who were passionate about solving this particular problem. We started with a small group of four engineers and myself, an engineering manager. All of us did the work in addition to our full-time roles. Our way of working was simple: stay lean and practice what we preach. We stayed lean because we wanted the program to increase the speed of development and decrease incidents. True to Spotify’s agile culture, we didn’t want process for the sake of process. We didn’t include things just because they were “best practices” but because they were relevant to our engineers.
To practice what we preached, we followed our own certification guidelines multiple times before finalizing them. It’s a step that is easy to overlook because it prolongs the process, but if we were not going to take the time to follow the guidelines, why would anyone else?
TC4D consists of three certification levels, each building upon the previous. The first level is focused on setup, monitoring and alerting, the second level on unit tests and security, and the third level on the quality of the data itself. When we came up with the levels, we cared about making them attainable with current infrastructure, and making the first level easy to achieve so teams had an easy win as they got introduced to the program. At first, we tried the levels on two pipelines, and then ran a pilot program with two teams. After each round we revisited the levels and tweaked them. We also made awesome swag because everybody loves good swag and Spotify needs more metal.
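For readers who think in code, the idea that each level builds upon the previous one can be sketched roughly as follows. This is only an illustration, not how TC4D is actually tracked: the focus areas in the comments come from the description above, while the `Level` enum, the `certified_level` function and the notion of a set of "passed" levels are hypothetical names made up for this sketch.

```python
from enum import IntEnum


class Level(IntEnum):
    """The three TC4D levels; focus areas are taken from the post."""
    NONE = 0
    LEVEL_1 = 1  # setup, monitoring and alerting
    LEVEL_2 = 2  # unit tests and security
    LEVEL_3 = 3  # quality of the data itself


def certified_level(passed):
    """Return the highest level a pipeline has reached.

    `passed` is a (hypothetical) set of levels whose requirements the
    pipeline meets. Levels are cumulative, so the first missing level
    stops the climb even if a later one is satisfied.
    """
    reached = Level.NONE
    for level in (Level.LEVEL_1, Level.LEVEL_2, Level.LEVEL_3):
        if level not in passed:
            break
        reached = level
    return reached
```

Under these assumptions, a pipeline that meets the level one and level three requirements but skips level two still counts as level one, which is the point of having the levels build on each other.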
TC4D kicked off during an internal conference in 2016. It is driven by a group of mentors who deeply care about BDQ Nirvana and helping others to reach it. We don’t prescribe, we offer advice. We don’t enforce, we encourage discussion and, most importantly, we trust our skilled engineers to make the best decisions for their datasets.
A year later, we have certified more than 40 teams, and usually have one to five certifications in progress each week. TC4D is becoming an official team and plays a critical role in reaching BDQ Nirvana. We are now working in an active landscape, in cooperation with infrastructure teams that are building robust test infrastructure. We had to reorder our swag. Multiple times.
Major kudos to the TC4D mentor team: Gandalf Hernandez, Lavinia Samoilă, Matt Finkel, Kevin Sweeney, Minwei Gu, Rafal Wojdyla, Michael Sanders, Andrew Martin, Nelson Arapé, Riccardo Petrocco, Peter Thai, Nikhil Tibrewal and Joshua Freeberg. Thanks also to Thomas Harper for editing this post.




