Data Quality

Do your product dashboards look funky? Are your quarterly reports stale? Is the data set you’re using broken or just plain wrong? Have you ever been about to sign off after a long day of running queries or building data pipelines, only to get pinged by your head of marketing that “the data is missing” from a critical report? What about a frantic email from your CTO about “duplicate data” in a business intelligence dashboard? Or a memo from your CEO, the same one who is so bullish on data, about a confusing or inaccurate number in their latest board deck? If any of these situations hit home for you, you’re not alone. These problems affect almost every team, yet they’re usually addressed reactively and on an ad hoc basis.

This problem, often referred to as “data downtime,” happens to even the most innovative and data-first companies, and, in our opinion, it’s one of the biggest challenges facing businesses in the 21st century. Data downtime refers to periods of time when data is missing, inaccurate, or otherwise erroneous, and it manifests in stale dashboards, inaccurate reports, and even poor decision making. The root of data downtime? Unreliable data, and lots of it. Data downtime can cost companies millions of dollars per year, not to mention customer trust. In fact, ZoomInfo found in 2019 that one in five companies lost a customer due to a data quality issue. And as you’re likely aware, your company’s bottom line isn’t the only thing that suffers from data downtime. Handling data quality issues consumes upwards of 40% of your data team’s time, time that could otherwise be spent working on more interesting projects or actually innovating for the business.

Many data engineering teams today face the “good pipelines, bad data” problem. It doesn’t matter how advanced your data infrastructure is if the data you’re piping is bad. You’ll learn how to tackle data quality and trust at scale by leveraging best practices and technologies used by some of the world’s most innovative companies.

  • Build more trustworthy and reliable data pipelines
  • Write scripts to run data checks and identify broken pipelines with data observability
  • Learn how to set and maintain data SLAs, SLIs, and SLOs
  • Develop and lead data quality initiatives at your company
  • Learn how to treat data services and systems with the diligence of production software
  • Automate data lineage graphs across your data ecosystem
  • Build anomaly detectors for your critical data assets
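
To give a flavor of what the data checks and anomaly detectors in the list above might look like, here is a minimal Python sketch. It is an illustration under stated assumptions, not a method taken from any particular tool: the function names (`is_fresh`, `null_rate`, `is_volume_anomaly`), the list-of-dicts row format, and the three-standard-deviation threshold are all hypothetical choices made for this example.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean, stdev


def is_fresh(last_loaded_at, max_age, now=None):
    """Freshness check: True if the table was updated within max_age."""
    now = now or datetime.now(timezone.utc)
    return now - last_loaded_at <= max_age


def null_rate(rows, column):
    """Fraction of rows whose value for `column` is missing (None or absent)."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def is_volume_anomaly(history, today, threshold=3.0):
    """Flag today's row count if it sits more than `threshold` sample
    standard deviations away from the mean of past daily counts."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > threshold
```

In practice, checks like these would run on a schedule against warehouse metadata (last load time, row counts, per-column null counts), with the thresholds tuned per table rather than fixed as they are here.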