In the world of restaurant operations, time is short, pressure is high, and decisions need to be data-driven. At Tenzo, we believe the data behind those decisions should be effortless, something you never have to second-guess. Our ultimate vision is simple: data that’s 100% accurate, 100% of the time.

We’ll be the first to admit we’re not quite there yet. Our data is correct 99% of the time, and closing that final 1% is what we’re continually working toward.

This post shares our technical journey toward improving data quality at scale, and how we’re systematically moving from reactive to proactive data integrity across our platform.

Data You Can Trust: The Mission

Tenzo integrates with dozens of data sources: POS systems, labour schedulers, inventory tools, review platforms, and more. Our ETL (extract, transform, load) pipeline processes millions of rows of data each day, standardising them into a universal schema that’s queryable, comparable, and actionable.
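
To make “universal schema” concrete, here’s a minimal sketch of the kind of mapping the transformation step performs. The source names (pos_a, pos_b), payload shapes, and field names are invented purely for illustration; they don’t reflect any real POS API or Tenzo’s actual schema.

```python
from datetime import datetime, timezone

def to_universal_sales_row(source: str, payload: dict) -> dict:
    """Map a source-specific payload onto one common sales row.

    Hypothetical field names, for illustration only.
    """
    if source == "pos_a":
        # pos_a reports amounts in pence and ISO-8601 timestamps.
        return {
            "source": source,
            "gross_sales": payload["gross_sales_pence"] / 100,
            "closed_at": datetime.fromisoformat(payload["closed_at"]),
        }
    if source == "pos_b":
        # pos_b reports amounts in pounds and a local date-time string.
        return {
            "source": source,
            "gross_sales": float(payload["total"]),
            "closed_at": datetime.strptime(
                payload["closed"], "%Y-%m-%d %H:%M:%S"
            ).replace(tzinfo=timezone.utc),
        }
    raise ValueError(f"Unsupported source: {source}")
```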

The goal? When a user logs into Tenzo and sees today’s sales, they shouldn’t even think about whether that number is right. It just is.

But in reality, data pipelines aren’t flawless. There are two primary places where things can go wrong:

  1. At the source: APIs and data exports sometimes give us incomplete, incorrect, or malformed data.
  2. In transformation: Bugs in our own logic or inconsistencies in data formats across integrations can introduce mismatches.

Historically, this made debugging tricky. If a customer flagged a mismatch – say, their POS showed £5,321.09 in sales but Tenzo displayed £5,197.88 – we had to investigate each stage of the pipeline manually to figure out where the discrepancy occurred.

This reactive model wasn’t scalable. We needed a more systematic approach.

Engineering for Proactive Data Quality

To solve this, we initiated a dedicated data quality project led by our data engineer. The objective was clear: embed proactive data quality checks directly into our data pipeline so we could catch issues before they became customer-facing bugs.

This meant adding structure and automation around two key layers of validation:

1. Module-Level Checks

These are generic checks applied across all data modules. For example:

  • Sales columns should contain valid numerical amounts.
  • Ticket IDs should be unique.
  • Timestamps should follow expected formats.

Because many of our integrations (e.g., Square, Toast, Revel) expose similar core data tables – like sales, tickets, or products – we can define broad validation rules that apply across the board.
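
As a rough illustration of what these module-level checks look like, here’s a minimal Pandas sketch. Column names such as gross_sales, ticket_id, and closed_at are placeholders, not our actual schema:

```python
import pandas as pd

def run_module_level_checks(sales: pd.DataFrame) -> list[str]:
    """Generic checks applied to any integration's sales table."""
    failures = []

    # Sales columns should contain valid numerical amounts.
    amounts = pd.to_numeric(sales["gross_sales"], errors="coerce")
    if amounts.isna().any():
        failures.append("gross_sales has missing or non-numeric values")

    # Ticket IDs should be unique.
    if sales["ticket_id"].duplicated().any():
        failures.append("ticket_id contains duplicates")

    # Timestamps should follow expected formats.
    timestamps = pd.to_datetime(sales["closed_at"], errors="coerce", utc=True)
    if timestamps.isna().any():
        failures.append("closed_at has unparseable timestamps")

    return failures
```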

2. Integration-Level Checks

Every integration has its quirks. Square might record a ticket in one structure, while Toast uses another. Integration-level checks are bespoke tests that compare key metrics (e.g., total sales, order counts, staff hours) before and after transformation, ensuring the numbers align with source systems, even if the underlying schema differs.
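
The shape of such a check is straightforward: aggregate the same metric on both sides of the transformation and assert the results agree. A minimal sketch, with hypothetical column names and tolerance:

```python
import pandas as pd

def totals_reconcile(raw: pd.DataFrame, transformed: pd.DataFrame,
                     tolerance: float = 0.01) -> bool:
    """Compare total sales before and after transformation.

    Column names and the one-penny tolerance are illustrative assumptions.
    """
    source_total = raw["amount"].sum()                     # as reported by the source
    transformed_total = transformed["gross_sales"].sum()   # after our mapping
    return abs(source_total - transformed_total) <= tolerance
```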

Together, these two types of checks form the foundation of our data quality strategy.

Tooling: Why We Chose Soda

Implementing quality checks efficiently required selecting the right tool. We evaluated several options, including Great Expectations (GX) and dbt tests, but ultimately selected Soda, an open-source framework designed for data quality monitoring.

Here’s why Soda worked for us:

  • Our pipeline runs primarily in Python and uses Pandas DataFrames extensively.
  • Soda’s integration with Pandas allows in-memory validation, meaning we can perform checks while the data is still in memory, as it’s being transformed.
  • It supports both programmatic and declarative definitions of data tests, giving our engineers flexibility.

This decision allowed us to inject validation logic right at the transformation layer, reducing latency and enabling real-time alerting on anomalies.
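
As a flavour of how this fits together, here’s a rough sketch of an in-memory scan using Soda’s Pandas/Dask data source. Treat the exact call names as an approximation based on the soda-core documentation (details may differ by version), and the dataset and checks as illustrative rather than our production configuration:

```python
import pandas as pd
from soda.scan import Scan  # soda-core with the pandas/dask data source installed

# A DataFrame mid-transformation; columns are illustrative placeholders.
sales = pd.DataFrame({
    "ticket_id": ["t-1", "t-2", "t-3"],
    "gross_sales": [12.50, 8.00, 31.40],
})

scan = Scan()
scan.set_scan_definition_name("sales_checks")
scan.set_data_source_name("dask")
scan.add_pandas_dataframe(dataset_name="sales", pandas_df=sales)

# Declarative SodaCL checks defined inline as a string.
scan.add_sodacl_yaml_str("""
checks for sales:
  - row_count > 0
  - missing_count(gross_sales) = 0
  - duplicate_count(ticket_id) = 0
""")

scan.execute()
print(scan.get_logs_text())
```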

What We’re Seeing So Far

We’ve already piloted the new checks on a subset of integrations, starting with Square. The results have been promising:

  • Customer-reported data bugs decreased significantly in the first month of rollout.
  • We detected mismatches early in the pipeline, including issues from source APIs that previously went unnoticed.
  • Our dev team spent less time firefighting and more time building new features.

To be fully transparent: we’re expecting a short-term increase in reported issues. That’s actually a good sign; it means our system is now surfacing bugs we previously had to rely on users to find.

From Reactive to Proactive

The long-term vision is a complete inversion of our old model. Instead of relying on customers to flag data mismatches, our goal is to maintain a real-time health dashboard of all data integrations. This will let us:

  • Identify broken or misbehaving APIs instantly.
  • Monitor integration-level metrics daily for anomalies (see the sketch after this list).
  • Build a database of known issues per integration, allowing us to develop permanent fixes, not patches.
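
As a flavour of what that daily anomaly monitoring could look like, here’s a minimal sketch that compares each integration’s latest daily total against its trailing average. The 28-day window and 30% threshold are placeholder assumptions, not the rules behind our eventual health dashboard:

```python
import pandas as pd

def flag_anomalies(daily_totals: pd.DataFrame, threshold: float = 0.30) -> pd.DataFrame:
    """Flag integrations whose latest daily total deviates sharply from baseline.

    Expects one row per (integration, date) with a total_sales column.
    """
    ordered = daily_totals.sort_values("date")
    grouped = ordered.groupby("integration")["total_sales"]

    # Trailing 28-day average, excluding the most recent day.
    baseline = grouped.apply(lambda s: s.iloc[:-1].tail(28).mean())
    latest = grouped.last()

    report = pd.DataFrame({"baseline": baseline, "latest": latest})
    report["deviation"] = (report["latest"] - report["baseline"]).abs() / report["baseline"]
    return report[report["deviation"] > threshold]
```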

Our north star metric? Reduce time spent on data bug triage from 50% of engineering time to less than 5%.

We’re not there yet, but this is the foundation.

Why This Matters to You

If you’re part of a multi-location restaurant group using Tenzo, this work might be invisible. That’s exactly the point. The less time you spend questioning your data, the more time you can spend running your business.

But if you’re a developer or technical stakeholder, we hope this gives you confidence that Tenzo is taking data quality seriously and engineering for scale, predictability, and transparency. We want to set the bar for hospitality data tooling.

The Road Ahead

Data quality is not a destination; it’s a process. We’ll continue refining our pipeline, expanding checks to new integrations, and rolling out customer-facing insights so you can trust your numbers and focus on action.

If you’re a partner or developer interested in collaborating, whether it’s improving an integration, sharing schema changes, or co-developing validation rules, we’d love to hear from you.

At Tenzo, our mission is simple: make restaurant data effortless.

We’re not just fixing bugs. We’re building a future where “it just works.”