Episode 58 — 5.4 Assure Data Quality: Tests, Source Control, UAT, Requirement Validation
In Episode Fifty-Eight, titled “Assure Data Quality: Tests, Source Control, U A T, Requirement Validation,” the focus is quality assurance as the practical bridge between data work and decisions people are willing to act on. A report can be visually clean and still be useless if the underlying data quietly drifts, definitions shift, or a small transformation error spreads into every downstream metric. Quality assurance is not about chasing perfection; it is a disciplined way to make outcomes predictable enough that leaders trust the numbers during budget talks, incident reviews, and performance discussions. When quality is handled as an operating practice instead of a one-time cleanup, the organization gains both speed and credibility.
Tests are the first line of defense because they validate basic expectations about ranges, types, and relationships before humans notice problems in a meeting. Range checks catch impossibilities like negative counts, timestamps far in the future, or percentages that exceed expected bounds, which often signal corruption or parsing failures. Type checks confirm that dates still behave like dates and numbers still behave like numbers, which matters because silent type shifts can turn comparisons and calculations into nonsense without raising errors. Relationship checks look for integrity signals, such as whether identifiers remain unique where they should, whether required links between tables still exist, and whether category values stay within an approved set. When these tests run routinely, they keep small data issues from becoming large decision errors.
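As a minimal sketch, the range, type, and relationship checks described here can be written as a handful of assertions over a tabular dataset. The pandas approach and the column names (order_id, amount, created_at, status) are illustrative assumptions, not a prescribed implementation.

```python
# Minimal sketch of range, type, and relationship checks with pandas.
# Column names (order_id, amount, created_at, status) are hypothetical.
import pandas as pd

ALLOWED_STATUSES = {"open", "closed", "cancelled"}  # assumed approved set

def run_basic_tests(df: pd.DataFrame) -> list[str]:
    failures = []

    # Type checks: dates behave like dates, numbers like numbers.
    if not pd.api.types.is_datetime64_any_dtype(df["created_at"]):
        failures.append("created_at is not a datetime column")
    if not pd.api.types.is_numeric_dtype(df["amount"]):
        failures.append("amount is not numeric")

    # Range checks: no negative amounts, no timestamps far in the future.
    if (df["amount"] < 0).any():
        failures.append("negative amounts found")
    horizon = pd.Timestamp.now(tz="UTC") + pd.Timedelta(days=1)
    if (df["created_at"] > horizon).any():
        failures.append("timestamps more than a day in the future")

    # Relationship checks: unique identifiers, approved category values.
    if not df["order_id"].is_unique:
        failures.append("duplicate order_id values")
    if not df["status"].isin(ALLOWED_STATUSES).all():
        failures.append("status values outside the approved set")

    return failures

if __name__ == "__main__":
    sample = pd.DataFrame({
        "order_id": [1, 2, 3],
        "amount": [10.0, 25.5, -4.0],  # one deliberate range failure
        "created_at": pd.to_datetime(
            ["2024-01-05", "2024-01-06", "2024-01-07"], utc=True),
        "status": ["open", "closed", "open"],
    })
    print(run_basic_tests(sample))  # ['negative amounts found']
```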
Source control is the practice that keeps logic changes visible, attributable, and reversible, which is essential when code and transformations evolve continuously. Without source control, teams struggle to answer when a metric changed, what exactly changed in the logic, and whether the change was intentional or accidental. With source control, each change can be linked to an author, a purpose, and a specific revision, which makes troubleshooting faster because the search space shrinks to a known set of edits. Source control also supports healthy collaboration, because it encourages small, reviewable changes rather than large, opaque rewrites that are hard to validate. Over time, this visibility becomes a form of operational memory, protecting the organization when staff changes or when an audit asks for evidence of controlled change.
User acceptance testing, U A T, is where quality is evaluated through the lens that matters most: whether the output matches how users interpret and rely on it. A dataset can pass technical checks and still fail its purpose if the report layout, labels, or rollups do not align with how stakeholders expect to make decisions. U A T surfaces gaps like missing context, confusing filters, mismatched time windows, or definitions that are technically consistent but practically misleading for the audience. It is also where edge cases show up, because users often probe the boundaries, such as comparing periods, slicing unusual segments, or expecting drill paths to behave in a certain way. When U A T is treated as a normal gate rather than an afterthought, it prevents the common cycle of shipping first and explaining later.
Requirement validation sits underneath U A T and tests, because quality begins with confirming that the dataset answers the right question. Many reporting failures are not errors in computation, but errors in interpretation, where the system produces a precise answer to a question nobody meant to ask. Requirement validation clarifies scope, timeframe, population, and exclusions, so a metric like “monthly revenue” is not quietly blending incompatible definitions such as booked versus recognized or gross versus net. It also clarifies decision intent, such as whether the metric supports forecasting, compliance reporting, operational response, or executive narrative, since each purpose changes what “good” looks like. When requirements are explicit, the quality work has a target that can be measured rather than a vague hope that the output feels right.
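One hedged way to make such requirements explicit is to capture them as a small, reviewable definition before any transformation logic is written. The fields and example values below are assumptions about what a “monthly revenue” specification might record, not a required template.

```python
# Illustrative sketch: an explicit metric definition that requirement
# validation can review before any transformation logic is written.
from dataclasses import dataclass, field

@dataclass
class MetricSpec:
    name: str
    question: str            # the decision question the metric answers
    population: str          # who or what is included
    timeframe: str           # period and time zone for boundaries
    revenue_basis: str       # e.g. "recognized" vs "booked" (assumed field)
    exclusions: list[str] = field(default_factory=list)
    decision_intent: str = "operational"  # forecasting, compliance, etc.

# Example values are invented for illustration.
monthly_revenue = MetricSpec(
    name="monthly_revenue",
    question="How much revenue was recognized in the calendar month?",
    population="All active business units",
    timeframe="Calendar month, UTC period boundaries",
    revenue_basis="recognized",  # not booked; net rather than gross
    exclusions=["internal transfers", "test accounts"],
    decision_intent="executive reporting",
)
print(monthly_revenue)
```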
Quality checks become more reliable when they exist at ingestion, transformation, and reporting stages, because failures can occur at any step and each step has its own signature. Ingestion checks focus on freshness, completeness, schema consistency, and obvious anomalies that signal broken inputs, because it is cheaper to catch issues before they spread. Transformation checks focus on logic correctness, join behavior, deduplication integrity, and aggregation validity, because this is where a small rule change can reshape totals across the entire pipeline. Reporting-stage checks focus on reconciliations, dashboard behavior, filter integrity, and version labeling, because this is where users experience the system and where trust is won or lost. When checks are layered across stages, quality becomes resilient, since one missed signal is less likely to become a full failure.
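A sketch of how layering might be organized in practice: checks are registered per stage and every stage is evaluated, so a failure at ingestion is visible before transformation or reporting results are trusted. The stage names and the placeholder check functions are assumptions for illustration only.

```python
# Sketch of layered checks: each pipeline stage owns its own list of checks,
# and every stage is evaluated so one missed signal does not hide others.
from typing import Callable

Check = Callable[[], tuple[bool, str]]  # returns (passed, description)

def check_ingestion_freshness() -> tuple[bool, str]:
    # Placeholder: in practice, compare the latest load time to a window.
    return True, "ingestion: data arrived within the freshness window"

def check_transform_dedup() -> tuple[bool, str]:
    # Placeholder: in practice, confirm one row per key after deduplication.
    return True, "transformation: deduplication kept one row per key"

def check_reporting_reconciliation() -> tuple[bool, str]:
    # Placeholder: in practice, compare the dashboard total to a trusted figure.
    return False, "reporting: total diverges from the trusted reference"

CHECKS_BY_STAGE: dict[str, list[Check]] = {
    "ingestion": [check_ingestion_freshness],
    "transformation": [check_transform_dedup],
    "reporting": [check_reporting_reconciliation],
}

def run_all_stages() -> None:
    for stage, checks in CHECKS_BY_STAGE.items():
        for check in checks:
            passed, description = check()
            print(f"[{'PASS' if passed else 'FAIL'}] {description}")

if __name__ == "__main__":
    run_all_stages()
```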
A finance metric scenario anchors these ideas because finance numbers are both high-stakes and highly sensitive to definition drift. Consider a monthly margin metric that leaders use to decide staffing, pricing, and risk posture, where a small change in cost categorization or revenue timing can shift conclusions significantly. In such a scenario, ingestion checks can confirm that the latest transactions and adjustments arrived on schedule, transformation checks can confirm that joins and allocations behave as intended, and reporting checks can reconcile the output to a trusted close figure. U A T can then confirm that stakeholders interpret the metric correctly, including which period it represents and what adjustments are included. When finance metrics are handled this way, the organization avoids the painful situation where leaders debate which number is “real” in the middle of a decision.
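For the reporting-stage reconciliation in a scenario like this, a hedged sketch might compare the pipeline's margin or revenue total to the trusted close figure within an agreed tolerance. The figures and the tolerance below are invented for illustration.

```python
# Sketch of a reconciliation check: compare the pipeline's monthly total
# to a trusted close figure and flag differences beyond an agreed tolerance.
def reconcile(pipeline_total: float, trusted_total: float,
              relative_tolerance: float = 0.001) -> tuple[bool, float]:
    """Return (within_tolerance, relative_difference)."""
    if trusted_total == 0:
        return pipeline_total == 0, 0.0
    relative_diff = abs(pipeline_total - trusted_total) / abs(trusted_total)
    return relative_diff <= relative_tolerance, relative_diff

# Example values are invented; the trusted figure would come from the close.
ok, diff = reconcile(pipeline_total=1_204_350.00, trusted_total=1_203_900.00)
print(f"within tolerance: {ok}, relative difference: {diff:.4%}")
```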
Regression prevention is the discipline of rerunning tests after every change so quality does not depend on someone remembering what might be affected. Regressions are common because pipelines and reports are interconnected, and a change that seems local can alter behavior elsewhere through shared tables, reused definitions, or downstream calculations. When tests run consistently, teams can see whether a change caused unexpected shifts in row counts, category distributions, or key totals, and they can catch problems before users do. This practice also supports faster iteration because it reduces fear, since the system has guardrails that detect damage early. Over time, regression prevention turns quality from a periodic fire drill into a steady rhythm that makes reporting safer to evolve.
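One hedged way to catch such regressions is to snapshot a few coarse signals before a change and recompute them after it; the signal names, values, and the two percent threshold below are illustrative assumptions.

```python
# Sketch of a regression check: snapshot coarse signals before a change,
# recompute them after, and flag shifts larger than an agreed threshold.
def compare_snapshots(before: dict[str, float], after: dict[str, float],
                      max_relative_shift: float = 0.02) -> list[str]:
    regressions = []
    for name, old_value in before.items():
        new_value = after.get(name)
        if new_value is None:
            regressions.append(f"{name}: signal missing after change")
            continue
        if old_value == 0:
            shift = abs(new_value)
        else:
            shift = abs(new_value - old_value) / abs(old_value)
        if shift > max_relative_shift:
            regressions.append(f"{name}: shifted by {shift:.1%}")
    return regressions

# Example snapshots are invented for illustration.
before = {"row_count": 182_400, "total_revenue": 5_410_000.0, "null_rate_id": 0.001}
after = {"row_count": 182_410, "total_revenue": 4_980_000.0, "null_rate_id": 0.001}
print(compare_snapshots(before, after))  # flags the total_revenue shift
```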
Defect tracking keeps quality work accountable by turning “something looks off” into a managed record with an owner and a timeline. A well-formed defect description captures the symptom, the scope of impact, the expected behavior, and the evidence that confirms the problem, which prevents the issue from being treated as a vague complaint. Ownership matters because quality issues die in ambiguity when no one is responsible for diagnosis and resolution across team and system boundaries. Timelines matter because some defects block decisions immediately while others can be scheduled, and the organization needs a clear way to prioritize without ignoring problems. When defects are tracked consistently, the quality program becomes measurable, and stakeholders gain confidence that issues are handled deliberately rather than informally.
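A defect record of the kind described could be as small as the structure sketched below; the fields, and the invented margin example, are assumptions about what “well-formed” means for a given team.

```python
# Sketch of a defect record: enough structure to give an issue an owner,
# a scope, and a timeline instead of leaving it as a vague complaint.
from dataclasses import dataclass
from datetime import date

@dataclass
class Defect:
    symptom: str        # what looks wrong
    impact_scope: str   # which reports, metrics, or decisions are affected
    expected: str       # what correct behavior would look like
    evidence: str       # query result, screenshot reference, or delta
    owner: str          # who is responsible for diagnosis and resolution
    due_date: date      # when a decision-blocking issue must be resolved
    status: str = "open"

# The example defect is invented for illustration.
margin_defect = Defect(
    symptom="March margin on the dashboard is 2.1 points below close",
    impact_scope="Monthly margin metric; staffing and pricing reviews",
    expected="Dashboard margin reconciles to close within 0.1 points",
    evidence="Reconciliation query shows a 2.1 point gap",
    owner="analytics-engineering",
    due_date=date(2024, 4, 10),
)
print(margin_defect.status, "-", margin_defect.symptom)
```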
Separating data issues from logic issues is a critical diagnostic move because the fix path differs, and mixing them wastes time. A data issue often involves missing records, late arrivals, schema drift, or unexpected values that violate assumptions, while a logic issue involves incorrect joins, filter mistakes, aggregation errors, or misapplied business rules. Controlled samples help isolate the cause because they allow the same logic to be applied to a small, known set of inputs, making it easier to see whether the computation behaves correctly under stable conditions. If the logic is correct on controlled inputs but fails on production data, attention shifts to data integrity and upstream behavior, and if it fails on controlled inputs, the logic itself needs revision. This separation makes quality work calmer because it replaces speculation with structured elimination.
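A hedged sketch of the controlled-sample idea: run the same aggregation logic against a tiny fixture with a known correct answer, so a failure points at the logic and a pass points upstream at the data. The fixture values and the monthly_totals function are assumptions for illustration.

```python
# Sketch of a controlled-sample test: run the same aggregation logic on a
# small fixture with a known answer to separate logic issues from data issues.
import pandas as pd

def monthly_totals(df: pd.DataFrame) -> pd.Series:
    # The logic under test (illustrative): sum amount by calendar month.
    return df.groupby(df["created_at"].dt.to_period("M"))["amount"].sum()

def test_monthly_totals_on_fixture() -> None:
    fixture = pd.DataFrame({
        "created_at": pd.to_datetime(["2024-01-15", "2024-01-20", "2024-02-01"]),
        "amount": [100.0, 50.0, 25.0],
    })
    totals = monthly_totals(fixture)
    assert totals[pd.Period("2024-01")] == 150.0
    assert totals[pd.Period("2024-02")] == 25.0

if __name__ == "__main__":
    test_monthly_totals_on_fixture()
    print("logic behaves correctly on controlled inputs")
```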
Acceptance criteria make quality measurable, which is important because “good data” is otherwise an argument about taste. Criteria can define allowed tolerance for reconciliation differences, acceptable freshness windows, expected completeness thresholds, and required behaviors for filters and drill paths. For metrics, acceptance criteria also capture definition expectations, such as whether the metric includes a particular segment, how nulls are treated, and what time zone interpretation applies to period boundaries. When criteria are written plainly, U A T becomes sharper because users can confirm outcomes against agreed expectations rather than against personal intuition. This clarity also improves peer review, since reviewers can evaluate changes against criteria rather than guessing what “correct” means.
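As a sketch, criteria like these can be written as explicit thresholds and evaluated against measured values; the criterion names and numbers below are illustrative assumptions, not recommended defaults.

```python
# Sketch of acceptance criteria as explicit, measurable thresholds.
# Names and numbers are illustrative, not recommended defaults.
ACCEPTANCE_CRITERIA = {
    "reconciliation_relative_diff": {"max": 0.001},  # vs trusted close figure
    "freshness_hours":              {"max": 6.0},    # age of newest record
    "completeness_ratio":           {"min": 0.995},  # received vs expected rows
    "null_rate_customer_id":        {"max": 0.01},   # key field null rate
}

def evaluate(measured: dict[str, float]) -> dict[str, bool]:
    results = {}
    for name, bounds in ACCEPTANCE_CRITERIA.items():
        value = measured[name]
        ok = True
        if "max" in bounds:
            ok = ok and value <= bounds["max"]
        if "min" in bounds:
            ok = ok and value >= bounds["min"]
        results[name] = ok
    return results

# Invented measurements; one criterion deliberately fails.
measured = {
    "reconciliation_relative_diff": 0.0004,
    "freshness_hours": 3.2,
    "completeness_ratio": 0.991,  # fails the completeness criterion
    "null_rate_customer_id": 0.002,
}
print(evaluate(measured))
```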
Monitoring quality trends matters because a one-time pass can hide slow drift that becomes damaging weeks later. Quality signals such as row counts, missing value rates, category distribution shifts, and reconciliation deltas can reveal creeping issues like source degradation, partial ingestion, or subtle schema changes. Trend monitoring also reveals whether the system is becoming more stable over time or whether fixes are temporary and problems keep reappearing. The goal is to spot patterns early, like increasing late-arriving data, rising null rates in a key field, or widening divergence from a trusted total, so corrective action can be planned before trust erodes. When monitoring is consistent, quality becomes something the organization can manage proactively rather than reactively.
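A minimal sketch of trend monitoring: keep a short history of one quality signal, such as a daily null rate, and flag when the recent average drifts well beyond the longer baseline. The window sizes, ratio threshold, and example series are assumptions.

```python
# Sketch of quality trend monitoring: flag a signal whose recent average
# drifts well beyond its longer-run baseline. Windows and data are assumed.
def drifting(daily_values: list[float], recent_days: int = 7,
             baseline_days: int = 28, ratio_threshold: float = 1.5) -> bool:
    if len(daily_values) < baseline_days + recent_days:
        return False  # not enough history to judge a trend yet
    recent = daily_values[-recent_days:]
    baseline = daily_values[-(baseline_days + recent_days):-recent_days]
    recent_avg = sum(recent) / len(recent)
    baseline_avg = sum(baseline) / len(baseline)
    if baseline_avg == 0:
        return recent_avg > 0
    return recent_avg / baseline_avg > ratio_threshold

# Invented example: null rate in a key field creeping upward over five weeks.
null_rates = [0.002] * 28 + [0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008]
print(drifting(null_rates))  # True: recent average is well above the baseline
```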
A simple quality pipeline can be explained as a flow from clear requirements, to controlled changes, to repeated checks, to user confirmation, and then to ongoing monitoring. Requirement validation ensures the system is answering the right question, source control ensures changes are traceable and reviewable, and tests ensure basic integrity holds at each stage. U A T ensures the output matches how stakeholders interpret and use it, while defect tracking ensures problems become actionable work rather than informal frustration. Regression checks and quality trend monitoring then keep the system stable as it evolves, preventing slow drift from becoming a sudden crisis. When this pipeline is understood as one connected system, quality assurance stops feeling like a separate activity and starts feeling like part of responsible delivery.
A practical summary is that quality is built by repeating a small set of habits rather than by inventing new checks every time something breaks. The habits begin with stating the question clearly, then protecting logic with traceable change history, and then validating the result with tests that catch obvious integrity failures. User confirmation through U A T provides the reality check that technical correctness still serves decision needs, and defect tracking ensures gaps are closed with accountability. Monitoring then keeps the system honest over time, because data environments change even when reports do not. When these habits are repeated consistently, quality becomes something the organization can rely on without heroic effort.
To conclude, one useful step is choosing a single test to add this week that would have prevented a real past issue, such as a freshness check, a row count drift check, a null surge alert, or a reconciliation threshold against a trusted total. The best candidate is a test that is simple, cheap to run, and tied to a decision-critical metric so it protects trust where it matters most. Once that test exists, it becomes part of the organization’s memory, catching the same class of failure before it reaches users again. Over time, a steady cadence of adding small, high-value tests creates a reporting environment that feels predictable, defensible, and safe to build on.
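If the first test chosen were a freshness check, a hedged sketch could be as small as the function below; the six-hour window and the way the latest load time is obtained are assumptions to adapt locally.

```python
# Sketch of a single, cheap freshness check: fail loudly when the newest
# record is older than an agreed window. The six-hour window is an assumption.
from datetime import datetime, timedelta, timezone

def assert_fresh(latest_record_time: datetime,
                 max_age: timedelta = timedelta(hours=6)) -> None:
    age = datetime.now(timezone.utc) - latest_record_time
    if age > max_age:
        raise AssertionError(
            f"data is stale: newest record is {age} old (limit {max_age})")

# Invented example: in practice the timestamp would come from the source table.
assert_fresh(datetime.now(timezone.utc) - timedelta(hours=2))
print("freshness check passed")
```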